Priors for Random Count Matrices with Random or Fixed Row Sums


1 Priors for Random Count Matrices with Random or Fixed Row Sums
Mingyuan Zhou
Joint work with Oscar Madrid and James Scott
IROM Department, McCombs School of Business
Department of Statistics and Data Sciences
The University of Texas at Austin
10th Conference on Bayesian Nonparametrics, Raleigh, NC, June 2015

2 Table of Contents
Motivations
How to construct an infinite random count matrix?
Priors for random count matrices
Infinite vocabulary naive Bayes classifiers
Random count matrices and mixed-membership modeling
Conclusions

3 Motivations / Where do random count matrices appear?
Directly observable random count matrices:
Text analysis: document-word count matrix
DNA sequencing: sample-gene count matrix
Social network analysis: user-venue check-in count matrix
Consumer behavior: consumer-product count matrix
Latent random count matrices:
Topic models [Blei et al., 2003]: document-topic count matrix (the sum of each row is the length of the corresponding document)
Hidden Markov models: state-state transition count matrix

4 Motivations / Motivations to study random count matrices
Lack of priors to describe random count matrices with a potentially infinite number of rows/columns.
A naive Bayes classifier often requires a predetermined vocabulary shared across all categories, and has to ignore previously unseen features/terms. How to calculate the predictive distribution of a new count vector that brings previously unseen terms?
Interesting combinatorial structures unique to infinite random count matrices.
Priors for random count matrices can be used to construct priors for mixed-membership modeling.

5 Motivations / Representation of a count vector under a count matrix
[Figure: term-frequency histograms of example Mac.Hardware and Politics.Guns documents, whose count vectors are represented under a shared count matrix of previously seen terms plus newly appearing (new) terms.]

6 Motivations / Infinite random count matrices to be studied
No natural upper bound on the number of rows or columns
Conditionally independent rows, i.i.d. columns
Parallel column-wise construction
Sequential row-wise constructions
Predictive distribution of a new row count vector that brings new features
Random count matrices with fixed row sums for mixed-membership modeling

7 How to construct an infinite random count matrix? / Related prior distributions
Prior distributions for counts: Poisson, logarithmic, and digamma distributions; negative binomial, beta-negative binomial, and gamma-negative binomial distributions; Poisson-logarithmic bivariate distribution [Zhou & Carin]
Generating a random count vector: Chinese restaurant process, Pitman-Yor process; normalized random measures with independent increments [Regazzini, Lijoi & Prünster, 2003; James, Lijoi & Prünster, 2009]; exchangeable partition probability functions (EPPFs) [Pitman]; size dependent EPPFs [Zhou & Walker]
Generating an infinite random binary matrix: Indian buffet process [Griffiths & Ghahramani, 2005]; beta-Bernoulli process [Thibaux & Jordan, 2007]
Generating an infinite random count matrix: How?

8 How to construct an infinite random count matrix? / Steps to construct an infinite random count matrix
Choose a completely random measure G, a draw from which consists of countably infinite atoms: G = \sum_{k=1}^\infty r_k \delta_{\omega_k}.
For X_j := \sum_{k=1}^\infty n_{jk} \delta_{\omega_k}, draw counts n_{jk} \sim f(r_k, \theta_j), where f denotes a count distribution parameterized by r_k and \theta_j.
Denote n_{:k} = (n_{1k}, \ldots, n_{Jk})^T and n_{\cdot k} = \sum_{j=1}^J n_{jk}. The count matrix N_J is constructed by organizing all the nonzero column count vectors, \{n_{:k}\}_{k: n_{\cdot k} > 0}, in an arbitrary order into a random count matrix.
In practice, we cannot instantiate all the atoms of G. Thus we will have to marginalize G out from \{X_j\}_{1:J} to construct N_J.
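The construction above can be sketched in a few lines of code. The following is a minimal, illustrative simulation (not from the talk): it truncates the completely random measure G to a large finite number of atoms, takes G to be a gamma process and f to be the Poisson distribution, and keeps only the nonzero columns; the function name and the truncation level are assumptions made here for illustration.

# Minimal, illustrative simulation (assumptions: gamma-process G, Poisson f, finite truncation).
import numpy as np

def simulate_count_matrix(J=5, gamma0=3.0, c=1.0, K_trunc=1000, seed=0):
    """Truncate G to K_trunc atoms, draw n_jk ~ Poisson(r_k) for each row j,
    and keep only the nonzero columns of the resulting J x K_trunc matrix."""
    rng = np.random.default_rng(seed)
    # Crude finite approximation of G ~ GammaP(G_0, 1/c) with total mass gamma0:
    # atom weights r_k ~ Gamma(gamma0 / K_trunc, scale = 1/c).
    r = rng.gamma(gamma0 / K_trunc, 1.0 / c, size=K_trunc)
    # n_jk ~ f(r_k, theta_j); here f(r_k, theta_j) = Poisson(r_k), the gamma-Poisson case.
    N = rng.poisson(r, size=(J, K_trunc))
    return N[:, N.sum(axis=0) > 0]  # organize the nonzero columns into N_J

print(simulate_count_matrix().shape)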

9 Priors for random count matrices / Example: gamma-Poisson or negative binomial process
Gamma-Poisson process [Titsias; Zhou & Carin; Zhou et al.]:
X_j \sim \mathrm{PP}(G), \quad G \sim \Gamma\mathrm{P}(G_0, 1/c)
Conditional likelihood:
p(\{X_j\}_{1:J} \mid G) = \prod_{k=1}^\infty \frac{r_k^{n_{\cdot k}}}{\prod_{j=1}^J n_{jk}!} e^{-J r_k} = e^{-J G(\Omega\setminus\mathcal{D})} \prod_{k=1}^{K_J} \frac{r_k^{n_{\cdot k}} e^{-J r_k}}{\prod_{j=1}^J n_{jk}!}
To marginalize G out, one may separate \Omega into the absolutely continuous space and the points of discontinuity, and then apply the characteristic function to G(\Omega\setminus\mathcal{D}) and the Lévy measure of G to each point of discontinuity.
The mapping from \{X_j\}_{1:J} to N_J is one-to-(K_J!), thus f(N_J \mid \gamma, c) = \frac{E_G[p(\{X_j\}_{1:J} \mid G)]}{K_J!}.

10 Priors for random count matrices / Example: gamma-Poisson or negative binomial process
Exchangeable rows and i.i.d. columns.
Distribution for the count matrix:
f(N_J \mid \gamma, c) = \frac{\gamma^{K_J} \exp[-\gamma \ln(\frac{J+c}{c})]}{K_J!} \prod_{k=1}^{K_J} \frac{\Gamma(n_{\cdot k})}{(J+c)^{n_{\cdot k}} \prod_{j=1}^J n_{jk}!}
Row exchangeable, column i.i.d.:
n_{:k} \sim \mathrm{Multinomial}(n_{\cdot k}; 1/J, \ldots, 1/J), \quad n_{\cdot k} \sim \mathrm{Log}[J/(J+c)], \quad K_J \sim \mathrm{Pois}\{\gamma[\ln(J+c) - \ln(c)]\}.
Closed-form Gibbs sampling update equations for model parameters.
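Because the NBP column count vectors are i.i.d. given K_J, the matrix can also be drawn column by column without any truncation. A minimal sketch, assuming the parameterization on this slide (scipy's logser is the logarithmic distribution):

import numpy as np
from scipy.stats import logser  # logarithmic (log-series) distribution

def draw_nbp_matrix(J, gamma0, c, seed=0):
    """Column-wise draw of N_J: K_J ~ Pois, column sums ~ Log, uniform multinomial split."""
    rng = np.random.default_rng(seed)
    K_J = rng.poisson(gamma0 * (np.log(J + c) - np.log(c)))
    cols = []
    for _ in range(K_J):
        n_col = logser.rvs(J / (J + c), random_state=rng)          # n_.k ~ Log[J/(J+c)]
        cols.append(rng.multinomial(n_col, np.full(J, 1.0 / J)))   # n_:k | n_.k ~ Mult(n_.k; 1/J, ..., 1/J)
    return np.array(cols).T if cols else np.zeros((J, 0), dtype=int)

print(draw_nbp_matrix(J=4, gamma0=2.0, c=1.0))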


12 Priors for random count matrices / Example: gamma-Poisson or negative binomial process
Sequential row-wise construction:
p(n^+_{J+1} \mid N_J, \theta) = \frac{f(N_{J+1} \mid \theta)}{f(N_J \mid \theta)} \frac{K_J!\, K^+_{J+1}!}{K_{J+1}!}
= \prod_{k=1}^{K_J} \mathrm{NB}\big(n_{(J+1)k};\, n_{\cdot k}, \tfrac{1}{J+c+1}\big) \prod_{k=K_J+1}^{K_{J+1}} \mathrm{Log}\big(n_{(J+1)k};\, \tfrac{1}{J+c+1}\big) \, \mathrm{Pois}\big(K^+_{J+1};\, \gamma[\ln(J+c+1) - \ln(J+c)]\big).
To add a new row to N_J \in \mathbb{Z}_{\ge 0}^{J \times K_J}:
First, draw a count \mathrm{NB}(n_{\cdot k}, p_{J+1}) at each existing column, with p_{J+1} = 1/(J+c+1).
Second, draw K^+_{J+1} \sim \mathrm{Pois}\{\gamma[\ln(J+c+1) - \ln(J+c)]\} new columns.
Third, draw a \mathrm{Log}(p_{J+1}) random count at each new column.
The combinatorial coefficient arises because the newly added columns are inserted into the original ones at random locations, with their relative orders preserved.
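The three row-wise steps translate directly into code. A minimal sketch, assuming the column sums n_{·k} of the first J rows are available; note that numpy's negative_binomial(r, q) counts failures with success probability q, so NB(r, p) in the slide's convention corresponds to negative_binomial(r, 1 - p):

import numpy as np
from scipy.stats import logser

def add_nbp_row(n_col, J, gamma0, c, rng):
    """Given the column sums n_col of the first J rows, return the new row's counts
    at the existing columns and at the K^+_{J+1} newly created columns."""
    p_new = 1.0 / (J + c + 1.0)
    # Step 1: NB(n_.k, p_new) at each existing column (numpy's NB takes 1 - p_new).
    existing = rng.negative_binomial(n_col, 1.0 - p_new) if len(n_col) else np.array([], dtype=int)
    # Step 2: number of new columns.
    K_plus = rng.poisson(gamma0 * (np.log(J + c + 1) - np.log(J + c)))
    # Step 3: a Log(p_new) count at each new column.
    new = logser.rvs(p_new, size=K_plus, random_state=rng) if K_plus > 0 else np.array([], dtype=int)
    return existing, new

rng = np.random.default_rng(1)
print(add_nbp_row(np.array([5, 2, 1]), J=4, gamma0=2.0, c=1.0, rng=rng))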

13 Priors for random count matrices / Example: gamma-Poisson or negative binomial process
Figure: A sequentially constructed negative binomial process random count matrix N_J \sim \mathrm{NBPM}(\gamma, c) (rows versus columns).

14 Priors for random count matrices / Example: gamma-negative binomial process
Gamma-negative binomial process [Zhou & Carin; Zhou et al.]:
X_j \sim \mathrm{NBP}(G, p_j), \quad G \sim \Gamma\mathrm{P}(G_0, 1/c)
Conditional likelihood:
p(\{X_j\}_{1:J} \mid G, p) = \prod_{k=1}^\infty \prod_{j=1}^J \frac{\Gamma(n_{jk}+r_k)}{n_{jk}!\,\Gamma(r_k)} p_j^{n_{jk}} (1-p_j)^{r_k}
Augmented likelihood:
p(\{X_j, L_j\}_{1:J} \mid G, p) = e^{-q_\cdot G(\Omega\setminus\mathcal{D})} \prod_{k=1}^{K_J} \Big[ r_k^{l_{\cdot k}} e^{-q_\cdot r_k} \prod_{j=1}^J s(n_{jk}, l_{jk}) \frac{p_j^{n_{jk}}}{n_{jk}!} \Big],
where q_j = -\ln(1-p_j) and q_\cdot = \sum_{j=1}^J q_j.

15 Priors for random count matrices / Example: gamma-negative binomial process
Distribution for the (augmented) count matrix:
f(N_J, L_J \mid \theta) = \frac{\gamma^{K_J} \exp[-\gamma \ln(\frac{c+q_\cdot}{c})]}{K_J!} \prod_{k=1}^{K_J} \Big( \frac{\Gamma(l_{\cdot k})}{(c+q_\cdot)^{l_{\cdot k}}} \prod_{j=1}^J s(n_{jk}, l_{jk}) \frac{p_j^{n_{jk}}}{n_{jk}!} \Big)
Row heterogeneity, column i.i.d.:
n_{jk} = \sum_{t=1}^{l_{jk}} n_{jkt}, \quad n_{jkt} \sim \mathrm{Log}(p_j),
(l_{1k}, \ldots, l_{Jk}) \sim \mathrm{Mult}(l_{\cdot k}; q_1/q_\cdot, \ldots, q_J/q_\cdot), \quad l_{\cdot k} \sim \mathrm{Log}[q_\cdot/(c+q_\cdot)], \quad K_J \sim \mathrm{Pois}\{\gamma[\ln(c+q_\cdot) - \ln(c)]\}.
Closed-form Gibbs sampling update equations for model parameters.
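A minimal sketch of this column-i.i.d. construction, assuming the row-specific probabilities p_1, ..., p_J are given; the helper name is illustrative only:

import numpy as np
from scipy.stats import logser

def draw_gnbp_matrix(p, gamma0, c, seed=0):
    """Column-wise draw of a GNBP count matrix given row probabilities p = (p_1, ..., p_J)."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    J = len(p)
    q = -np.log1p(-p)          # q_j = -ln(1 - p_j)
    q_dot = q.sum()
    K_J = rng.poisson(gamma0 * (np.log(c + q_dot) - np.log(c)))
    N = np.zeros((J, K_J), dtype=int)
    for k in range(K_J):
        l_col = logser.rvs(q_dot / (c + q_dot), random_state=rng)   # l_.k ~ Log[q_. / (c + q_.)]
        l_jk = rng.multinomial(l_col, q / q_dot)                    # tables split across rows
        for j in range(J):
            if l_jk[j] > 0:                                         # n_jk = sum of l_jk Log(p_j) counts
                N[j, k] = logser.rvs(p[j], size=l_jk[j], random_state=rng).sum()
    return N

print(draw_gnbp_matrix(p=[0.3, 0.5, 0.4], gamma0=2.0, c=1.0))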

16 Priors for random count matrices / Example: gamma-negative binomial process
Predictive distribution of a new row:
p(n^+_{J+1}, L^+_{J+1} \mid N_J, L_J, \theta) = \frac{K_J!\, K^+_{J+1}!}{K_{J+1}!} \prod_{k=1}^{K_J} \mathrm{NB}\Big(l_{(J+1)k};\, l_{\cdot k}, \tfrac{q_{J+1}}{c+q_\cdot+q_{J+1}}\Big) \prod_{k=K_J+1}^{K_{J+1}} \mathrm{Log}\Big(l_{(J+1)k};\, \tfrac{q_{J+1}}{c+q_\cdot+q_{J+1}}\Big) \prod_{k=1}^{K_{J+1}} \mathrm{SumLog}\big(n_{(J+1)k};\, l_{(J+1)k}, p_{J+1}\big) \, \mathrm{Pois}\big(K^+_{J+1};\, \gamma[\ln(c+q_\cdot+q_{J+1}) - \ln(c+q_\cdot)]\big).
To add a new row:
Draw \mathrm{NB}\big(l_{\cdot k}, \frac{q_{J+1}}{c+q_\cdot+q_{J+1}}\big) tables at each existing column (dish).
Draw K^+_{J+1} \sim \mathrm{Pois}\{\gamma[\ln(c+q_\cdot+q_{J+1}) - \ln(c+q_\cdot)]\} new dishes.
Draw a \mathrm{Log}\big(\frac{q_{J+1}}{c+q_\cdot+q_{J+1}}\big) number of tables at each new dish.
Draw \mathrm{Log}(p_{J+1}) customers at each table and aggregate the counts across the tables of the same dish: n_{(J+1)k} = \sum_{t=1}^{l_{(J+1)k}} n_{(J+1)kt}.
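A minimal sketch of the four row-wise steps above, assuming the existing per-column table counts l_{·k} and the aggregate q_· of the first J rows are available; the SumLog draw is implemented by summing logarithmic random variables:

import numpy as np
from scipy.stats import logser

def add_gnbp_row(l_col, q_dot, p_new, gamma0, c, rng):
    """Given the table counts l_col = (l_.1, ..., l_.K_J) and q_. of the first J rows,
    return the new row's counts at existing columns and at new columns."""
    q_new = -np.log1p(-p_new)
    w = q_new / (c + q_dot + q_new)
    def sum_log(l):            # n = sum of l i.i.d. Log(p_new) counts (0 if no tables)
        return int(logser.rvs(p_new, size=l, random_state=rng).sum()) if l > 0 else 0
    # Tables at existing dishes: l_(J+1)k ~ NB(l_.k, w); numpy's NB takes 1 - w.
    l_exist = rng.negative_binomial(l_col, 1.0 - w) if len(l_col) else np.array([], dtype=int)
    n_exist = np.array([sum_log(int(l)) for l in l_exist], dtype=int)
    # New dishes, Log(w) tables at each, then Log(p_new) customers at each table.
    K_plus = rng.poisson(gamma0 * (np.log(c + q_dot + q_new) - np.log(c + q_dot)))
    l_new = logser.rvs(w, size=K_plus, random_state=rng) if K_plus > 0 else np.array([], dtype=int)
    n_new = np.array([sum_log(int(l)) for l in l_new], dtype=int)
    return n_exist, n_new

rng = np.random.default_rng(2)
print(add_gnbp_row(np.array([3, 1]), q_dot=1.2, p_new=0.4, gamma0=2.0, c=1.0, rng=rng))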

17 Priors for random count matrices / Example: gamma-negative binomial process
Figure: A sequentially constructed gamma-negative binomial process random count matrix N_J \sim \mathrm{GNBPM}(\gamma, c, p_1, \ldots, p_J) (rows versus columns).

18 Priors for random count matrices / Example: beta-negative binomial process
Beta-negative binomial process [Zhou et al.; Broderick et al.; Zhou & Carin; Heaukulani & Roy; Zhou et al.]:
X_j \sim \mathrm{NBP}(r_j, B), \quad B \sim \mathrm{BP}(c, B_0)
Conditional likelihood:
p(\{X_j\}_{1:J} \mid B, r) = e^{-r_\cdot p^*} \prod_{k=1}^{K_J} p_k^{n_{\cdot k}} (1-p_k)^{r_\cdot} \prod_{j=1}^J \frac{\Gamma(n_{jk}+r_j)}{n_{jk}!\,\Gamma(r_j)},
where p^* = -\sum_{k=K_J+1}^\infty \ln(1-p_k).

19 Priors for random count matrices / Example: beta-negative binomial process
Distribution for the count matrix:
f(N_J \mid \gamma, c, r) = \frac{\gamma^{K_J} e^{-\gamma[\psi(c+r_\cdot)-\psi(c)]}}{K_J!} \prod_{k=1}^{K_J} \frac{\Gamma(n_{\cdot k})\,\Gamma(c+r_\cdot)}{\Gamma(c+n_{\cdot k}+r_\cdot)} \prod_{j=1}^J \frac{\Gamma(n_{jk}+r_j)}{n_{jk}!\,\Gamma(r_j)}
Row heterogeneity, column i.i.d.:
n_{:k} \sim \mathrm{DirMult}(n_{\cdot k}; r_1, \ldots, r_J), \quad n_{\cdot k} \sim \mathrm{Digam}(r_\cdot, c), \quad K_J \sim \mathrm{Pois}\{\gamma[\psi(c+r_\cdot) - \psi(c)]\},
where \mathrm{Digam}(n \mid r, c) = \frac{1}{\psi(c+r)-\psi(c)} \frac{\Gamma(r+n)\,\Gamma(c+r)}{n\,\Gamma(c+n+r)\,\Gamma(r)}.
Closed-form Gibbs sampling update equations for model parameters.

20 Priors for random count matrices / Example: beta-negative binomial process
Ice cream buffet process (a.k.a. the multi-scoop IBP [Zhou et al.] and the negative binomial IBP [Heaukulani & Roy])
Sequential row-wise construction:
p(n^+_{J+1} \mid N_J) = \frac{K_J!\, K^+_{J+1}!}{K_{J+1}!} \prod_{k=1}^{K_J} \mathrm{BNB}\big(n_{(J+1)k};\, r_{J+1}, n_{\cdot k}, c+r_\cdot\big) \prod_{k=K_J+1}^{K_{J+1}} \mathrm{Digam}\big(n_{(J+1)k};\, r_{J+1}, c+r_\cdot\big) \, \mathrm{Pois}\big(K^+_{J+1};\, \gamma[\psi(c+r_\cdot+r_{J+1}) - \psi(c+r_\cdot)]\big).
To add a new row:
Customer J+1 takes n_{(J+1)k} \sim \mathrm{BNB}(r_{J+1}, n_{\cdot k}, c+r_\cdot) scoops of each existing ice cream (column).
The customer further selects K^+_{J+1} \sim \mathrm{Pois}\{\gamma[\psi(c+r_\cdot+r_{J+1}) - \psi(c+r_\cdot)]\} new ice creams out of the buffet line.
The customer takes n_{(J+1)k} \sim \mathrm{Digam}(r_{J+1}, c+r_\cdot) scoops of each new ice cream.
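A minimal sketch of the ice cream buffet step, assuming the column sums n_{·k} of the first J rows are given. BNB draws use the beta-mixture representation (p_k ~ Beta(n_{·k}, c+r_·), then a negative binomial draw), and the Digam draws use a truncated inverse-CDF over the pmf on the previous slide, an approximation introduced here purely for illustration:

import numpy as np
from scipy.special import gammaln, psi

def draw_digam(r, c, rng, n_max=10000):
    """Draw from Digam(n | r, c) by inverse CDF over a truncated support {1, ..., n_max}."""
    n = np.arange(1, n_max + 1)
    logp = gammaln(r + n) + gammaln(c + r) - np.log(n) - gammaln(c + n + r) - gammaln(r)
    pmf = np.exp(logp) / (psi(c + r) - psi(c))
    return rng.choice(n, p=pmf / pmf.sum())       # renormalize over the truncation

def add_bnbp_row(n_col, r_new, r_dot, gamma0, c, rng):
    """Ice cream buffet step: counts for customer J+1 at existing and new columns."""
    b = c + r_dot
    # Scoops at existing ice creams: BNB(r_new, n_.k, b) via p_k ~ Beta(n_.k, b), then NB(r_new, p_k).
    p_k = rng.beta(n_col, b) if len(n_col) else np.array([])
    n_exist = rng.negative_binomial(r_new, 1.0 - p_k) if len(n_col) else np.array([], dtype=int)
    # New ice creams, then Digam(r_new, b) scoops at each of them.
    K_plus = rng.poisson(gamma0 * (psi(b + r_new) - psi(b)))
    n_new = np.array([draw_digam(r_new, b, rng) for _ in range(K_plus)], dtype=int)
    return n_exist, n_new

rng = np.random.default_rng(3)
print(add_bnbp_row(np.array([4, 2, 2]), r_new=1.0, r_dot=3.0, gamma0=2.0, c=1.0, rng=rng))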

21 Priors for random count matrices / Example: beta-negative binomial process
Figure: A sequentially constructed beta-negative binomial process random count matrix N_J \sim \mathrm{BNBPM}(\gamma, c, r_1, \ldots, r_J) (rows versus columns).

22 Priors for random count matrices / Comparison of different priors
NBP: \mathrm{Var}[n_{(J+1)k}] = E[n_{(J+1)k}] + \frac{E^2[n_{(J+1)k}]}{n_{\cdot k}}
GNBP: \mathrm{Var}[n_{(J+1)k}] = \frac{E[n_{(J+1)k}]}{1-p_{J+1}} + \frac{E^2[n_{(J+1)k}]}{l_{\cdot k}}
BNBP: \mathrm{Var}[n_{(J+1)k}] = E[n_{(J+1)k}] \frac{n_{\cdot k}+c+r_\cdot-1}{c+r_\cdot-2} + E^2[n_{(J+1)k}] \frac{n_{\cdot k}+c+r_\cdot-1}{n_{\cdot k}(c+r_\cdot-2)}

23 Priors for random count matrices / Example: beta-negative binomial process
[Figure: three simulated random count matrices (rows versus columns) from each of the NBP (top row), GNBP (middle row), and BNBP (bottom row) priors.]

24 Priors for random count matrices / Training and posterior predictive checking
[Figure (Documents versus Words): (a) the observed count matrix; (b) a simulated NBP random count matrix; (c) a simulated GNBP random count matrix; (d) a simulated BNBP random count matrix.]

25 Infinite vocabulary naive Bayes classifiers / Predictive distribution of a new row vector
The predictive distribution of a row vector n_{J+1} is
p(n_{J+1} \mid N_J, \theta) = \frac{p(n^+_{J+1} \mid N_J, \theta)}{K^+_{J+1}!}   (1)
= \frac{K_J!}{K_{J+1}!} \frac{f(N_{J+1} \mid \theta)}{f(N_J \mid \theta)}.   (2)
The normalizing constant 1/K^+_{J+1}! in (1) arises because the mapping from a realization of n^+_{J+1} to n_{J+1} is one-to-many, with K^+_{J+1}! distinct orderings of the new columns.
The normalizing constant K_J!/K_{J+1}! in (2) arises because there are \prod_{i=1}^{K^+_{J+1}} (K_J + i) = K_{J+1}!/K_J! ways to insert the K^+_{J+1} new columns into the original ordered K_J columns, which is again a one-to-many mapping.

26 Infinite vocabulary naive Bayes classifiers
Each category is summarized as a random count matrix N_J; columns with all zeros are excluded.
Gibbs sampling is used to infer the parameters \theta that generate N_J; to represent the posterior of \theta, S MCMC samples \{\theta^{[s]}\}_{1:S} are collected.
For a test row count vector n_{J+1}, its predictive likelihood given N_J is calculated via Monte Carlo integration, using
p(n_{J+1} \mid N_J) = \frac{1}{S} \sum_{s=1}^S \frac{p(n^+_{J+1} \mid N_J, \theta^{[s]})}{K^+_{J+1}!}
for both the NBP and BNBP, and using
p(n_{J+1} \mid N_J) = \frac{1}{S} \sum_{s=1}^S \frac{p(n^+_{J+1} \mid N_J, L_J^{[s]}, \theta^{[s]})}{K^+_{J+1}!}
for the GNBP.
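A minimal sketch of the resulting classifier decision rule; the function log_pred_row and the way posterior samples are bundled are assumptions made here for illustration, since the per-sample predictive term is prior-specific (NBP, GNBP, or BNBP):

import numpy as np
from scipy.special import logsumexp

def classify(test_vec, categories, log_pred_row):
    """categories: dict label -> list of S posterior samples, where each sample bundles
    that category's training matrix N_J (plus L_J for the GNBP) and parameters theta^[s].
    log_pred_row(test_vec, sample) is assumed to return
    log p(n^+_{J+1} | N_J, theta^[s]) - log K^+_{J+1}!  for the chosen prior."""
    scores = {}
    for label, samples in categories.items():
        logs = np.array([log_pred_row(test_vec, s) for s in samples])
        scores[label] = logsumexp(logs) - np.log(len(samples))   # log of the Monte Carlo average
    return max(scores, key=scores.get), scores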

27 Infinite vocabulary naive Bayes classifiers
[Figure: term-frequency histograms of example Mac.Hardware and Politics.Guns documents, including previously unseen (new) terms, illustrating the categorization task.]

28 Infinite vocabulary naive Bayes classifiers
Figure: Document categorization results on the 20 Newsgroups dataset, plotting accuracy against the ratio of training documents, with (a) an unconstrained vocabulary that can grow without bound and (b) a predetermined finite vocabulary of fixed size V, using the negative binomial process (NBP), gamma-negative binomial process (GNBP), and beta-negative binomial process (BNBP). The results of the multinomial naive Bayes classifier using Laplace smoothing are included for comparison.

29 Infinite vocabulary naive Bayes classifiers
Figure: Plots analogous to those in the previous figure, for the TDT dataset, with a predetermined finite vocabulary of fixed size V.

30 Infinite vocabulary naive Bayes classifiers
Figure: (a) The predicted probabilities of the test documents under different categories for the CNAE-9 dataset, using the GNBP nonparametric Bayesian naive Bayes classifier with a fixed percentage of the documents of each of the nine categories used for training. (b) Boxplots of the categorization accuracies, where each accuracy is computed with a different number S of collected MCMC samples.

31 Random count matrices and mixed-membership modeling / Beta-negative binomial process (BNBP) mixed-membership modeling
Construct EPPFs for mixture modeling using priors for random count vectors [Zhou & Walker].
One way to generate a random count vector (n_1, \ldots, n_l): draw l, the length of the vector, and then draw independent positive random counts \{n_k\}_{1:l}.
Another way to generate such a random count vector: draw a total count n and partition it using an EPPF, resulting in a set of exchangeable categorical variables z = (z_1, \ldots, z_n); map z to a random positive count vector (n_1, \ldots, n_l), where n_k := \sum_{i=1}^n \delta(z_i = k) > 0.
Both ways lead to the same distribution of (n_1, \ldots, n_l) if and only if
P(n_1, \ldots, n_l, n) = \frac{n!}{l! \prod_{k=1}^l n_k!} P(z, n).
(Sample size dependent) EPPF for mixture modeling:
P(z \mid n) = \frac{P(z, n)}{P(n)} = \Big[ \frac{l! \prod_{k=1}^l n_k!}{n!} \Big] \frac{P(n_1, \ldots, n_l, n)}{P(n)}.

32 Random count matrices and mixed-membership modeling / Beta-negative binomial process (BNBP) mixed-membership modeling
Construct EPPFs for mixed-membership modeling using priors for random count matrices [Zhou].
BNBP random count matrix prior:
f(N_J \mid r, \gamma, c) = \frac{\gamma^{K_J} e^{-\gamma[\psi(c+r_\cdot)-\psi(c)]}}{K_J!} \prod_{k=1}^{K_J} \frac{\Gamma(n_{\cdot k})\,\Gamma(c+r_\cdot)}{\Gamma(c+n_{\cdot k}+r_\cdot)} \prod_{j=1}^J \frac{\Gamma(n_{jk}+r_j)}{n_{jk}!\,\Gamma(r_j)}
With z = (z_{11}, \ldots, z_{J m_J}) and n_{jk} = \sum_{i=1}^{m_j} \delta(z_{ji} = k), the joint distribution of a column count vector m = (m_1, \ldots, m_J)^T and its partition into a column exchangeable latent random count matrix with K_J nonempty columns can be expressed as
f(z, m \mid r, \gamma, c) = \frac{K_J! \prod_{j=1}^J \prod_{k=1}^{K_J} n_{jk}!}{\prod_{j=1}^J m_j!} f(N_J \mid r, \gamma, c)
= \frac{\gamma^{K_J} e^{-\gamma[\psi(c+r_\cdot)-\psi(c)]}}{\prod_{j=1}^J m_j!} \prod_{k=1}^{K_J} \frac{\Gamma(n_{\cdot k})\,\Gamma(c+r_\cdot)}{\Gamma(c+n_{\cdot k}+r_\cdot)} \prod_{j=1}^J \frac{\Gamma(n_{jk}+r_j)}{\Gamma(r_j)}

33 Random count matrices and mixed-membership modeling / Beta-negative binomial process (BNBP) mixed-membership modeling
The BNBP's EPPF for mixed-membership modeling:
f(z \mid m, r, \gamma, c) = \frac{f(z, m \mid r, \gamma, c)}{f(m \mid r, \gamma, c)} = \frac{K_J! \prod_{j=1}^J \prod_{k=1}^{K_J} n_{jk}!}{\prod_{j=1}^J m_j!} \frac{f(N_J \mid r, \gamma, c)}{f(m \mid r, \gamma, c)}
The prediction rule is simple:
P(z_{ji} = k \mid z^{-ji}, m, r, \gamma, c) = \frac{f(z_{ji}=k, z^{-ji}, m \mid r, \gamma, c)}{\sum_{k'=1}^{K_J^{-ji}+1} f(z_{ji}=k', z^{-ji}, m \mid r, \gamma, c)}
\propto \frac{n_{\cdot k}^{-ji}\,(n_{jk}^{-ji}+r_j)}{c+n_{\cdot k}^{-ji}+r_\cdot}, for k = 1, \ldots, K_J^{-ji};
\propto \frac{\gamma\, r_j}{c+r_\cdot}, if k = K_J^{-ji}+1.
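A minimal sketch of one collapsed Gibbs scan driven by this prediction rule; the data structures and function name are illustrative, not the talk's implementation:

import numpy as np

def gibbs_scan_bnbp(z, m, r, gamma0, c, rng):
    """One collapsed Gibbs sweep over all group-j assignments z_j using the BNBP EPPF.
    z: list of J integer arrays of cluster labels; r: per-group shape parameters r_j."""
    r_dot = float(np.sum(r))
    for j in range(len(z)):
        for i in range(m[j]):
            z[j][i] = -1                                             # remove element (j, i)
            labels = np.unique(np.concatenate([g[g >= 0] for g in z]))
            n_dot = np.array([sum((g == k).sum() for g in z) for k in labels])   # n_.k without (j, i)
            n_jk = np.array([(z[j] == k).sum() for k in labels])                 # n_jk without (j, i)
            w_old = n_dot * (n_jk + r[j]) / (c + n_dot + r_dot)      # existing clusters
            w_new = gamma0 * r[j] / (c + r_dot)                      # a brand-new cluster
            w = np.append(w_old, w_new)
            pick = rng.choice(len(w), p=w / w.sum())
            z[j][i] = labels[pick] if pick < len(labels) else (labels.max() + 1 if len(labels) else 0)
    return z

rng = np.random.default_rng(4)
m = [5, 4]
z = [rng.integers(0, 2, size=mi) for mi in m]
print(gibbs_scan_bnbp(z, m, r=np.array([1.0, 1.0]), gamma0=2.0, c=1.0, rng=rng))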

34 Random count matrices and mixed-membership modeling / Beta-negative binomial process (BNBP) mixed-membership modeling
Random count matrices with fixed row sums.
Figure: Random draws from the EPPF that governs the BNBP's exchangeable random partitions of groups (rows), shown for three settings of r_i, with every group containing the same fixed number of data points. The jth row of each matrix, whose sum is fixed at m_j, represents the partition of the m_j data points of the jth group over a random number of exchangeable clusters (Group versus Partition). The kth column of each matrix represents the kth nonempty cluster in order of appearance in Gibbs sampling (the empty clusters are deleted).

35 Random count matrices and mixed-membership modeling / Gamma-negative binomial process (GNBP) mixed-membership modeling
The GNBP's EPPF for mixed-membership modeling.
GNBP random count matrix prior:
f(N_J, L_J \mid \gamma, c, p) = \frac{\gamma^{K_J} \exp[-\gamma \ln(\frac{c+q_\cdot}{c})]}{K_J!} \prod_{k=1}^{K_J} \Big( \frac{\Gamma(l_{\cdot k})}{(c+q_\cdot)^{l_{\cdot k}}} \prod_{j=1}^J s(n_{jk}, l_{jk}) \frac{p_j^{n_{jk}}}{n_{jk}!} \Big)
With z = (z_{11}, \ldots, z_{J m_J}), b = (b_{11}, \ldots, b_{J m_J}), and n_{jkt} = \sum_{i=1}^{m_j} \delta(z_{ji} = k, b_{ji} = t), the joint distribution of a column count vector m = (m_1, \ldots, m_J)^T, its partition into a column exchangeable latent random count matrix with K_J nonempty columns, and an auxiliary categorical random vector can be expressed as
f(b, z, m \mid \gamma, c, p) = \gamma^{K_J} e^{-\gamma \ln(\frac{c+q_\cdot}{c})} \prod_{j=1}^J \frac{p_j^{m_j}}{m_j!} \prod_{k=1}^{K_J} \Big( \frac{\Gamma(l_{\cdot k})}{(c+q_\cdot)^{l_{\cdot k}}} \prod_{j=1}^J \prod_{t=1}^{l_{jk}} \Gamma(n_{jkt}) \Big)

36 Random count matrices and mixed-membership modeling / Gamma-negative binomial process (GNBP) mixed-membership modeling
The GNBP's EPPF for mixed-membership modeling:
f(z, b \mid m, \gamma, c, p) = \frac{f(z, b, m \mid \gamma, c, p)}{f(m \mid \gamma, c, p)}
The prediction rule is simple:
P(z_{ji} = k, b_{ji} = t \mid z^{-ji}, b^{-ji}, m, p, c) = \frac{f(z_{ji}=k, b_{ji}=t, z^{-ji}, b^{-ji}, m \mid p, c)}{\sum_{z_{ji}, b_{ji}} f(z_{ji}, b_{ji}, z^{-ji}, b^{-ji}, m \mid p, c)}
\propto n_{jkt}^{-ji}, if k \le K_J^{-ji}, t \le l_{jk}^{-ji};
\propto l_{\cdot k}^{-ji}/(c+q_\cdot), if k \le K_J^{-ji}, t = l_{jk}^{-ji}+1;
\propto \gamma/(c+q_\cdot), if k = K_J^{-ji}+1, t = 1.
If we let z_{ji} be the dish index and b_{ji} be the table index for customer i in restaurant j, then the collapsed Gibbs sampler can be related to the Chinese restaurant franchise sampler of the hierarchical Dirichlet process (Teh et al., 2006).
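A minimal sketch of sampling one customer's (dish, table) pair from these three types of weights; the state representation is an assumption made here for illustration:

import numpy as np

def sample_dish_table(n_jt, l_dot, q_dot, gamma0, c, rng):
    """n_jt[k]: list of per-table customer counts for dish k in restaurant j, with the
    current customer already removed; l_dot[k]: total tables serving dish k overall."""
    options, weights = [], []
    for k, tables in n_jt.items():
        for t, cnt in enumerate(tables):
            options.append((k, t)); weights.append(cnt)                            # existing table
        options.append((k, len(tables))); weights.append(l_dot[k] / (c + q_dot))   # new table, existing dish
    new_k = (max(n_jt) + 1) if n_jt else 0
    options.append((new_k, 0)); weights.append(gamma0 / (c + q_dot))               # new dish, first table
    weights = np.asarray(weights, dtype=float)
    return options[rng.choice(len(options), p=weights / weights.sum())]

rng = np.random.default_rng(5)
print(sample_dish_table({0: [3, 1], 1: [2]}, {0: 4, 1: 2}, q_dot=1.5, gamma0=2.0, c=1.0, rng=rng))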

37 Conclusions
A family of probability mass functions for random count matrices.
The proposed random count matrices have a random number of i.i.d. columns and can also be constructed by adding one row at a time. Their parameters can be inferred with closed-form Gibbs sampling update equations.
Infinite vocabulary naive Bayes classifiers.
Priors for random count matrices can be used to construct (group size dependent) EPPFs for mixed-membership modeling, with simple prediction rules for collapsed Gibbs sampling.

38 Conclusions / Main References
M. Zhou, O. H. M. Padilla, and J. G. Scott. Priors for random count matrices derived from a family of negative binomial processes. arXiv preprint.
M. Zhou. Beta-negative binomial process and exchangeable random partitions for mixed-membership modeling. In NIPS, 2014.
M. Zhou and S. G. Walker. Sample size dependent species models. arXiv preprint.
C. Heaukulani and D. M. Roy. The combinatorial structure of beta negative binomial processes. arXiv preprint.
T. Broderick, L. Mackey, J. Paisley, and M. I. Jordan. Combinatorial clustering and the beta negative binomial process. IEEE Trans. Pattern Analysis and Machine Intelligence, 2015.
M. Zhou and L. Carin. Negative binomial process count and mixture modeling. IEEE Trans. Pattern Analysis and Machine Intelligence, 2015.
M. Zhou and L. Carin. Augment-and-conquer negative binomial processes. In NIPS, 2012.
M. Zhou, L. Hannah, D. Dunson, and L. Carin. Beta-negative binomial process and Poisson factor analysis. In AISTATS, 2012.
