Bayesian nonparametric latent feature models François Caron UBC October 2, 2007 / MLRG François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 1 / 29
Overview 1 Introduction 2 Dirichlet Process Finite mixture model Equivalent classes Dirichlet Process model Chinese Restaurant Stick-breaking construction 3 Nonparametric latent feature models Finite feature model Equivalent classes Nonparametric latent feature model Indian buffet Stick-breaking construction 4 Applications François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 2 / 29
Overview 1 Introduction 2 Dirichlet Process Finite mixture model Equivalent classes Dirichlet Process model Chinese Restaurant Stick-breaking construction 3 Nonparametric latent feature models Finite feature model Equivalent classes Nonparametric latent feature model Indian buffet Stick-breaking construction 4 Applications François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 3 / 29
Finite mixture model n objects and K clusters Each object k = 1,..., n can belong to only one cluster j {1,..., K} Represented by an allocation variable c k {1,..., K} François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 4 / 29
Finite mixture model Bayesian approach: Dirichlet-multinomial model π 1:K D( α K,..., α K ) and for k = 1,..., n, K : Dirichlet Process c k π 1:K François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 5 / 29
Finite latent feature model n objects and K features Each object k can have a finite number of latent features Represented by a binary vector c k {0, 1} K François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 6 / 29
Finite latent feature model Bayesian approach: Prior distribution over a binary matrix of size n K K : Nonparametric latent feature model François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 7 / 29
Overview 1 Introduction 2 Dirichlet Process Finite mixture model Equivalent classes Dirichlet Process model Chinese Restaurant Stick-breaking construction 3 Nonparametric latent feature models Finite feature model Equivalent classes Nonparametric latent feature model Indian buffet Stick-breaking construction 4 Applications François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 8 / 29
Finite mixture model Hierarchical model For j = 1,..., K, For k = 1,..., n, π 1:K D( α K,..., α K ) U j G 0 c k π 1:K z k c k, U 1:K f ( U ck ) François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 9 / 29
Finite mixture model Prior distribution Pr(π 1:K, U 1:K, c 1:n ) = Pr(π 1:K, c 1:n ) Pr(U 1:K ) Can integrate out analytically π 1:K (Dirichlet-multinomial distribution) Pr(c 1:n ) = Pr(π 1:K, c 1:n )dπ 1:K = Γ(α) K j=1 Γ(n j + α K ) Γ(α + n)γ( α K )K François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 10 / 29
Distribution over partitions Let Π n = {A 1,..., A n(πn)} be a random partition of {1,..., n} and P n the set of partitions of {1,..., n} Ex: Π 5 = {{1, 2, 5}, {3}, {4}} Several different values of c 1:n may induce the same partition Ex: c 1:5 = (1, 1, 2, 1, 3) and c 1:5 = (2, 2, 3, 2, 1) both induce the same partition Π 5 = {{1, 2, 4}, {3}, {5}} Let Π n (c 1:n ) be the partition of {1,..., n} induced by the equivalence relationship i j c i = c j François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 11 / 29
Distribution over partitions We have Pr(Π n (c 1:n )) = K! (K n(π n ))! Pr(c 1:n) for all Π n P K where P K = {Π n P n n(π n ) K} and thus for the Dirichlet-multinomial distribution Pr(Π n ) = K! Γ(α) K j=1 Γ(n j + α K ) (K n(π n ))! Γ(α + n)γ( α K )K François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 12 / 29
Dirichlet Process Limit when K of the finite model Pr(Π n ) = K! Γ(α) K j=1 Γ(n j + α K ) (K n(π n ))! Γ(α + n)γ( α K )K is given by Pr(Π n ) = αn(πn) n(π n) j=1 Γ(n j ) (α + i 1) n i=1 François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 13 / 29
Dirichlet Process Connexion to the usual formulation for k = 1,..., n G DP(α, G 0 ) θ k G G Let Π n (θ 1...., θ n ) be the partition of {1,..., n} induced by the equivalence relationship i j θ i = θ j and U 1,..., U n(πn) be the different values taken by θ 1,..., θ n When integrating out the unknown distribution G n(π n) Pr(θ 1,..., θ n ) = Pr(Π n (θ 1...., θ n )) G 0 (U j ) j=1 François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 14 / 29
Chinese Restaurant Let Π k be the partition obtained by removing item k from Π n Conditional association probability of item k is Pr(Π n ) Pr(Π k ) Item k is associated to an existing cluster j = 1,..., n(π k ) with probability n j, k n 1 + α and to a new cluster with probability α n 1 + α François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 15 / 29
Stick-breaking representation Let Π 1, Π 2,... be the successive partitions obtained with sequential CRP updates Ex: Π 1 = {{1}}, Π 2 = {{1}, {2}}, Π 3 = {{1, 3}, {2}}, Π 4 = {{1, 3}, {2}, {4}},... Let n j,t the size of cluster j in Π t, where the clusters are numbered in order of appearance Let lim t n j,t t where = π j. We have j 1 π j = β j (1 β i ) i=1 β j B(1, α) Stick-breaking construction or residual allocation model François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 16 / 29
Overview 1 Introduction 2 Dirichlet Process Finite mixture model Equivalent classes Dirichlet Process model Chinese Restaurant Stick-breaking construction 3 Nonparametric latent feature models Finite feature model Equivalent classes Nonparametric latent feature model Indian buffet Stick-breaking construction 4 Applications François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 17 / 29
Finite feature model Assume the following model For each feature j = 1,..., K, π j B( α K, 1) For each object k and each feature j c k,j Ber(π j ) François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 18 / 29
Finite feature model Integrating out π 1:K Pr(c 1:n ) = K j=1 α K Γ(m j + α K )Γ(n m j + 1) Γ(n + 1 + α K ) Invariant to permutation of objects/features François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 19 / 29
Equivalent classes Distribution over binary matrix is invariant with respect to permutations of columns Left-ordered form Pr(ξ(c 1:n )) = K! 2 n 1 h=0 K h! K j=1 α K Γ(m j + α K )Γ(n m j + 1) Γ(n + 1 + α K ) François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 20 / 29
Nonparametric latent feature model Limit when K of the finite model is given by Pr(ξ(c 1:n )) = Pr(ξ(c 1:n )) = K! 2 n 1 h=0 K h! K + 2 n 1 K j=1 h=1 K h! exp( αh n) α K Γ(m j + α K )Γ(n m j + 1) Γ(n + 1 + α K ) K + j=1 (n m j )!(m j 1)! n! François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 21 / 29
Indian Buffet Buffet with infinite number of dishes First customer samples Poisson(α) dishes k eme customer takes each dish with probability m j /k and tries Poisson(α/k) new dishes Caution: Not in left-ordered form! François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 22 / 29
Indian Buffet François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 23 / 29
Properties Total number of different features K + follows Poisson(α( n k=1 1 k )) Number of features possessed by each object follows Poisson(α) Total number of entries in the matrix follows Poisson(nα) François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 24 / 29
Stick-breaking construction Let π (1) > π (2) >... > π (K) be a decreasing ordering of π 1:K where π k B( α K, 1) Limit when K : stick-breaking contruction β (k) B(α, 1) and π (k) = π (k 1) β (k) DP: stick lengths sum to one and are not decreasing (only in average) IBP: stick lengths does not sum to one and are decreasing François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 25 / 29
Overview 1 Introduction 2 Dirichlet Process Finite mixture model Equivalent classes Dirichlet Process model Chinese Restaurant Stick-breaking construction 3 Nonparametric latent feature models Finite feature model Equivalent classes Nonparametric latent feature model Indian buffet Stick-breaking construction 4 Applications François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 26 / 29
Applications Choice behaviour Protein interaction screens Structure of causal graphs François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 27 / 29
Bibliography Random partitions and Dirichlet Processes J. Pitman. Exchangeable and partially exchangeable random partitions Probability Theory and Related Fields, 1995. J. Pitman. Some developments of the Blackwell-MacQueen urn scheme In Statistics, probability and game theory; paper in honour of David Blackwell, 1996. Indian buffet Z. Ghahramani, T.L. Griffiths and P. Sollich. Bayesian nonparametric latent feature models Proc. Valencia / ISBA 8th World Meeting on Bayesian Statistics, 2006. T.L. Griffiths and Z. Ghahramani. Infinite latent feature models and the Indian Buffet Process Technical Report, University College, London, 2005. Y.W. Teh, D. Gorur and Z. Ghahramani. Stick-breaking construction for the Indian Buffet Process AISTATS, 2007. François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 28 / 29
Bibliography Applications of Indian buffet D. Gorur, F. Jakel and C. Rasmussen. A choice model with infinitely many latent features. ICML 2006. W. Chu, Z. Ghahramani, R. Krause and D.L. Wild. Identifying protein complexes in high-throughput protein interactions screens using an infinite latent feature model. Biocomputing 2006. F. Wood, T.L. Griffiths and Z. Ghahramani. A nonparametric Bayesian method for inferring hidden causes. UAI 2006. Still hungry? M. McGlohon. Fried Chicken Bucket Processes SIGBOVIK, Pittsburgh, 2007. François Caron (UBC) Bayes. nonparametric latent feature models October 2, 2007 / MLRG 29 / 29