Poisson Latent Feature Calculus for Generalized Indian Buffet Processes

1 Poisson Latent Feature Calculus for Generalized Indian Buffet Processes
Lancelot F. James (arXiv:1411.2936 [math.ST], December 2014)
Discussion by: Piyush Rai
January 23, 2015

2 Overview of the paper
Proposes generalized notions of the Indian Buffet Process (IBP)
Allows principled extensions of the standard IBP to more general settings involving non-binary latent features
Provides a unified and (perhaps) simpler/cleaner treatment than some other recent attempts at such problems
Also allows extensions to multivariate latent features (each latent feature can be a vector: multinomial, or a general multivariate vector)
Based on ideas from Poisson Partition Calculus (proposed earlier by the same author)

3 The IBP
A nonparametric Bayesian prior on random binary matrices $Z = [Z_1; \ldots; Z_M]$ with $M$ rows and an unbounded number of columns
Prior denoted IBP($\theta$), $\theta > 0$, with latent features $\omega_k$, $k = 1, 2, \ldots$
Entry $(i, k) = 1$ if object $i$ has latent feature $k$, zero otherwise
Generative model: customers (objects) selecting dishes (latent features)
- Customer 1 selects Poisson($\theta$) dishes
- Customer $i$ selects each already-selected dish $k$ with probability $m_k / i$ (where $m_k$ is the number of previous customers who chose dish $k$), and Poisson($\theta / i$) new dishes
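To make the customers-and-dishes description concrete, here is a minimal simulation sketch of this sequential process (my own code, not from the paper; function and variable names are mine):

import numpy as np

def sample_ibp(M, theta, rng=None):
    """Sample an M-row binary matrix from the standard IBP(theta)
    via the sequential customers-and-dishes construction."""
    rng = np.random.default_rng(rng)
    dishes = []  # dishes[k] = list of customers (row indices) that chose dish k
    for i in range(1, M + 1):
        # existing dish k is chosen with probability m_k / i
        for eaters in dishes:
            if rng.random() < len(eaters) / i:
                eaters.append(i - 1)
        # Poisson(theta / i) brand-new dishes
        for _ in range(rng.poisson(theta / i)):
            dishes.append([i - 1])
    Z = np.zeros((M, len(dishes)), dtype=int)
    for k, eaters in enumerate(dishes):
        Z[eaters, k] = 1
    return Z

Z = sample_ibp(M=10, theta=2.0, rng=0)
print(Z.shape, Z.sum(axis=0))  # column sums m_k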

4-5 Generalization of the IBP
Standard IBP($\theta$): $Z_1, \ldots, Z_M \mid \mu$ iid, with $\mu = \sum_{k=1}^{\infty} p_k \delta_{\omega_k}$
- Atoms $\omega_k$ drawn from a base measure $B_0$
- $p_k$: points of a Poisson random measure with the restricted mean intensity $\rho(s) = \theta s^{-1} \mathbb{1}_{\{0 < s < 1\}}$
- $Z_i = \sum_{k=1}^{\infty} b_{i,k} \delta_{\omega_k}$, with $b_{i,k} \sim \mathrm{Bernoulli}(p_k)$

Generalized IBP: $Z_1, \ldots, Z_M \mid \mu$ iid, with $\mu = \sum_{k=1}^{\infty} \tau_k \delta_{\omega_k}$
- Atoms $\omega_k$ drawn from a base measure $B_0$
- $\tau_k$: points of a Poisson random measure with a general/unrestricted Lévy density $\rho(s \mid \omega)$
- $Z_i = \sum_{k=1}^{\infty} A_{i,k} \delta_{\omega_k}$, with $A_{i,k} \sim G_A(da \mid \tau_k)$

In the generalized IBP proposed here, $A_{i,k}$ is not necessarily binary: it can be a more general random variable, or even vector-valued
Can also think of $A_{i,k} = b_{i,k} A'_{i,k}$, where $b_{i,k}$ is binary and $A'_{i,k}$ is a general random variable (akin to a spike-and-slab construction)
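A small illustrative draw (my construction, not the paper's): approximate the CRM $\mu$ by a finite truncation using the known stick-breaking representation of the IBP's beta-process weights, and attach non-binary Poisson scores $A_{i,k} \sim \mathrm{Poisson}(b\,\tau_k)$ as one concrete choice of $G_A$; all names and defaults below are mine:

import numpy as np

def sample_generalized_ibp(M, theta=2.0, b=3.0, K=50, rng=None):
    """Truncated draw from a generalized IBP: CRM weights tau_k via the
    stick-breaking representation of the IBP's beta-process weights
    (a known special case), and non-binary scores A_{i,k} ~ Poisson(b * tau_k).
    K is a finite truncation of the infinite sum."""
    rng = np.random.default_rng(rng)
    # tau_k = prod_{j<=k} v_j with v_j ~ Beta(theta, 1): the weights of mu
    v = rng.beta(theta, 1.0, size=K)
    tau = np.cumprod(v)
    # atoms omega_k from a base measure B_0 (here uniform on [0,1], arbitrary)
    omega = rng.uniform(0.0, 1.0, size=K)
    # scores: A_{i,k} | tau_k ~ G_A = Poisson(b * tau_k); zero means "feature absent"
    A = rng.poisson(b * tau, size=(M, K))
    return omega, tau, A

omega, tau, A = sample_generalized_ibp(M=5, rng=1)
print((A > 0).mean(axis=0)[:10])  # empirical feature-activation rates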

6 This Paper
Proposes IBP generalizations of the following form:
- $\mu = \sum_{k=1}^{\infty} \tau_k \delta_{\omega_k}$, where the $\tau_k$ are points of a Poisson random measure with Lévy density $\rho(s \mid \omega)$
- $Z_i = \sum_{k=1}^{\infty} A_{i,k} \delta_{\omega_k}$, with $A_{i,k} \sim G_A(da \mid \tau_k)$
The IBP generalizations work with any $G_A$ and any $\rho$, so long as $G_A$ places positive mass at 0 and the relevant integral of $\rho$ is finite (cf. the check on slide 15)
The constructions rely on Poisson Partition Calculus (PPC)
Conjugacy is not a requirement
No need for combinatorial arguments or long proofs

7 Notations
Recall $Z_1, \ldots, Z_M \mid \mu$, where $\mu = \sum_{k=1}^{\infty} \tau_k \delta_{\omega_k}$
Can express $\mu(d\omega) = \int_0^{\infty} s\, N(ds, d\omega)$, where $N = \sum_{k=1}^{\infty} \delta_{(\tau_k, \omega_k)}$ is a Poisson random measure (PRM) with mean intensity
$\mathbb{E}[N(ds, d\omega)] = \rho(s \mid \omega)\, ds\, B_0(d\omega) := \nu(ds, d\omega)$
Some notation:
- $N$ is written PRM($\rho B_0$), PRM($\nu$), or $\mathcal{P}(\nu)$
- $\mu$ is a completely random measure, CRM($\rho B_0$) or CRM($\nu$)
- $Z_1, \ldots, Z_M \mid \mu$ are written iid IBP($G_A \mid \mu$)
- The marginal distribution of each $Z_i$ is written IBP($A \mid \rho B_0$) or IBP($A, \nu$)

8 The General Scheme
Apply the Poisson calculus generally to:
- Describe $N \mid Z_1, \ldots, Z_M$
- Consequently describe $\mu \mid Z_1, \ldots, Z_M$
- Describe the marginal structure $P(Z_1, \ldots, Z_M)$ via integration
- Provide descriptions of the marginal $Z = \sum_{k=1}^{\xi(\varphi)} X_k \delta_{\omega_k}$ and introduce variables $(H_i, X_i)$ leading to tractable sampling for many $\rho$
- Describe $Z_{M+1} \mid Z_1, \ldots, Z_M$

9 Two Ingredients from Poisson Partition Calculus
Laplace functional exponential change of measure (updating a Poisson random measure):
$e^{-N(f)}\, \mathcal{P}(dN \mid \nu) = \mathcal{P}(dN \mid \nu_f)\, e^{-\Psi(f)}$
where $\nu_f(ds, d\omega) = e^{-f(s,\omega)} \rho(s \mid \omega)\, ds\, B_0(d\omega)$, $e^{-\Psi(f)} = \mathbb{E}_{\nu}[e^{-N(f)}]$, and
$\Psi(f) = \int_{\Omega} \int_0^{\infty} \bigl(1 - e^{-f(s,\omega)}\bigr)\, \rho(s \mid \omega)\, ds\, B_0(d\omega)$
Using $\nu_f$, apply the moment measure calculation
$\Bigl[\prod_{i=1}^{L} N(dW_i)\Bigr] \mathcal{P}(dN \mid \nu_f) = \mathcal{P}(dN \mid \nu_f, \mathbf{s}, \boldsymbol{\omega}) \prod_{l=1}^{K} e^{-f(s_l, \omega_l)}\, \nu(ds_l, d\omega_l)$
where $(s_l, \omega_l)$, $l = 1, \ldots, K$, are the distinct points among $W_1, \ldots, W_L$, and $\mathcal{P}(dN \mid \nu_f, \mathbf{s}, \boldsymbol{\omega})$ corresponds to the law of the random measure $\tilde{N} + \sum_{l=1}^{K} \delta_{(s_l, \omega_l)}$, with $\tilde{N}$ also PRM($\nu_f$)
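As a sanity check (my own specialization, not on the slides): take the standard-IBP quantities $\pi_A(s) = s$ and $\rho(s) = \theta s^{-1}\mathbb{1}_{\{0<s<1\}}$, with the tilt $f_M(s,\omega) = -M \log(1 - \pi_A(s))$ that is implicit in slide 10's $\Psi(f_M)$, so that $e^{-f_M(s,\omega)} = (1-s)^M$. Then

$\Psi(f_M) = \int_0^1 \bigl(1 - (1-s)^M\bigr)\, \theta s^{-1}\, ds = \theta \int_0^1 \sum_{i=1}^{M} (1-s)^{i-1}\, ds = \theta \sum_{i=1}^{M} \frac{1}{i}$

using the finite geometric sum $(1-(1-s)^M)/s = \sum_{i=1}^{M}(1-s)^{i-1}$, so $e^{-\Psi(f_M)} = e^{-\theta \sum_{i=1}^{M} 1/i}$, the familiar harmonic-number constant in the standard IBP marginal.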

10 Joint Distribution
Joint distribution of $A$ (or, equivalently, of $Z = [Z_1; \ldots; Z_M]$):
$\prod_{i=1}^{M} \prod_{j=1}^{\infty} \bigl[G_A(da_{i,j} \mid \tau_j)\bigr]^{\mathbb{1}\{a_{i,j} \neq 0\}} \bigl[1 - P(A \neq 0 \mid \tau_j)\bigr]^{1 - \mathbb{1}\{a_{i,j} \neq 0\}}$
Setting $\pi_A(s) = P(A \neq 0 \mid s)$, the above becomes
$e^{M \sum_{j=1}^{\infty} \log(1 - \pi_A(\tau_j))} \prod_{i=1}^{M} \prod_{j=1}^{\infty} \Bigl[\frac{G_A(da_{i,j} \mid \tau_j)}{1 - \pi_A(\tau_j)}\Bigr]^{\mathbb{1}\{a_{i,j} \neq 0\}}$
After some further simplifications, the joint distribution of $((Z_1, \ldots, Z_M), (J_1, \ldots, J_K), N)$, where $(J_1 = s_1, \ldots, J_K = s_K)$ are the unique jumps, becomes
$\mathcal{P}(dN \mid \nu_{f_M}, \mathbf{s}, \boldsymbol{\omega})\, e^{-\Psi(f_M)} \prod_{l=1}^{K} [1 - \pi_A(s_l)]^M \rho(s_l \mid \omega_l) B_0(d\omega_l) \prod_{i=1}^{M} h_{i,l}(s_l)$
where $h_{i,l}(s) = \Bigl[\frac{G_A(da_{i,l} \mid s)}{1 - \pi_A(s)}\Bigr]^{\mathbb{1}\{a_{i,l} \neq 0\}}$ and $\Psi(f_M) = \int_{\Omega} \int_0^{\infty} \bigl(1 - [1 - \pi_A(s)]^M\bigr) \rho(s \mid \omega)\, ds\, B_0(d\omega)$

11 Posterior and Marginal Distribution
Integrating out $N$ leads to the joint distribution of $((Z_i), (J_l))$:
$e^{-\Psi(f_M)} \prod_{l=1}^{K} [1 - \pi_A(s_l)]^M \rho(s_l \mid \omega_l) B_0(d\omega_l) \prod_{i=1}^{M} h_{i,l}(s_l)$
$(J_l) \mid (Z_i)$ are conditionally independent, with densities
$P_{l,M}(J_l \in ds) \propto [1 - \pi_A(s)]^M \rho(s \mid \omega_l) \prod_{i=1}^{M} h_{i,l}(s)\, ds$
... and the marginal of $(Z_1, \ldots, Z_M)$ is
$P(Z_1, \ldots, Z_M) = e^{-\Psi(f_M)} \prod_{l=1}^{K} \Bigl[\int_0^{\infty} [1 - \pi_A(s)]^M \rho(s \mid \omega_l) \prod_{i=1}^{M} h_{i,l}(s)\, ds\Bigr] B_0(d\omega_l)$
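A concrete check (my own specialization): in the Bernoulli case of slide 13, $\pi_A(s) = s$, $\rho(s) = \theta s^{-1}\mathbb{1}_{\{0<s<1\}}$, and $\prod_{i=1}^{M} h_{i,l}(s) = (s/(1-s))^{m_l}$ with $m_l = \sum_i a_{i,l}$. Each per-feature integral is then a Beta function,

$\int_0^1 (1-s)^M\, \theta s^{-1} \Bigl(\frac{s}{1-s}\Bigr)^{m_l} ds = \theta \int_0^1 s^{m_l - 1}(1-s)^{M - m_l}\, ds = \theta\, \frac{\Gamma(m_l)\,\Gamma(M - m_l + 1)}{\Gamma(M + 1)}$

so $P(Z_1, \ldots, Z_M) = e^{-\theta \sum_{i=1}^{M} 1/i} \prod_{l=1}^{K} \theta\, \frac{\Gamma(m_l)\Gamma(M - m_l + 1)}{\Gamma(M + 1)}\, B_0(d\omega_l)$, which recovers the standard IBP marginal (up to the usual column-labelling conventions).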

12 Main Result
Notation: $\nu_{f_M}(ds, d\omega) := \rho_M(s \mid \omega) B_0(d\omega)\, ds$, where $\rho_M(s \mid \omega) = [1 - \pi_A(s)]^M \rho(s \mid \omega)$
Theorem: Suppose $Z_1, \ldots, Z_M \mid \mu$ are iid IBP($G_A \mid \mu$) and $\mu$ is CRM($\rho B_0$). Then:
- The posterior distribution of $N \mid Z_1, \ldots, Z_M$ is equivalent to the distribution of $N_M + \sum_{l=1}^{K} \delta_{(J_l, \omega_l)}$, where $N_M$ is PRM($\rho_M B_0$) and the distribution of the $(J_l)$ is as on the previous slide
- The posterior distribution of $\mu \mid Z_1, \ldots, Z_M$ is equivalent to the distribution of $\mu_M + \sum_{l=1}^{K} J_l \delta_{\omega_l}$, where $\mu_M$ is CRM($\rho_M B_0$)
- The marginal distribution of $(Z_1, \ldots, Z_M)$ is as on the previous slide
Proposition: Suppose $Z_1, \ldots, Z_M, Z_{M+1} \mid \mu$ are iid IBP($G_A \mid \mu$). Then
$Z_{M+1} \mid Z_1, \ldots, Z_M \overset{d}{=} \tilde{Z}_{M+1} + \sum_{l=1}^{K} A_l \delta_{\omega_l}$
where $\tilde{Z}_{M+1}$ is IBP($A, \rho_M B_0$) and each $A_l \mid J_l = s$ has distribution $G_A(da \mid s)$
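A small sketch of what the fixed-atom part of this posterior looks like in practice (my code, Bernoulli special case only): with $\pi_A(s) = s$ and $\rho(s) = \theta s^{-1}\mathbb{1}_{\{0<s<1\}}$, the jump density $\propto (1-s)^M\, \theta s^{-1}(s/(1-s))^{m_l}$ is $\mathrm{Beta}(m_l,\ M - m_l + 1)$, so the weights of the observed features can be resampled in closed form:

import numpy as np

def sample_posterior_jumps(Z, rng=None):
    """Bernoulli / standard-IBP special case: given a binary matrix Z (M x K),
    sample the posterior jumps J_l of mu at the K observed atoms.
    The rho_M tilting makes J_l | Z ~ Beta(m_l, M - m_l + 1)."""
    rng = np.random.default_rng(rng)
    M, K = Z.shape
    m = Z.sum(axis=0)              # m_l = number of objects with feature l
    return rng.beta(m, M - m + 1)  # one jump per observed feature

Z = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]])
print(sample_posterior_jumps(Z, rng=0))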

13 Special Cases: Bernoulli, Poisson, Negative-Binomial
Bernoulli: $G_A(\cdot \mid s) = \mathrm{Bernoulli}(s)$, so $P(A = 0 \mid s) = 1 - \pi_A(s) = 1 - s$
Poisson: $G_A(\cdot \mid s) = \mathrm{Poisson}(bs)$, so $P(A = 0 \mid s) = 1 - \pi_A(s) = e^{-bs}$
Negative Binomial: $G_A(\cdot \mid s) = \mathrm{NB}(r, s)$, where $r$ is the overdispersion parameter, so $P(A = 0 \mid s) = 1 - \pi_A(s) = (1 - s)^r$
Setting $c_{l,M} = \sum_{i=1}^{M} a_{i,l}$, the product $\prod_{i=1}^{M} h_{i,l}(s)$ is given by
$\Bigl(\frac{s}{1-s}\Bigr)^{c_{l,M}}, \qquad \frac{b^{c_{l,M}}\, s^{c_{l,M}}}{\prod_{i=1}^{M} a_{i,l}!}, \qquad s^{c_{l,M}} \prod_{i=1}^{M} \binom{a_{i,l} + r - 1}{a_{i,l}}$
for the Bernoulli, Poisson, and NB cases, respectively (for the homogeneous case $\rho(s)$; the case $\rho(s \mid \omega)$ can be worked out likewise)
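For reference, a few throwaway helpers (my own, assuming the homogeneous case and an integer overdispersion $r$ so that math.comb suffices; non-integer $r$ would need Gamma functions) that evaluate $\pi_A(s)$ and $\prod_{i=1}^{M} h_{i,l}(s)$ for one feature column:

import numpy as np
from math import comb, factorial

# Each helper takes one column of scores a (length M) and a weight s,
# and returns (pi_A(s), prod_i h_{i,l}(s)) for that case.

def bernoulli_case(a, s):
    return s, (s / (1.0 - s)) ** a.sum()

def poisson_case(a, s, b=1.0):
    pi_A = 1.0 - np.exp(-b * s)
    c = a.sum()
    prod_h = (b * s) ** c / np.prod([factorial(int(x)) for x in a])
    return pi_A, prod_h

def negbin_case(a, s, r=2):
    pi_A = 1.0 - (1.0 - s) ** r
    c = a.sum()
    prod_h = s ** c * np.prod([comb(int(x) + r - 1, int(x)) for x in a])
    return pi_A, prod_h

a = np.array([2, 0, 1, 3])   # one column of scores for M = 4 objects
print(poisson_case(a, s=0.3, b=2.0))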

14 Describing posterior, marginal, and other quantities
- Specify $G_A(\cdot \mid s)$
- Identify $\pi_A(s) = P(A \neq 0 \mid s)$
- For each $M = 0, 1, 2, \ldots$, define $\rho_M(s \mid \omega) = [1 - \pi_A(s)]^M \rho(s \mid \omega)$
- Specify $N_M \sim$ PRM($\rho_M B_0$), $\mu_M \sim$ CRM($\rho_M B_0$), and $Z_{M+1} \sim$ IBP($A, \rho_M B_0$)
- Identify $h_{i,l}(s) = \Bigl[\frac{G_A(da_{i,l} \mid s)}{1 - \pi_A(s)}\Bigr]^{\mathbb{1}\{a_{i,l} \neq 0\}}$ and, if possible, simplify $\prod_{i=1}^{M} h_{i,l}(s)$
- Use these to get the distribution of the $(J_l)$ and the marginal of $(Z_1, \ldots, Z_M)$

15 Steps for the sequential generalized IBP construction
For each $M = 0, 1, 2, \ldots$, define $\rho_M(s) = [1 - \pi_A(s)]^M \rho(s)$
Calculate and check that $\varphi_{M+1}(\rho) = \int_0^{\infty} \pi_A(s)\, \rho_M(s)\, ds < \infty$
Then, if $Z_{M+1}$ is IBP($A, \rho_M B_0$),
$Z_{M+1} = \sum_{k=1}^{\xi(\varphi)} X_k \delta_{\omega_k}$
where $\varphi := \varphi_{M+1}(\rho)$, $\xi(\varphi)$ is a Poisson($\varphi$) r.v., and the $(X_k)$ are taken from an IBP($A, \rho_M$) process $((X_i, H_i))$:
- $X_i \mid H_i = s$ has distribution $\frac{\mathbb{1}\{a \neq 0\}\, G_A(da \mid s)}{\pi_A(s)}$
- $X_i$ has marginal distribution $P(X_i \in da) = \mathbb{1}\{a \neq 0\} \frac{\int_0^{\infty} G_A(da \mid s)\, \rho_M(s)\, ds}{\varphi_{M+1}(\rho)}$
- $H_i$ has marginal density $\frac{\pi_A(s)\, \rho_M(s)}{\varphi_{M+1}(\rho)}$
- the distribution of $H_i \mid X_i = a$ is $P(H_i \in ds \mid X_i = a) = \frac{\mathbb{1}\{a \neq 0\}\, G_A(da \mid s)\, \rho_M(s)\, ds}{\int_0^{\infty} G_A(da \mid v)\, \rho_M(v)\, dv}$
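To see these quantities in a familiar case (my own check), take the Bernoulli/standard-IBP specialization $\pi_A(s) = s$, $\rho(s) = \theta s^{-1}\mathbb{1}_{\{0<s<1\}}$:

$\varphi_{M+1}(\rho) = \int_0^1 \pi_A(s)\, \rho_M(s)\, ds = \int_0^1 s\, (1-s)^M\, \theta s^{-1}\, ds = \theta \int_0^1 (1-s)^M\, ds = \frac{\theta}{M+1}$

so customer $M+1$ picks $\mathrm{Poisson}(\theta/(M+1))$ new dishes, exactly the standard IBP rate, and the new-dish weights have density $\pi_A(s)\rho_M(s)/\varphi_{M+1}(\rho) = (M+1)(1-s)^M$, i.e. $H_i \sim \mathrm{Beta}(1, M+1)$.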

16 Generalized IBP: The Sequential Generative Process
Sequential generative process for IBP($A, \rho B_0$):
Customer 1 selects dishes and scores them according to IBP($A, \rho B_0$):
- Draw a Poisson($\varphi_1(\rho)$) number of dishes, $J$
- Draw $((\omega_1, H_1), \ldots, (\omega_J, H_J))$ iid, with the $\omega$'s from $B_0$ and the $H$'s from the density on the previous slide with $M = 0$
- Draw $X_i \mid H_i$, for $i = 1, \ldots, J$, or draw $X_i$ directly (if possible)
After $M$ customers have chosen $K$ dishes $\omega_1, \ldots, \omega_K$, customer $M + 1$:
- Selects or skips each existing dish $\omega_l$ and, if selected, scores it according to $A_l$, where $A_l \mid J_l$ has distribution $G_A(da \mid J_l)$; the probability that $A_l \mid J_l$ takes the value zero is $1 - \pi_A(J_l)$
- Also chooses and scores new dishes according to an IBP($A, \rho_M B_0$) process $Z_{M+1}$, with $\rho_M(s) = [1 - \pi_A(s)]^M \rho(s)$ (a simulation sketch of the whole process follows below)
Note: the specific cases where $A$ is Poisson and Negative Binomial are worked out in the paper (Sections 4.1 and 4.2)
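Below is a crude end-to-end simulation sketch of this sequential process (entirely my own; it approximates the new-dish weight density $\pi_A(s)\rho_M(s)$ on a grid rather than sampling it exactly, and the example plugs in Poisson scores with an IBP-type Lévy density; all names and defaults are mine):

import numpy as np

def sequential_generalized_ibp(M, pi_A, rho, sample_A_given_s, s_grid, rng=None):
    """Numerical sketch of the sequential generative process: existing dishes
    are re-scored via G_A(.|J_l), and new dishes come from an IBP(A, rho_M B_0)
    step whose weight density pi_A(s) * rho_M(s) is approximated on s_grid."""
    rng = np.random.default_rng(rng)
    ds = s_grid[1] - s_grid[0]
    dishes = []          # list of (omega, weight) pairs for observed dishes
    scores = []          # scores[i] = dict {dish index: nonzero score}
    for m in range(M):   # customer m+1 arrives after m previous customers
        row = {}
        # existing dishes: score A_l ~ G_A(.|weight); zero means skipped
        for l, (_, J) in enumerate(dishes):
            a = sample_A_given_s(J, rng)
            if a != 0:
                row[l] = a
        # new dishes: rho_m(s) = (1 - pi_A(s))^m * rho(s)
        w = pi_A(s_grid) * (1.0 - pi_A(s_grid)) ** m * rho(s_grid)
        phi = w.sum() * ds                        # approximates phi_{m+1}(rho)
        for _ in range(rng.poisson(phi)):
            H = rng.choice(s_grid, p=w / w.sum()) # H_i ~ pi_A * rho_m (normalized)
            a = 0
            while a == 0:                         # X_i | H_i: G_A conditioned on A != 0
                a = sample_A_given_s(H, rng)
            dishes.append((rng.uniform(), H))     # omega ~ B_0 (uniform, arbitrary)
            row[len(dishes) - 1] = a
        scores.append(row)
    return dishes, scores

# Example: Poisson scores G_A(.|s) = Poisson(2s), IBP-type Levy density (theta = 1)
dishes, scores = sequential_generalized_ibp(
    M=5,
    pi_A=lambda s: 1.0 - np.exp(-2.0 * s),
    rho=lambda s: 1.0 / s,                        # restricted to (0,1) by the grid
    sample_A_given_s=lambda s, rng: rng.poisson(2.0 * s),
    s_grid=np.linspace(1e-3, 1.0 - 1e-3, 400),
    rng=0,
)
print(len(dishes), scores[0])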

17 Multivariate Generalizations
Multivariate CRMs are also considered; this generalizes the IBP to vector-valued $A_{i,k}$
Consider $\mu_0 = (\mu_j,\ j = 1, \ldots, q)$ with $\mu_j = \sum_{k=1}^{\infty} p_{j,k} \delta_{\omega_k}$ (note: all the $\mu_j$'s are assumed to share the same base measure)
$\mu_{\bullet} = \sum_{j=1}^{q} \mu_j = \sum_{k=1}^{\infty} p_{\bullet,k} \delta_{\omega_k}$, with $p_{\bullet,k} = \sum_{j=1}^{q} p_{j,k}$
The multivariate CRM $\mu_0$ can be constructed from a Poisson random measure on $(\mathbb{R}_+^q, \Omega)$:
$N := \sum_{k=1}^{\infty} \delta_{(\mathbf{p}_{q,k}, \omega_k)}$, where $\mathbf{p}_{q,k} = (p_{1,k}, \ldots, p_{q,k})$
A simple case (Section 5.1 in the paper): a multinomial extension of the IBP. The joint probability of the $M \times K$ IBP matrix (each entry a multinomial vector) is
$\Bigl[\prod_{l=1}^{K} \prod_{j=1}^{q} \Bigl(\frac{p_{j,l}}{1 - p_{\bullet,l}}\Bigr)^{c_{j,l,M}}\Bigr] e^{M \sum_{k=1}^{\infty} \log(1 - p_{\bullet,k})}$
... akin to an IBP with a condiment. All the theorems generalize to this case too
Other multivariate generalizations are also possible (Section 5.2 in the paper)
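A quick consistency check (mine): for $q = 1$ the multinomial joint probability above collapses to the Bernoulli-IBP joint of slide 10, since

$e^{M \sum_{k=1}^{\infty} \log(1 - p_k)} \prod_{l=1}^{K} \Bigl(\frac{p_l}{1 - p_l}\Bigr)^{m_l} = \prod_{l=1}^{K} p_l^{m_l} (1 - p_l)^{M - m_l} \prod_{k\ \mathrm{unselected}} (1 - p_k)^{M}$

i.e. each selected feature contributes $p_l^{m_l}(1-p_l)^{M-m_l}$ and every unselected atom contributes $(1-p_k)^{M}$, which is the conditional likelihood of a binary IBP matrix given the weights.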

18 Multivariate Generalizations
A more general case (Section 5.2): let $A_0 := (A_1, \ldots, A_v) \in \mathbb{R}^v$ have distribution $G_{A_0}(\cdot \mid \mathbf{s}_q)$, where $\mathbf{s}_q \in \mathbb{R}_+^q$ and
$1 - \pi_{A_0}(\mathbf{s}_q) = P(A_1 = 0, \ldots, A_v = 0 \mid \mathbf{s}_q) > 0$
Define a vector-valued process $Z_0^{(i)} := (Z_1^{(i)}, \ldots, Z_v^{(i)})$, where
$Z_j^{(i)} = \sum_{k=1}^{\infty} A_{j,k}^{(i)} \delta_{\omega_k}$ and $Z_{\bullet}^{(i)} = \sum_{j=1}^{v} Z_j^{(i)} = \sum_{k=1}^{\infty} A_{\bullet,k}^{(i)} \delta_{\omega_k}$
For each $(i, k)$, $A_{0,k}^{(i)} := (A_{1,k}^{(i)}, \ldots, A_{v,k}^{(i)})$ is drawn independently from $G_{A_0}(\cdot \mid \mathbf{s}_{q,k})$, where the $\mathbf{s}_{q,k}$ are vector-valued points of a PRM with intensity $\rho_q(\cdot \mid \omega)$
We say that $Z_0^{(1)}, \ldots, Z_0^{(M)} \mid \mu_0$ are iid IBP($G_{A_0} \mid \mu_0$), with the likelihood being
$e^{M \sum_{j=1}^{\infty} \log(1 - \pi_{A_0}(\mathbf{s}_{q,j}))} \prod_{i=1}^{M} \prod_{j=1}^{\infty} \Bigl[\frac{G_{A_0}(d\mathbf{a}_{0,j}^{(i)} \mid \mathbf{s}_{q,j})}{1 - \pi_{A_0}(\mathbf{s}_{q,j})}\Bigr]^{\mathbb{1}\{\mathbf{a}_{i,j} \neq \mathbf{0}\}}$

19 Thanks! Questions?
