Poisson Latent Feature Calculus for Generalized Indian Buffet Processes

1 Poisson Latent Feature Calculus for Generalized Indian Buffet Processes
Lancelot F. James (arXiv:1411.2936 [math.ST], December 2014)
Discussion by: Piyush Rai
January 23, 2015

2 Overview of the paper
Proposes generalized notions of the Indian Buffet Process (IBP)
Allows principled extensions of the standard IBP to more general settings involving non-binary latent features
Provides a unified and (perhaps) simpler/cleaner treatment than some other recent attempts at such problems
Also allows extensions to multivariate latent features (each latent feature can be a vector: multinomial, or a general multivariate vector)
Based on ideas from Poisson Partition Calculus (proposed earlier by the same author)

3 The IBP
A nonparametric Bayesian prior on random binary matrices $Z = [Z_1; \ldots; Z_M]$ with $M$ rows and an unbounded number of columns
Prior denoted IBP($\theta$), $\theta > 0$, with latent features $\omega_k$, $k = 1, 2, \ldots$
Entry $(i, k) = 1$ if object $i$ has latent feature $k$, zero otherwise
Generative model: customers (objects) selecting dishes (latent features)
- Customer 1 selects Poisson($\theta$) dishes
- Customer $i$ selects each already-selected dish $k$ with probability $m_k / i$ (where $m_k$ is the number of previous customers who chose dish $k$), and Poisson($\theta / i$) new dishes
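To make the customers-and-dishes description concrete, here is a minimal simulation sketch of this sequential process (my own code, not from the paper; function and variable names are mine):

import numpy as np

def sample_ibp(M, theta, rng=None):
    """Sample an M-row binary matrix from the standard IBP(theta)
    via the sequential customers-and-dishes construction."""
    rng = np.random.default_rng(rng)
    dishes = []  # dishes[k] = list of customers (row indices) that chose dish k
    for i in range(1, M + 1):
        # existing dish k is chosen with probability m_k / i
        for eaters in dishes:
            if rng.random() < len(eaters) / i:
                eaters.append(i - 1)
        # Poisson(theta / i) brand-new dishes
        for _ in range(rng.poisson(theta / i)):
            dishes.append([i - 1])
    Z = np.zeros((M, len(dishes)), dtype=int)
    for k, eaters in enumerate(dishes):
        Z[eaters, k] = 1
    return Z

Z = sample_ibp(M=10, theta=2.0, rng=0)
print(Z.shape, Z.sum(axis=0))  # column sums m_k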

4-5 Generalization of the IBP
Standard IBP($\theta$): $Z_1, \ldots, Z_M \mid \mu$ iid, with $\mu = \sum_{k=1}^{\infty} p_k \delta_{\omega_k}$
- Atoms $\omega_k$ drawn from a base measure $B_0$
- $p_k$: points of a Poisson random measure with the restricted mean intensity $\rho(s) = \theta s^{-1} \mathbb{1}_{\{0 < s < 1\}}$
- $Z_i = \sum_{k=1}^{\infty} b_{i,k} \delta_{\omega_k}$, with $b_{i,k} \sim \mathrm{Bernoulli}(p_k)$

Generalized IBP: $Z_1, \ldots, Z_M \mid \mu$ iid, with $\mu = \sum_{k=1}^{\infty} \tau_k \delta_{\omega_k}$
- Atoms $\omega_k$ drawn from a base measure $B_0$
- $\tau_k$: points of a Poisson random measure with a general/unrestricted Lévy density $\rho(s \mid \omega)$
- $Z_i = \sum_{k=1}^{\infty} A_{i,k} \delta_{\omega_k}$, with $A_{i,k} \sim G_A(da \mid \tau_k)$

In the generalized IBP proposed here, $A_{i,k}$ is not necessarily binary: it can be a more general random variable, or even vector-valued
Can also think of $A_{i,k} = b_{i,k} A'_{i,k}$, where $b_{i,k}$ is binary and $A'_{i,k}$ is a general random variable (akin to a spike-and-slab construction)
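A small illustrative draw (my construction, not the paper's): approximate the CRM $\mu$ by a finite truncation using the known stick-breaking representation of the IBP's beta-process weights, and attach non-binary Poisson scores $A_{i,k} \sim \mathrm{Poisson}(b\,\tau_k)$ as one concrete choice of $G_A$; all names and defaults below are mine:

import numpy as np

def sample_generalized_ibp(M, theta=2.0, b=3.0, K=50, rng=None):
    """Truncated draw from a generalized IBP: CRM weights tau_k via the
    stick-breaking representation of the IBP's beta-process weights
    (a known special case), and non-binary scores A_{i,k} ~ Poisson(b * tau_k).
    K is a finite truncation of the infinite sum."""
    rng = np.random.default_rng(rng)
    # tau_k = prod_{j<=k} v_j with v_j ~ Beta(theta, 1): the weights of mu
    v = rng.beta(theta, 1.0, size=K)
    tau = np.cumprod(v)
    # atoms omega_k from a base measure B_0 (here uniform on [0,1], arbitrary)
    omega = rng.uniform(0.0, 1.0, size=K)
    # scores: A_{i,k} | tau_k ~ G_A = Poisson(b * tau_k); zero means "feature absent"
    A = rng.poisson(b * tau, size=(M, K))
    return omega, tau, A

omega, tau, A = sample_generalized_ibp(M=5, rng=1)
print((A > 0).mean(axis=0)[:10])  # empirical feature-activation rates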

6 This Paper
Proposes IBP generalizations of the following form:
- $\mu = \sum_{k=1}^{\infty} \tau_k \delta_{\omega_k}$, where the $\tau_k$ are points of a Poisson random measure with Lévy density $\rho(s \mid \omega)$
- $Z_i = \sum_{k=1}^{\infty} A_{i,k} \delta_{\omega_k}$, with $A_{i,k} \sim G_A(da \mid \tau_k)$
The IBP generalizations work with any $G_A$ and any $\rho$, so long as $G_A$ places positive mass at 0 and the relevant integral of $\rho$ is finite (cf. the check on slide 15)
The constructions rely on Poisson Partition Calculus (PPC)
Conjugacy is not a requirement
No need for combinatorial arguments or long proofs

7 Notations
Recall $Z_1, \ldots, Z_M \mid \mu$, where $\mu = \sum_{k=1}^{\infty} \tau_k \delta_{\omega_k}$
Can express $\mu(d\omega) = \int_0^{\infty} s\, N(ds, d\omega)$, where $N = \sum_{k=1}^{\infty} \delta_{(\tau_k, \omega_k)}$ is a Poisson random measure (PRM) with mean intensity
$\mathbb{E}[N(ds, d\omega)] = \rho(s \mid \omega)\, ds\, B_0(d\omega) := \nu(ds, d\omega)$
Some notation:
- $N$ is written PRM($\rho B_0$), PRM($\nu$), or $\mathcal{P}(\nu)$
- $\mu$ is a completely random measure, CRM($\rho B_0$) or CRM($\nu$)
- $Z_1, \ldots, Z_M \mid \mu$ are written iid IBP($G_A \mid \mu$)
- The marginal distribution of each $Z_i$ is written IBP($A \mid \rho B_0$) or IBP($A, \nu$)

8 The General Scheme
Apply the Poisson calculus generally to:
- Describe $N \mid Z_1, \ldots, Z_M$
- Consequently describe $\mu \mid Z_1, \ldots, Z_M$
- Describe the marginal structure $P(Z_1, \ldots, Z_M)$ via integration
- Provide descriptions of the marginal $Z = \sum_{k=1}^{\xi(\varphi)} X_k \delta_{\omega_k}$ and introduce variables $(H_i, X_i)$ leading to tractable sampling for many $\rho$
- Describe $Z_{M+1} \mid Z_1, \ldots, Z_M$

9 Two Ingredients from Poisson Partition Calculus
Laplace functional exponential change of measure (updating a Poisson random measure):
$e^{-N(f)}\, \mathcal{P}(dN \mid \nu) = \mathcal{P}(dN \mid \nu_f)\, e^{-\Psi(f)}$
where $\nu_f(ds, d\omega) = e^{-f(s,\omega)} \rho(s \mid \omega)\, ds\, B_0(d\omega)$, $e^{-\Psi(f)} = \mathbb{E}_{\nu}[e^{-N(f)}]$, and
$\Psi(f) = \int_{\Omega} \int_0^{\infty} \bigl(1 - e^{-f(s,\omega)}\bigr)\, \rho(s \mid \omega)\, ds\, B_0(d\omega)$
Using $\nu_f$, apply the moment measure calculation
$\Bigl[\prod_{i=1}^{L} N(dW_i)\Bigr] \mathcal{P}(dN \mid \nu_f) = \mathcal{P}(dN \mid \nu_f, \mathbf{s}, \boldsymbol{\omega}) \prod_{l=1}^{K} e^{-f(s_l, \omega_l)}\, \nu(ds_l, d\omega_l)$
where $(s_l, \omega_l)$, $l = 1, \ldots, K$, are the distinct points among $W_1, \ldots, W_L$, and $\mathcal{P}(dN \mid \nu_f, \mathbf{s}, \boldsymbol{\omega})$ corresponds to the law of the random measure $\tilde{N} + \sum_{l=1}^{K} \delta_{(s_l, \omega_l)}$, with $\tilde{N}$ also PRM($\nu_f$)
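As a sanity check (my own specialization, not on the slides): take the standard-IBP quantities $\pi_A(s) = s$ and $\rho(s) = \theta s^{-1}\mathbb{1}_{\{0<s<1\}}$, with the tilt $f_M(s,\omega) = -M \log(1 - \pi_A(s))$ that is implicit in slide 10's $\Psi(f_M)$, so that $e^{-f_M(s,\omega)} = (1-s)^M$. Then

$\Psi(f_M) = \int_0^1 \bigl(1 - (1-s)^M\bigr)\, \theta s^{-1}\, ds = \theta \int_0^1 \sum_{i=1}^{M} (1-s)^{i-1}\, ds = \theta \sum_{i=1}^{M} \frac{1}{i}$

using the finite geometric sum $(1-(1-s)^M)/s = \sum_{i=1}^{M}(1-s)^{i-1}$, so $e^{-\Psi(f_M)} = e^{-\theta \sum_{i=1}^{M} 1/i}$, the familiar harmonic-number constant in the standard IBP marginal.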

10 Joint Distribution
Joint distribution of $A$ (or, equivalently, of $Z = [Z_1; \ldots; Z_M]$):
$\prod_{i=1}^{M} \prod_{j=1}^{\infty} \bigl[G_A(da_{i,j} \mid \tau_j)\bigr]^{\mathbb{1}\{a_{i,j} \neq 0\}} \bigl[1 - P(A \neq 0 \mid \tau_j)\bigr]^{1 - \mathbb{1}\{a_{i,j} \neq 0\}}$
Setting $\pi_A(s) = P(A \neq 0 \mid s)$, the above becomes
$e^{M \sum_{j=1}^{\infty} \log(1 - \pi_A(\tau_j))} \prod_{i=1}^{M} \prod_{j=1}^{\infty} \Bigl[\frac{G_A(da_{i,j} \mid \tau_j)}{1 - \pi_A(\tau_j)}\Bigr]^{\mathbb{1}\{a_{i,j} \neq 0\}}$
After some further simplifications, the joint distribution of $((Z_1, \ldots, Z_M), (J_1, \ldots, J_K), N)$, where $(J_1 = s_1, \ldots, J_K = s_K)$ are the unique jumps, becomes
$\mathcal{P}(dN \mid \nu_{f_M}, \mathbf{s}, \boldsymbol{\omega})\, e^{-\Psi(f_M)} \prod_{l=1}^{K} [1 - \pi_A(s_l)]^M \rho(s_l \mid \omega_l) B_0(d\omega_l) \prod_{i=1}^{M} h_{i,l}(s_l)$
where $h_{i,l}(s) = \Bigl[\frac{G_A(da_{i,l} \mid s)}{1 - \pi_A(s)}\Bigr]^{\mathbb{1}\{a_{i,l} \neq 0\}}$ and $\Psi(f_M) = \int_{\Omega} \int_0^{\infty} \bigl(1 - [1 - \pi_A(s)]^M\bigr) \rho(s \mid \omega)\, ds\, B_0(d\omega)$

11 Posterior and Marginal Distribution
Integrating out $N$ leads to the joint distribution of $((Z_i), (J_l))$:
$e^{-\Psi(f_M)} \prod_{l=1}^{K} [1 - \pi_A(s_l)]^M \rho(s_l \mid \omega_l) B_0(d\omega_l) \prod_{i=1}^{M} h_{i,l}(s_l)$
$(J_l) \mid (Z_i)$ are conditionally independent, with densities
$P_{l,M}(J_l \in ds) \propto [1 - \pi_A(s)]^M \rho(s \mid \omega_l) \prod_{i=1}^{M} h_{i,l}(s)\, ds$
... and the marginal of $(Z_1, \ldots, Z_M)$ is
$P(Z_1, \ldots, Z_M) = e^{-\Psi(f_M)} \prod_{l=1}^{K} \Bigl[\int_0^{\infty} [1 - \pi_A(s)]^M \rho(s \mid \omega_l) \prod_{i=1}^{M} h_{i,l}(s)\, ds\Bigr] B_0(d\omega_l)$
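A concrete check (my own specialization): in the Bernoulli case of slide 13, $\pi_A(s) = s$, $\rho(s) = \theta s^{-1}\mathbb{1}_{\{0<s<1\}}$, and $\prod_{i=1}^{M} h_{i,l}(s) = (s/(1-s))^{m_l}$ with $m_l = \sum_i a_{i,l}$. Each per-feature integral is then a Beta function,

$\int_0^1 (1-s)^M\, \theta s^{-1} \Bigl(\frac{s}{1-s}\Bigr)^{m_l} ds = \theta \int_0^1 s^{m_l - 1}(1-s)^{M - m_l}\, ds = \theta\, \frac{\Gamma(m_l)\,\Gamma(M - m_l + 1)}{\Gamma(M + 1)}$

so $P(Z_1, \ldots, Z_M) = e^{-\theta \sum_{i=1}^{M} 1/i} \prod_{l=1}^{K} \theta\, \frac{\Gamma(m_l)\Gamma(M - m_l + 1)}{\Gamma(M + 1)}\, B_0(d\omega_l)$, which recovers the standard IBP marginal (up to the usual column-labelling conventions).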

12 Main Result
Notation: $\nu_{f_M}(ds, d\omega) := \rho_M(s \mid \omega) B_0(d\omega)\, ds$, where $\rho_M(s \mid \omega) = [1 - \pi_A(s)]^M \rho(s \mid \omega)$
Theorem: Suppose $Z_1, \ldots, Z_M \mid \mu$ are iid IBP($G_A \mid \mu$) and $\mu$ is CRM($\rho B_0$). Then:
- The posterior distribution of $N \mid Z_1, \ldots, Z_M$ is equivalent to the distribution of $N_M + \sum_{l=1}^{K} \delta_{(J_l, \omega_l)}$, where $N_M$ is PRM($\rho_M B_0$) and the distribution of the $(J_l)$ is as on the previous slide
- The posterior distribution of $\mu \mid Z_1, \ldots, Z_M$ is equivalent to the distribution of $\mu_M + \sum_{l=1}^{K} J_l \delta_{\omega_l}$, where $\mu_M$ is CRM($\rho_M B_0$)
- The marginal distribution of $(Z_1, \ldots, Z_M)$ is as on the previous slide
Proposition: Suppose $Z_1, \ldots, Z_M, Z_{M+1} \mid \mu$ are iid IBP($G_A \mid \mu$). Then
$Z_{M+1} \mid Z_1, \ldots, Z_M \overset{d}{=} \tilde{Z}_{M+1} + \sum_{l=1}^{K} A_l \delta_{\omega_l}$
where $\tilde{Z}_{M+1}$ is IBP($A, \rho_M B_0$) and each $A_l \mid J_l = s$ has distribution $G_A(da \mid s)$
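A small sketch of what the fixed-atom part of this posterior looks like in practice (my code, Bernoulli special case only): with $\pi_A(s) = s$ and $\rho(s) = \theta s^{-1}\mathbb{1}_{\{0<s<1\}}$, the jump density $\propto (1-s)^M\, \theta s^{-1}(s/(1-s))^{m_l}$ is $\mathrm{Beta}(m_l,\ M - m_l + 1)$, so the weights of the observed features can be resampled in closed form:

import numpy as np

def sample_posterior_jumps(Z, rng=None):
    """Bernoulli / standard-IBP special case: given a binary matrix Z (M x K),
    sample the posterior jumps J_l of mu at the K observed atoms.
    The rho_M tilting makes J_l | Z ~ Beta(m_l, M - m_l + 1)."""
    rng = np.random.default_rng(rng)
    M, K = Z.shape
    m = Z.sum(axis=0)              # m_l = number of objects with feature l
    return rng.beta(m, M - m + 1)  # one jump per observed feature

Z = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]])
print(sample_posterior_jumps(Z, rng=0))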

13 Special Cases: Bernoulli, Poisson, Negative-Binomial
Bernoulli: $G_A(\cdot \mid s) = \mathrm{Bernoulli}(s)$, so $P(A = 0 \mid s) = 1 - \pi_A(s) = 1 - s$
Poisson: $G_A(\cdot \mid s) = \mathrm{Poisson}(bs)$, so $P(A = 0 \mid s) = 1 - \pi_A(s) = e^{-bs}$
Negative Binomial: $G_A(\cdot \mid s) = \mathrm{NB}(r, s)$, where $r$ is the overdispersion parameter, so $P(A = 0 \mid s) = 1 - \pi_A(s) = (1 - s)^r$
Setting $c_{l,M} = \sum_{i=1}^{M} a_{i,l}$, the product $\prod_{i=1}^{M} h_{i,l}(s)$ is given by
$\Bigl(\frac{s}{1-s}\Bigr)^{c_{l,M}}, \qquad \frac{b^{c_{l,M}}\, s^{c_{l,M}}}{\prod_{i=1}^{M} a_{i,l}!}, \qquad s^{c_{l,M}} \prod_{i=1}^{M} \binom{a_{i,l} + r - 1}{a_{i,l}}$
for the Bernoulli, Poisson, and NB cases, respectively (for the homogeneous case $\rho(s)$; the case $\rho(s \mid \omega)$ can be worked out likewise)
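For reference, a few throwaway helpers (my own, assuming the homogeneous case and an integer overdispersion $r$ so that math.comb suffices; non-integer $r$ would need Gamma functions) that evaluate $\pi_A(s)$ and $\prod_{i=1}^{M} h_{i,l}(s)$ for one feature column:

import numpy as np
from math import comb, factorial

# Each helper takes one column of scores a (length M) and a weight s,
# and returns (pi_A(s), prod_i h_{i,l}(s)) for that case.

def bernoulli_case(a, s):
    return s, (s / (1.0 - s)) ** a.sum()

def poisson_case(a, s, b=1.0):
    pi_A = 1.0 - np.exp(-b * s)
    c = a.sum()
    prod_h = (b * s) ** c / np.prod([factorial(int(x)) for x in a])
    return pi_A, prod_h

def negbin_case(a, s, r=2):
    pi_A = 1.0 - (1.0 - s) ** r
    c = a.sum()
    prod_h = s ** c * np.prod([comb(int(x) + r - 1, int(x)) for x in a])
    return pi_A, prod_h

a = np.array([2, 0, 1, 3])   # one column of scores for M = 4 objects
print(poisson_case(a, s=0.3, b=2.0))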

14 Describing posterior, marginal, and other quantities
- Specify $G_A(\cdot \mid s)$
- Identify $\pi_A(s) = P(A \neq 0 \mid s)$
- For each $M = 0, 1, 2, \ldots$, define $\rho_M(s \mid \omega) = [1 - \pi_A(s)]^M \rho(s \mid \omega)$
- Specify $N_M \sim$ PRM($\rho_M B_0$), $\mu_M \sim$ CRM($\rho_M B_0$), and $Z_{M+1} \sim$ IBP($A, \rho_M B_0$)
- Identify $h_{i,l}(s) = \Bigl[\frac{G_A(da_{i,l} \mid s)}{1 - \pi_A(s)}\Bigr]^{\mathbb{1}\{a_{i,l} \neq 0\}}$ and, if possible, simplify $\prod_{i=1}^{M} h_{i,l}(s)$
- Use these to get the distribution of the $(J_l)$ and the marginal of $(Z_1, \ldots, Z_M)$

15 Steps for the sequential generalized IBP construction
For each $M = 0, 1, 2, \ldots$, define $\rho_M(s) = [1 - \pi_A(s)]^M \rho(s)$
Calculate and check that $\varphi_{M+1}(\rho) = \int_0^{\infty} \pi_A(s)\, \rho_M(s)\, ds < \infty$
Then, if $Z_{M+1}$ is IBP($A, \rho_M B_0$),
$Z_{M+1} = \sum_{k=1}^{\xi(\varphi)} X_k \delta_{\omega_k}$
where $\varphi := \varphi_{M+1}(\rho)$, $\xi(\varphi)$ is a Poisson($\varphi$) r.v., and the $(X_k)$ are taken from an IBP($A, \rho_M$) process $((X_i, H_i))$:
- $X_i \mid H_i = s$ has distribution $\frac{\mathbb{1}\{a \neq 0\}\, G_A(da \mid s)}{\pi_A(s)}$
- $X_i$ has marginal distribution $P(X_i \in da) = \mathbb{1}\{a \neq 0\} \frac{\int_0^{\infty} G_A(da \mid s)\, \rho_M(s)\, ds}{\varphi_{M+1}(\rho)}$
- $H_i$ has marginal density $\frac{\pi_A(s)\, \rho_M(s)}{\varphi_{M+1}(\rho)}$
- the distribution of $H_i \mid X_i = a$ is $P(H_i \in ds \mid X_i = a) = \frac{\mathbb{1}\{a \neq 0\}\, G_A(da \mid s)\, \rho_M(s)\, ds}{\int_0^{\infty} G_A(da \mid v)\, \rho_M(v)\, dv}$
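To see these quantities in a familiar case (my own check), take the Bernoulli/standard-IBP specialization $\pi_A(s) = s$, $\rho(s) = \theta s^{-1}\mathbb{1}_{\{0<s<1\}}$:

$\varphi_{M+1}(\rho) = \int_0^1 \pi_A(s)\, \rho_M(s)\, ds = \int_0^1 s\, (1-s)^M\, \theta s^{-1}\, ds = \theta \int_0^1 (1-s)^M\, ds = \frac{\theta}{M+1}$

so customer $M+1$ picks $\mathrm{Poisson}(\theta/(M+1))$ new dishes, exactly the standard IBP rate, and the new-dish weights have density $\pi_A(s)\rho_M(s)/\varphi_{M+1}(\rho) = (M+1)(1-s)^M$, i.e. $H_i \sim \mathrm{Beta}(1, M+1)$.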

16 Generalized IBP: The Sequential Generative Process
Sequential generative process for IBP($A, \rho B_0$):
Customer 1 selects dishes and scores them according to IBP($A, \rho B_0$):
- Draw a Poisson($\varphi_1(\rho)$) number of dishes, $J$
- Draw $((\omega_1, H_1), \ldots, (\omega_J, H_J))$ iid, with the $\omega$'s from $B_0$ and the $H$'s from the density on the previous slide with $M = 0$
- Draw $X_i \mid H_i$, for $i = 1, \ldots, J$, or draw $X_i$ directly (if possible)
After $M$ customers have chosen $K$ dishes $\omega_1, \ldots, \omega_K$, customer $M + 1$:
- Selects or skips each existing dish $\omega_l$ and, if selected, scores it according to $A_l$, where $A_l \mid J_l$ has distribution $G_A(da \mid J_l)$; the probability that $A_l \mid J_l$ takes the value zero is $1 - \pi_A(J_l)$
- Also chooses and scores new dishes according to an IBP($A, \rho_M B_0$) process $Z_{M+1}$, with $\rho_M(s) = [1 - \pi_A(s)]^M \rho(s)$ (a simulation sketch of the whole process follows below)
Note: the specific cases where $A$ is Poisson and Negative Binomial are worked out in the paper (Sections 4.1 and 4.2)
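Below is a crude end-to-end simulation sketch of this sequential process (entirely my own; it approximates the new-dish weight density $\pi_A(s)\rho_M(s)$ on a grid rather than sampling it exactly, and the example plugs in Poisson scores with an IBP-type Lévy density; all names and defaults are mine):

import numpy as np

def sequential_generalized_ibp(M, pi_A, rho, sample_A_given_s, s_grid, rng=None):
    """Numerical sketch of the sequential generative process: existing dishes
    are re-scored via G_A(.|J_l), and new dishes come from an IBP(A, rho_M B_0)
    step whose weight density pi_A(s) * rho_M(s) is approximated on s_grid."""
    rng = np.random.default_rng(rng)
    ds = s_grid[1] - s_grid[0]
    dishes = []          # list of (omega, weight) pairs for observed dishes
    scores = []          # scores[i] = dict {dish index: nonzero score}
    for m in range(M):   # customer m+1 arrives after m previous customers
        row = {}
        # existing dishes: score A_l ~ G_A(.|weight); zero means skipped
        for l, (_, J) in enumerate(dishes):
            a = sample_A_given_s(J, rng)
            if a != 0:
                row[l] = a
        # new dishes: rho_m(s) = (1 - pi_A(s))^m * rho(s)
        w = pi_A(s_grid) * (1.0 - pi_A(s_grid)) ** m * rho(s_grid)
        phi = w.sum() * ds                        # approximates phi_{m+1}(rho)
        for _ in range(rng.poisson(phi)):
            H = rng.choice(s_grid, p=w / w.sum()) # H_i ~ pi_A * rho_m (normalized)
            a = 0
            while a == 0:                         # X_i | H_i: G_A conditioned on A != 0
                a = sample_A_given_s(H, rng)
            dishes.append((rng.uniform(), H))     # omega ~ B_0 (uniform, arbitrary)
            row[len(dishes) - 1] = a
        scores.append(row)
    return dishes, scores

# Example: Poisson scores G_A(.|s) = Poisson(2s), IBP-type Levy density (theta = 1)
dishes, scores = sequential_generalized_ibp(
    M=5,
    pi_A=lambda s: 1.0 - np.exp(-2.0 * s),
    rho=lambda s: 1.0 / s,                        # restricted to (0,1) by the grid
    sample_A_given_s=lambda s, rng: rng.poisson(2.0 * s),
    s_grid=np.linspace(1e-3, 1.0 - 1e-3, 400),
    rng=0,
)
print(len(dishes), scores[0])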

17 Multivariate Generalizations
Multivariate CRMs are also considered; this generalizes the IBP to vector-valued $A_{i,k}$
Consider $\mu_0 = (\mu_j,\ j = 1, \ldots, q)$ with $\mu_j = \sum_{k=1}^{\infty} p_{j,k} \delta_{\omega_k}$ (note: all the $\mu_j$'s are assumed to share the same base measure)
$\mu_{\bullet} = \sum_{j=1}^{q} \mu_j = \sum_{k=1}^{\infty} p_{\bullet,k} \delta_{\omega_k}$, with $p_{\bullet,k} = \sum_{j=1}^{q} p_{j,k}$
The multivariate CRM $\mu_0$ can be constructed from a Poisson random measure on $(\mathbb{R}_+^q, \Omega)$:
$N := \sum_{k=1}^{\infty} \delta_{(\mathbf{p}_{q,k}, \omega_k)}$, where $\mathbf{p}_{q,k} = (p_{1,k}, \ldots, p_{q,k})$
A simple case (Section 5.1 in the paper): a multinomial extension of the IBP. The joint probability of the $M \times K$ IBP matrix (each entry a multinomial vector) is
$\Bigl[\prod_{l=1}^{K} \prod_{j=1}^{q} \Bigl(\frac{p_{j,l}}{1 - p_{\bullet,l}}\Bigr)^{c_{j,l,M}}\Bigr] e^{M \sum_{k=1}^{\infty} \log(1 - p_{\bullet,k})}$
... akin to an IBP with a condiment. All the theorems generalize to this case too
Other multivariate generalizations are also possible (Section 5.2 in the paper)
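A quick consistency check (mine): for $q = 1$ the multinomial joint probability above collapses to the Bernoulli-IBP joint of slide 10, since

$e^{M \sum_{k=1}^{\infty} \log(1 - p_k)} \prod_{l=1}^{K} \Bigl(\frac{p_l}{1 - p_l}\Bigr)^{m_l} = \prod_{l=1}^{K} p_l^{m_l} (1 - p_l)^{M - m_l} \prod_{k\ \mathrm{unselected}} (1 - p_k)^{M}$

i.e. each selected feature contributes $p_l^{m_l}(1-p_l)^{M-m_l}$ and every unselected atom contributes $(1-p_k)^{M}$, which is the conditional likelihood of a binary IBP matrix given the weights.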

18 Multivariate Generalizations
A more general case (Section 5.2): let $A_0 := (A_1, \ldots, A_v) \in \mathbb{R}^v$ have distribution $G_{A_0}(\cdot \mid \mathbf{s}_q)$, where $\mathbf{s}_q \in \mathbb{R}_+^q$ and
$1 - \pi_{A_0}(\mathbf{s}_q) = P(A_1 = 0, \ldots, A_v = 0 \mid \mathbf{s}_q) > 0$
Define a vector-valued process $Z_0^{(i)} := (Z_1^{(i)}, \ldots, Z_v^{(i)})$, where
$Z_j^{(i)} = \sum_{k=1}^{\infty} A_{j,k}^{(i)} \delta_{\omega_k}$ and $Z_{\bullet}^{(i)} = \sum_{j=1}^{v} Z_j^{(i)} = \sum_{k=1}^{\infty} A_{\bullet,k}^{(i)} \delta_{\omega_k}$
For each $(i, k)$, $A_{0,k}^{(i)} := (A_{1,k}^{(i)}, \ldots, A_{v,k}^{(i)})$ is drawn independently from $G_{A_0}(\cdot \mid \mathbf{s}_{q,k})$, where the $\mathbf{s}_{q,k}$ are vector-valued points of a PRM with intensity $\rho_q(\cdot \mid \omega)$
We say that $Z_0^{(1)}, \ldots, Z_0^{(M)} \mid \mu_0$ are iid IBP($G_{A_0} \mid \mu_0$), with the likelihood being
$e^{M \sum_{j=1}^{\infty} \log(1 - \pi_{A_0}(\mathbf{s}_{q,j}))} \prod_{i=1}^{M} \prod_{j=1}^{\infty} \Bigl[\frac{G_{A_0}(d\mathbf{a}_{0,j}^{(i)} \mid \mathbf{s}_{q,j})}{1 - \pi_{A_0}(\mathbf{s}_{q,j})}\Bigr]^{\mathbb{1}\{\mathbf{a}_{i,j} \neq \mathbf{0}\}}$

19 Thanks! Questions?
