Compound Random Measures

Size: px
Start display at page:

Download "Compound Random Measures"

Transcription

1 Compound Random Measures Jim Griffin (joint work with Fabrizio Leisen) University of Kent

2 Introduction: Two clinical studies 3 CALGB CALGB β 1 1 β β β

3 Infinite mixture models This data could be analysed using two infinite mixture models CALGB8881: f 1 (y) = j=1 w (1) j k(y θ j ) and CALGB916: f 2 (y) = j=1 w (2) j k(y θ j ) where w (k) j > for j = 1, 2,... and j=1 w (k) j = 1 for k = 1, 2. k(y θ) is a p.d.f. for y with parameter θ. We need to put a prior on the w (1), w (2) and θ (random probability measure).

4 Some dependent random probability measures: stick-breaking θ are i.i.d. and w (k) j = V (k) j ( i<j 1 V (k) i ) Hierarchical Dirichlet ( Process (Teh et al, 26): Be (α β j, α 1 )) j l=1 β l, V (k) j β j Be(1, γ), j 1 β j = β j (1 β l ), ( ) Probit stick-breaking processes, etc.: V (1) j, V (2) j are ( ) correlated and independent of V (1) i, V (2) i for i j. l=1

5 Completely random measures µ is a completely random measure (CRM) on Θ if, for any disjoint subsets A 1,..., A n, µ(a 1 )..., µ(a n ) are mutually independent. We concentrate on completely random measures (CRM s) which can be represented in terms of jump sizes J i and jump locations θ i as µ = J i δ θi where δ is Dirac s delta function and have Lévy-Khintchine representation E [e ] f (θ) µ(dθ) = e [1 e sf (θ) ]α(dθ)ρ(ds) i=1 where α and ρ are measures for which α(dθ) <.

6 J Completely random measures Poisson process with intensity α(dθ)ρ(ds)

7 Examples of CRM s Many processes that we use in Bayesian nonparametrics are CRM s Gamma process - ρ(ds) = s 1 exp{ s} ds. Beta process - ρ(ds) = βs 1 (1 s) β 1 ds. or can be derived from CRM s Normalizing a Gamma process, i.e. taking p = µ/ µ(θ), leads to a Dirichlet process. A beta process prior for p 1, p 2,... can be used to define an Indian buffet process.

8 Vectors of CRMs It is useful to define d related CRM s. Suppose that µ 1,..., µ d are CRM s on Θ with marginal Lévy intensities ν j (ds, dθ) = ν j (ds)α(dθ) Then µ 1,..., µ d are a vector of CRM s if there is a Lévy-Khintchine representation of the form [ ] E e µ(f 1) µ d (f d ) = e ψ ρ,d (f 1,...,f d ) where ψρ,d(f 1,..., f d ) = and (R + ) d [ 1 e s 1f 1 (θ) s d f d (θ) ] α(dθ)ρ d (ds 1,... ds d ) ν j (ds) = ρ d (ds 1,..., ds d ).

9 Compound Random Measures: Definition A compound random measure (CoRM) is a vector of CRM s with intensity ( ρ d (ds 1,..., ds d ) = z d s1 h z,..., s ) d ds 1... ds d ν (dz) z where s 1,..., s d are called scores. H is a score distribution with density h. ν is the Lévy intensity of a directing Lévy process. which satisfies the condition min(1, s )z d h ( s 1 z,..., s ) d z ν (dz) < where s is the Euclidean norm of the vector s = (s 1,..., s d ).

10 A representation of a CoRM Realizations of a CoRM can be expressed as µ j = m j,i J i δ θi i=1 where m 1,i,..., m d,i i.i.d. H η = i=1 J i δ θi is a CRM with Lévy intensity ν (ds) α(dθ).

11 CoRMs with independent gamma scores We will concentrate on the class of CoRMs for which h(s 1 /z,..., s d /z) = d f (s j /z) where f is the p.d.f. of a gamma distribution with shape φ, f (x) = 1 Γ(φ) x φ 1 exp{ x}. j=1

12 Properties of CoRMs with independent score distributions The Lévy copula can be expressed as a univariate integral. Let Mz(t) f = e ts z 1 f (s/z)ds be the moment generating function of z 1 f (s/z) then [1 ] ψ ρ,d (λ 1,..., λ d ) = e s 1 λ 1 s d λ d ρd (ds 1,... ds d ) (R + ) d = ψ ρ,d (λ 1,..., λ d ) = 1 d Mz( λ f j ) ν (z)dz This expression can be used to calculate quantities such as Corr( µ k (A), µ m (A)). j=1

13 CoRMs with gamma distributed scores Consider a CoRM process with independent Ga(φ, 1) distributed scores. If the CoRM process has gamma process marginals then ρ d (s 1,..., s d ) = ( d j=1 s j) φ 1 [Γ(φ)] d 1 s dφ+1 2 e s 2 W (d 2)φ+1 ( s ), dφ 2 2 (1) where s = s s d and W is the Whittaker function. If the CoRM process has σ-stable process marginals then ρ d (s 1,..., s d ) = ( d j=1 s j) φ 1 σγ(σ + dφ) [Γ(φ)] d 1 Γ(σ + φ)γ(1 σ) s σ dφ. (2)

14 CoRMs with exponentially distributed scores Consider a CoRM process with independent exponentially distributed scores. If the CoRM has gamma process marginals we recover the multivariate Lévy intensity of Leisen et al (213), ρ d (s 1,..., s d ) = d 1 j= (d 1)! (d 1 j)! s j 1 e s. Otherwise, if σ-stable marginals are considered then we recover the multivariate vector introduced in Leisen and Lijoi (211) and Zhu and Leisen (214), ρ d (s 1,..., s d ) = (σ) d Γ(1 σ) s σ d.

15 CoRMs with independent gamma scores: specific marginals The Lévy intensity of µ j ν j (ds) = z 1 f (s/z)ds ν (dz) = ν(ds). If we have independent gamma scores, the directing Lévy intensity ν is linked to the marginal Lévy intensity by ( ) ( ) 1 Γ(φ) ν = t 2 φ L 1 ν(s) (t) t sφ 1 where L 1 is the inverse Laplace transform.

16 CoRMs with independent gamma scores: specific marginals The intensity of the directing Lévy process is ν (z) = z 1 (1 z) φ 1, < z < 1 leads to a marginal gamma process for which ν(s) = s 1 exp{ s}, s > Remarks ν is the the Lévy intensity of a beta process. If ν is the Lévy intensity of a Stable-Beta process (Teh and Görür, 29), the marginal process is a generalized gamma process.

17 NCoRM: Gamma marginal, φ = 1 1 DLP Dim. 1 Dim. 2 Dim

18 NCoRM: Gamma marginal, φ = 1.1 DLP Dim. 1 Dim. 2 Dim

19 NCoRM: Gamma marginal, φ = 5 8 x 1 3 DLP Dim. 1 Dim. 2 Dim

20 Marginal beta process A CoRM with beta process marginals (ν(s) = βs 1 (1 s) β 1 ) can be constructed using A beta score distribution with parameters α and 1 A directing Lévy intensity ν (z) = βz 1 (1 z) β 1 + β(β 1) (1 z) β 2 α i.e. a superposition of a beta process and a compound Poisson process with beta jump distribution.

21 Links to other processes Other processes can be expressed as CoRM s: Superpositions/Thinning: e.g. Griffin et al (213), Chen et al (214), Lijoi and Nipoti (214), Lijoi et al (214a, b) using mixture score distributions h(s) = πδ s= + (1 π)h (s). Lévy copulae: e.g. Leisen and Lijoi (211), Leisen et al (213), Zhu and Leisen (214).

22 Normalized Compound Random Measures (NCoRM) A vector of random probability measures can be defined by normalizing each dimension of the CoRM so that p k = µ k µ k (Θ) = w (k) j δ θj. j=1

23 CoRMs on more general spaces For a more general space X, we define µ( ; x) to be a completely random measure for x X. The collection { µ( ; x) x X} can be given a CoRM prior with µ( ; x) = m j (x) J j δ θj j=1 where m k (x) is a realisation of a random process on X. Example X = R p, m k (x) = exp{r k (x)} where r k (x) is given a zero-mean Gaussian process prior (see Ranganath and Blei, 215).

24 Inference with NCoRM s We assume that the data are (x 1, y 1 ),..., (x n, y n ) and are modelled as y i ζ i ind. k(y i ζ i ), ζ i p( ; x i ) = µ( ; x i) µ(θ; x i ), i = 1, 2,..., n where k(y θ) is a probability density function for y with parameter θ and {p( ; x) x X} is given an NCoRM prior.

25 MCMC inference for infinite mixture models Introducing allocation variables c 1,..., c n, the posterior is proportional [ n ] J ci m ci (x i ) p(y, c m, J, θ) = k (y i θ ci ) l=1 J. l m l (x i ) i=1 This form is not tractable due to the infinite sum in the denominator of each term. This can be addressed using the identity { 1 } l=1 J l m l (x i ) = exp v i J l m l (x i ) dv i l=1

26 MCMC inference for infinite mixture models Introducing latent variables v i leads to a suitable form of augmented posterior for MCMC p(y, c, v m, J, θ) [ { n = k (y i θ ci ) J ci m ci (x i ) exp v i = i=1 K j=1 {i c i =j} k ( y i θ j ) J a j j {i c i =j} l=1 J l m l (x i ) m j (x i ) exp { }] l=1 J l } n v i m l (x i ) i=1 where there are K distinct values of c i and a j = n i=1 I(c i = j).

27 MCMC inference for infinite mixture models: Finite X, independent scores In this case, we can define a marginal sampler (e.g. Favaro and Teh, 213) by integrating over J and m. J a ν (J) dj is typical for marginal samplers of normalized random measure mixtures. Integrals of {i c i =j} m j(x i ) will be a product of moments of the scored distribution. E[exp { l=1 J l n i=1 v i m l (x i ) } ] can be evaluated either exactly or as a univariate integral.

28 MCMC inference for infinite mixture models: General X Pseudo-marginal methods (Andrieu and Roberts, 29) are useful for a target density of the form π(θ) f (θ) g(θ) where g(θ) cannot be directly evaluated. Samples from the target density ˆπ(θ) f (θ) ĝ(θ) where E[ĝ(θ)] = g(θ) will have the distribution π. In our target, the problem is evaluating E[exp { l=1 J l n i=1 v i m l (x i ) } ] = exp{ ψ(v)}

29 Unbiased estimation of the Laplace transform The Poisson estimator (see Papaspiliopoulos, 211) of L φ = exp { D φ(x) dx} is ˆL φ = K i=1 ( 1 φ(x ) i) a C κ(x i ) where κ is a p.d.f. on D, C > φ(x) κ(x) for x D, a > 1, i.i.d. K Pn(a C) and x i κ. Then, { } E[ˆL φ ] = exp φ(x) dx D and V[ˆL φ ] = L 2 φ ( { 1 φ(x) 2 } ) exp a C D κ(x) dx 1 <.

30 Unbiased estimation of exp{ ψ(v)} Assuming that x 1, x 2,..., x n are distinct, mi = m(x i ) and m = (m1,..., m n), exp{ ψ ρ,d (v)} can be re-expressed as { ( { }) } n n exp 1 exp z v i mi h(m ) ν (z) dz dm = (R + ) n where { L k = exp (R + ) n i=1 v k m k h(m ) exp { t k=1 L k } } n v i mi T ν (t) dt dm i=1 and T ν (t) = t ν (z) dz (tail mass function).

31 Unbiased estimation of the Laplace transform L k can be estimated using the Poisson estimator with x = (z, m k ), D = (, ) (R+ ) n and φ(z, m k ) = v k m k h(m k) exp { t } n v i mk T ν (t) <. i=1 A suitable approximating density is κ(z, m k ) = κ ν(z) m k h(m k ) E[m k ] where κ ν (z) > T ν (z) for all z R +.

32 A sampler for more general processes A pseudo-marginal sampler is used with exp{ ψ ρ,d (v)} estimated by the Poisson estimator. The jumps are not integrated out and values for empty clusters are proposed from h(m, J) h(m 1 /z,..., m K /z)z exp{ vz}ν (z). An interweaving scheme for m and z (Yu and Meng, 211).

33 Two clinical studies 3 CALGB CALGB β 1 1 β β β

34 Two clinical studies: Posterior mean densities Results using a CoRM with independent gamma scores. CALGB8881 CALGB β 1 1 β β β

35 Example: Nonparametric regression We consider the classic motorcycle data which records head acceleration at different times after impact. where w j = f (y) = exp{r k (x)}j k m=1 exp{rm(x)}jm w j (x)n(y µ j, σj 2 ) j=1 r m (x) are given independent Gaussian process prior with squared exponential covariance function. J 1, J 2,... follow a Gamma process with Lévy intensity M x 1 exp{ x}.

36

37 12 M 12? 4 L

38 Example: Nonparametric variable selection The classic Boston housing data record the median value of owner-occupied homes in 56 areas of Boston and the values of 14 attributes that are thought to effect house prices. The covariance function k(x, x ) = exp{ p i=1 w j(x j x j )2 } and p(w j ) (1 + w j ) 1.

39 CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Posterior median and 95% credible intervals for w j

40 Summary CoRM processes are a unifying framework for a wide-range of proposed vectors of CRMs. CoRM process are vectors of CRM s which are constructed in terms of a (univariate) CRM and a distribution (which defines the dependence). Several MCMC methods for NCoRM mixture models are developed. These include methods which depend on the availability of analytical forms for some integrals with respect to the score distribution and methods which do not. Modelling dependence through distributions allows a wide-range of dependent nonparametric models to be developed (e.g. regression, time series, etc.).

41 References C. Andrieu and G. O. Roberts (29). The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Statist., 37, S. Favaro and Y. W. Teh (213). MCMC for Normalized Random Measure Mixture Models. Statistical Science, 28, J. E. Griffin, M. Kolossiatis and M. F. J. Steel (213). Comparing Distributions By Using Dependent Normalized Random-Measure Mixtures. Journal of the Royal Statistical Society, Series B, 75, F. Leisen and A. Lijoi (211). Vectors of Poisson-Dirichlet processes. J. Multivariate Anal., 12, F. Leisen, A. Lijoi and D. Spano (213). A Vector of Dirichlet processes. Electronic Journal of Statistics 7, A. Lijoi, and B. Nipoti (214), A class of hazard rate mixtures for combining survival data from different experiments. Journal of the American Statistical Association, 19, A. Lijoi, B. Nipoti and I. Prünster (214a), Bayesian inference with dependent normalized completely random measures, Bernoulli, 2, A. Lijoi, B. Nipoti and I. Prünster (214b), Dependent mixture models: clustering and borrowing information, Computational Statistics and Data Analysis, 71,

42 References O. Papaspiliopoulos (211). A methodological framework for Monte Carlo probabilistic inference for diffusion processes. In Bayesian Time Series Models (D. Barber, A. Taylan Cemgil and S. Chippia, Eds), Cambridge University Press. R. Ranganath and D. M. Blei (215). Correlated Random Measures. arxiv: Y. W. Teh and D. Görür (29). Indian Buffet Proceses with Power-law Behavior. In Advances in Neural Information Processing Systems 22 (Y. Bengio, D. Schuurmans, J. D. Lafferty, C. K. I. Williams and A. Culotta, Eds.), Y. W. Teh, M. I. Jordan, M. J. Beal and D. M. Blei (26). Hierarchical Dirichlet processes. Journal of the American Statistical Association, 11, Y. Yu and X.-L. Meng (211). To Center or Not to Center: That is Not the Question An Ancillarity-Sufficiency Interweaving Strategy (ASIS) for Boosting MCMC Efficiency. Journal of Computational and Graphical Statistics, 2, W. Zhu and F. Leisen (214). A multivariate extension of a vector of Poisson-Dirichlet processes. To appear in the Journal of Nonparametric Statistics.

Modelling and computation using NCoRM mixtures for. density regression

Modelling and computation using NCoRM mixtures for. density regression Modelling and computation using NCoRM mixtures for density regression arxiv:168.874v3 [stat.me] 31 Aug 217 Jim E. Griffin and Fabrizio Leisen University of Kent Abstract Normalized compound random measures

More information

Truncation error of a superposed gamma process in a decreasing order representation

Truncation error of a superposed gamma process in a decreasing order representation Truncation error of a superposed gamma process in a decreasing order representation B julyan.arbel@inria.fr Í www.julyanarbel.com Inria, Mistis, Grenoble, France Joint work with Igor Pru nster (Bocconi

More information

Dependent hierarchical processes for multi armed bandits

Dependent hierarchical processes for multi armed bandits Dependent hierarchical processes for multi armed bandits Federico Camerlenghi University of Bologna, BIDSA & Collegio Carlo Alberto First Italian meeting on Probability and Mathematical Statistics, Torino

More information

On the posterior structure of NRMI

On the posterior structure of NRMI On the posterior structure of NRMI Igor Prünster University of Turin, Collegio Carlo Alberto and ICER Joint work with L.F. James and A. Lijoi Isaac Newton Institute, BNR Programme, 8th August 2007 Outline

More information

A Brief Overview of Nonparametric Bayesian Models

A Brief Overview of Nonparametric Bayesian Models A Brief Overview of Nonparametric Bayesian Models Eurandom Zoubin Ghahramani Department of Engineering University of Cambridge, UK zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin Also at Machine

More information

Bayesian nonparametric models for bipartite graphs

Bayesian nonparametric models for bipartite graphs Bayesian nonparametric models for bipartite graphs François Caron Department of Statistics, Oxford Statistics Colloquium, Harvard University November 11, 2013 F. Caron 1 / 27 Bipartite networks Readers/Customers

More information

Truncation error of a superposed gamma process in a decreasing order representation

Truncation error of a superposed gamma process in a decreasing order representation Truncation error of a superposed gamma process in a decreasing order representation Julyan Arbel Inria Grenoble, Université Grenoble Alpes julyan.arbel@inria.fr Igor Prünster Bocconi University, Milan

More information

Normalized kernel-weighted random measures

Normalized kernel-weighted random measures Normalized kernel-weighted random measures Jim Griffin University of Kent 1 August 27 Outline 1 Introduction 2 Ornstein-Uhlenbeck DP 3 Generalisations Bayesian Density Regression We observe data (x 1,

More information

Bayesian nonparametric latent feature models

Bayesian nonparametric latent feature models Bayesian nonparametric latent feature models Indian Buffet process, beta process, and related models François Caron Department of Statistics, Oxford Applied Bayesian Statistics Summer School Como, Italy

More information

A marginal sampler for σ-stable Poisson-Kingman mixture models

A marginal sampler for σ-stable Poisson-Kingman mixture models A marginal sampler for σ-stable Poisson-Kingman mixture models joint work with Yee Whye Teh and Stefano Favaro María Lomelí Gatsby Unit, University College London Talk at the BNP 10 Raleigh, North Carolina

More information

Dependent Random Measures and Prediction

Dependent Random Measures and Prediction Dependent Random Measures and Prediction Igor Prünster University of Torino & Collegio Carlo Alberto 10th Bayesian Nonparametric Conference Raleigh, June 26, 2015 Joint wor with: Federico Camerlenghi,

More information

On the Truncation Error of a Superposed Gamma Process

On the Truncation Error of a Superposed Gamma Process On the Truncation Error of a Superposed Gamma Process Julyan Arbel and Igor Prünster Abstract Completely random measures (CRMs) form a key ingredient of a wealth of stochastic models, in particular in

More information

Slice Sampling Mixture Models

Slice Sampling Mixture Models Slice Sampling Mixture Models Maria Kalli, Jim E. Griffin & Stephen G. Walker Centre for Health Services Studies, University of Kent Institute of Mathematics, Statistics & Actuarial Science, University

More information

Bayesian Nonparametric Autoregressive Models via Latent Variable Representation

Bayesian Nonparametric Autoregressive Models via Latent Variable Representation Bayesian Nonparametric Autoregressive Models via Latent Variable Representation Maria De Iorio Yale-NUS College Dept of Statistical Science, University College London Collaborators: Lifeng Ye (UCL, London,

More information

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model

More information

Bayesian Nonparametrics: Dirichlet Process

Bayesian Nonparametrics: Dirichlet Process Bayesian Nonparametrics: Dirichlet Process Yee Whye Teh Gatsby Computational Neuroscience Unit, UCL http://www.gatsby.ucl.ac.uk/~ywteh/teaching/npbayes2012 Dirichlet Process Cornerstone of modern Bayesian

More information

Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother

Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother J. E. Griffin and M. F. J. Steel University of Warwick Bayesian Nonparametric Modelling with the Dirichlet Process Regression

More information

Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures

Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures 17th Europ. Conf. on Machine Learning, Berlin, Germany, 2006. Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures Shipeng Yu 1,2, Kai Yu 2, Volker Tresp 2, and Hans-Peter

More information

Bayesian Modeling of Conditional Distributions

Bayesian Modeling of Conditional Distributions Bayesian Modeling of Conditional Distributions John Geweke University of Iowa Indiana University Department of Economics February 27, 2007 Outline Motivation Model description Methods of inference Earnings

More information

Slice sampling σ stable Poisson Kingman mixture models

Slice sampling σ stable Poisson Kingman mixture models ISSN 2279-9362 Slice sampling σ stable Poisson Kingman mixture models Stefano Favaro S.G. Walker No. 324 December 2013 www.carloalberto.org/research/working-papers 2013 by Stefano Favaro and S.G. Walker.

More information

Non-Parametric Bayes

Non-Parametric Bayes Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian

More information

Image segmentation combining Markov Random Fields and Dirichlet Processes

Image segmentation combining Markov Random Fields and Dirichlet Processes Image segmentation combining Markov Random Fields and Dirichlet Processes Jessica SODJO IMS, Groupe Signal Image, Talence Encadrants : A. Giremus, J.-F. Giovannelli, F. Caron, N. Dobigeon Jessica SODJO

More information

Bayesian non-parametric model to longitudinally predict churn

Bayesian non-parametric model to longitudinally predict churn Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics

More information

STAT Advanced Bayesian Inference

STAT Advanced Bayesian Inference 1 / 32 STAT 625 - Advanced Bayesian Inference Meng Li Department of Statistics Jan 23, 218 The Dirichlet distribution 2 / 32 θ Dirichlet(a 1,...,a k ) with density p(θ 1,θ 2,...,θ k ) = k j=1 Γ(a j) Γ(

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College

More information

On Reparametrization and the Gibbs Sampler

On Reparametrization and the Gibbs Sampler On Reparametrization and the Gibbs Sampler Jorge Carlos Román Department of Mathematics Vanderbilt University James P. Hobert Department of Statistics University of Florida March 2014 Brett Presnell Department

More information

Bayesian Nonparametric Models for Ranking Data

Bayesian Nonparametric Models for Ranking Data Bayesian Nonparametric Models for Ranking Data François Caron 1, Yee Whye Teh 1 and Brendan Murphy 2 1 Dept of Statistics, University of Oxford, UK 2 School of Mathematical Sciences, University College

More information

Bayesian nonparametric models for bipartite graphs

Bayesian nonparametric models for bipartite graphs Bayesian nonparametric models for bipartite graphs François Caron INRIA IMB - University of Bordeaux Talence, France Francois.Caron@inria.fr Abstract We develop a novel Bayesian nonparametric model for

More information

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Dependent mixture models: clustering and borrowing information

Dependent mixture models: clustering and borrowing information ISSN 2279-9362 Dependent mixture models: clustering and borrowing information Antonio Lijoi Bernardo Nipoti Igor Pruenster No. 32 June 213 www.carloalberto.org/research/working-papers 213 by Antonio Lijoi,

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

Probabilistic Graphical Models

Probabilistic Graphical Models School of Computer Science Probabilistic Graphical Models Infinite Feature Models: The Indian Buffet Process Eric Xing Lecture 21, April 2, 214 Acknowledgement: slides first drafted by Sinead Williamson

More information

Bayesian nonparametric models of sparse and exchangeable random graphs

Bayesian nonparametric models of sparse and exchangeable random graphs Bayesian nonparametric models of sparse and exchangeable random graphs F. Caron & E. Fox Technical Report Discussion led by Esther Salazar Duke University May 16, 2014 (Reading group) May 16, 2014 1 /

More information

Nonparametric Bayesian Methods - Lecture I

Nonparametric Bayesian Methods - Lecture I Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics

More information

Lecture 3a: Dirichlet processes

Lecture 3a: Dirichlet processes Lecture 3a: Dirichlet processes Cédric Archambeau Centre for Computational Statistics and Machine Learning Department of Computer Science University College London c.archambeau@cs.ucl.ac.uk Advanced Topics

More information

GAUSSIAN PROCESS REGRESSION

GAUSSIAN PROCESS REGRESSION GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The

More information

A Nonparametric Model for Stationary Time Series

A Nonparametric Model for Stationary Time Series A Nonparametric Model for Stationary Time Series Isadora Antoniano-Villalobos Bocconi University, Milan, Italy. isadora.antoniano@unibocconi.it Stephen G. Walker University of Texas at Austin, USA. s.g.walker@math.utexas.edu

More information

Bayesian Nonparametric Models

Bayesian Nonparametric Models Bayesian Nonparametric Models David M. Blei Columbia University December 15, 2015 Introduction We have been looking at models that posit latent structure in high dimensional data. We use the posterior

More information

Dirichlet Processes: Tutorial and Practical Course

Dirichlet Processes: Tutorial and Practical Course Dirichlet Processes: Tutorial and Practical Course (updated) Yee Whye Teh Gatsby Computational Neuroscience Unit University College London August 2007 / MLSS Yee Whye Teh (Gatsby) DP August 2007 / MLSS

More information

A Review of Pseudo-Marginal Markov Chain Monte Carlo

A Review of Pseudo-Marginal Markov Chain Monte Carlo A Review of Pseudo-Marginal Markov Chain Monte Carlo Discussed by: Yizhe Zhang October 21, 2016 Outline 1 Overview 2 Paper review 3 experiment 4 conclusion Motivation & overview Notation: θ denotes the

More information

Flexible Bayesian Nonparametric Priors and Bayesian Computational Methods

Flexible Bayesian Nonparametric Priors and Bayesian Computational Methods Universidad Carlos III de Madrid Repositorio institucional e-archivo Tesis http://e-archivo.uc3m.es Tesis Doctorales 2016-02-29 Flexible Bayesian Nonparametric Priors and Bayesian Computational Methods

More information

Integrated Non-Factorized Variational Inference

Integrated Non-Factorized Variational Inference Integrated Non-Factorized Variational Inference Shaobo Han, Xuejun Liao and Lawrence Carin Duke University February 27, 2014 S. Han et al. Integrated Non-Factorized Variational Inference February 27, 2014

More information

Introduction to Probabilistic Machine Learning

Introduction to Probabilistic Machine Learning Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning

More information

Variational Scoring of Graphical Model Structures

Variational Scoring of Graphical Model Structures Variational Scoring of Graphical Model Structures Matthew J. Beal Work with Zoubin Ghahramani & Carl Rasmussen, Toronto. 15th September 2003 Overview Bayesian model selection Approximations using Variational

More information

CS Lecture 19. Exponential Families & Expectation Propagation

CS Lecture 19. Exponential Families & Expectation Propagation CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces

More information

MCMC for big data. Geir Storvik. BigInsight lunch - May Geir Storvik MCMC for big data BigInsight lunch - May / 17

MCMC for big data. Geir Storvik. BigInsight lunch - May Geir Storvik MCMC for big data BigInsight lunch - May / 17 MCMC for big data Geir Storvik BigInsight lunch - May 2 2018 Geir Storvik MCMC for big data BigInsight lunch - May 2 2018 1 / 17 Outline Why ordinary MCMC is not scalable Different approaches for making

More information

Bayesian Hidden Markov Models and Extensions

Bayesian Hidden Markov Models and Extensions Bayesian Hidden Markov Models and Extensions Zoubin Ghahramani Department of Engineering University of Cambridge joint work with Matt Beal, Jurgen van Gael, Yunus Saatci, Tom Stepleton, Yee Whye Teh Modeling

More information

A Fully Nonparametric Modeling Approach to. BNP Binary Regression

A Fully Nonparametric Modeling Approach to. BNP Binary Regression A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation

More information

Density Estimation. Seungjin Choi

Density Estimation. Seungjin Choi Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Dirichlet Enhanced Latent Semantic Analysis

Dirichlet Enhanced Latent Semantic Analysis Dirichlet Enhanced Latent Semantic Analysis Kai Yu Siemens Corporate Technology D-81730 Munich, Germany Kai.Yu@siemens.com Shipeng Yu Institute for Computer Science University of Munich D-80538 Munich,

More information

The Variational Gaussian Approximation Revisited

The Variational Gaussian Approximation Revisited The Variational Gaussian Approximation Revisited Manfred Opper Cédric Archambeau March 16, 2009 Abstract The variational approximation of posterior distributions by multivariate Gaussians has been much

More information

19 : Bayesian Nonparametrics: The Indian Buffet Process. 1 Latent Variable Models and the Indian Buffet Process

19 : Bayesian Nonparametrics: The Indian Buffet Process. 1 Latent Variable Models and the Indian Buffet Process 10-708: Probabilistic Graphical Models, Spring 2015 19 : Bayesian Nonparametrics: The Indian Buffet Process Lecturer: Avinava Dubey Scribes: Rishav Das, Adam Brodie, and Hemank Lamba 1 Latent Variable

More information

Topic Modelling and Latent Dirichlet Allocation

Topic Modelling and Latent Dirichlet Allocation Topic Modelling and Latent Dirichlet Allocation Stephen Clark (with thanks to Mark Gales for some of the slides) Lent 2013 Machine Learning for Language Processing: Lecture 7 MPhil in Advanced Computer

More information

Discussion of On simulation and properties of the stable law by L. Devroye and L. James

Discussion of On simulation and properties of the stable law by L. Devroye and L. James Stat Methods Appl (2014) 23:371 377 DOI 10.1007/s10260-014-0269-4 Discussion of On simulation and properties of the stable law by L. Devroye and L. James Antonio Lijoi Igor Prünster Accepted: 16 May 2014

More information

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference 1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the Federal Reserve Board of Governors or the Federal Reserve System. Bayesian Estimation of DSGE

More information

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

Construction of Dependent Dirichlet Processes based on Poisson Processes

Construction of Dependent Dirichlet Processes based on Poisson Processes 1 / 31 Construction of Dependent Dirichlet Processes based on Poisson Processes Dahua Lin Eric Grimson John Fisher CSAIL MIT NIPS 2010 Outstanding Student Paper Award Presented by Shouyuan Chen Outline

More information

Non-parametric Clustering with Dirichlet Processes

Non-parametric Clustering with Dirichlet Processes Non-parametric Clustering with Dirichlet Processes Timothy Burns SUNY at Buffalo Mar. 31 2009 T. Burns (SUNY at Buffalo) Non-parametric Clustering with Dirichlet Processes Mar. 31 2009 1 / 24 Introduction

More information

Modeling conditional distributions with mixture models: Applications in finance and financial decision-making

Modeling conditional distributions with mixture models: Applications in finance and financial decision-making Modeling conditional distributions with mixture models: Applications in finance and financial decision-making John Geweke University of Iowa, USA Journal of Applied Econometrics Invited Lecture Università

More information

13: Variational inference II

13: Variational inference II 10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational

More information

Practical Bayesian Optimization of Machine Learning. Learning Algorithms

Practical Bayesian Optimization of Machine Learning. Learning Algorithms Practical Bayesian Optimization of Machine Learning Algorithms CS 294 University of California, Berkeley Tuesday, April 20, 2016 Motivation Machine Learning Algorithms (MLA s) have hyperparameters that

More information

Bayes methods for categorical data. April 25, 2017

Bayes methods for categorical data. April 25, 2017 Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,

More information

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Igor Prünster University of Turin, Collegio Carlo Alberto and ICER Joint work with P. Di Biasi and G. Peccati Workshop on Limit Theorems and Applications Paris, 16th January

More information

Bayesian Nonparametrics for Speech and Signal Processing

Bayesian Nonparametrics for Speech and Signal Processing Bayesian Nonparametrics for Speech and Signal Processing Michael I. Jordan University of California, Berkeley June 28, 2011 Acknowledgments: Emily Fox, Erik Sudderth, Yee Whye Teh, and Romain Thibaux Computer

More information

Priors for Random Count Matrices with Random or Fixed Row Sums

Priors for Random Count Matrices with Random or Fixed Row Sums Priors for Random Count Matrices with Random or Fixed Row Sums Mingyuan Zhou Joint work with Oscar Madrid and James Scott IROM Department, McCombs School of Business Department of Statistics and Data Sciences

More information

Distance dependent Chinese restaurant processes

Distance dependent Chinese restaurant processes David M. Blei Department of Computer Science, Princeton University 35 Olden St., Princeton, NJ 08540 Peter Frazier Department of Operations Research and Information Engineering, Cornell University 232

More information

Non-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources

Non-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources th International Conference on Information Fusion Chicago, Illinois, USA, July -8, Non-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources Priyadip Ray Department of Electrical

More information

Learning Bayesian network : Given structure and completely observed data

Learning Bayesian network : Given structure and completely observed data Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution

More information

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu

Lecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu Lecture: Gaussian Process Regression STAT 6474 Instructor: Hongxiao Zhu Motivation Reference: Marc Deisenroth s tutorial on Robot Learning. 2 Fast Learning for Autonomous Robots with Gaussian Processes

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods Tomas McKelvey and Lennart Svensson Signal Processing Group Department of Signals and Systems Chalmers University of Technology, Sweden November 26, 2012 Today s learning

More information

Bayesian Nonparametric Learning of Complex Dynamical Phenomena

Bayesian Nonparametric Learning of Complex Dynamical Phenomena Duke University Department of Statistical Science Bayesian Nonparametric Learning of Complex Dynamical Phenomena Emily Fox Joint work with Erik Sudderth (Brown University), Michael Jordan (UC Berkeley),

More information

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham NC 778-5 - Revised April,

More information

arxiv: v2 [math.st] 22 Apr 2016

arxiv: v2 [math.st] 22 Apr 2016 arxiv:1410.6843v2 [math.st] 22 Apr 2016 Posteriors, conjugacy, and exponential families for completely random measures Tamara Broderick Ashia C. Wilson Michael I. Jordan April 25, 2016 Abstract We demonstrate

More information

Linear Classification

Linear Classification Linear Classification Lili MOU moull12@sei.pku.edu.cn http://sei.pku.edu.cn/ moull12 23 April 2015 Outline Introduction Discriminant Functions Probabilistic Generative Models Probabilistic Discriminative

More information

Infinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix

Infinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix Infinite-State Markov-switching for Dynamic Volatility Models : Web Appendix Arnaud Dufays 1 Centre de Recherche en Economie et Statistique March 19, 2014 1 Comparison of the two MS-GARCH approximations

More information

Pseudo-marginal MCMC methods for inference in latent variable models

Pseudo-marginal MCMC methods for inference in latent variable models Pseudo-marginal MCMC methods for inference in latent variable models Arnaud Doucet Department of Statistics, Oxford University Joint work with George Deligiannidis (Oxford) & Mike Pitt (Kings) MCQMC, 19/08/2016

More information

arxiv: v1 [stat.ml] 20 Nov 2012

arxiv: v1 [stat.ml] 20 Nov 2012 A survey of non-exchangeable priors for Bayesian nonparametric models arxiv:1211.4798v1 [stat.ml] 20 Nov 2012 Nicholas J. Foti 1 and Sinead Williamson 2 1 Department of Computer Science, Dartmouth College

More information

Nonparametric Bayesian Methods (Gaussian Processes)

Nonparametric Bayesian Methods (Gaussian Processes) [70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent

More information

On Simulations form the Two-Parameter. Poisson-Dirichlet Process and the Normalized. Inverse-Gaussian Process

On Simulations form the Two-Parameter. Poisson-Dirichlet Process and the Normalized. Inverse-Gaussian Process On Simulations form the Two-Parameter arxiv:1209.5359v1 [stat.co] 24 Sep 2012 Poisson-Dirichlet Process and the Normalized Inverse-Gaussian Process Luai Al Labadi and Mahmoud Zarepour May 8, 2018 ABSTRACT

More information

Bayesian Nonparametrics

Bayesian Nonparametrics Bayesian Nonparametrics Peter Orbanz Columbia University PARAMETERS AND PATTERNS Parameters P(X θ) = Probability[data pattern] 3 2 1 0 1 2 3 5 0 5 Inference idea data = underlying pattern + independent

More information

Statistical Approaches to Learning and Discovery

Statistical Approaches to Learning and Discovery Statistical Approaches to Learning and Discovery Bayesian Model Selection Zoubin Ghahramani & Teddy Seidenfeld zoubin@cs.cmu.edu & teddy@stat.cmu.edu CALD / CS / Statistics / Philosophy Carnegie Mellon

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Single Index Quantile Regression for Heteroscedastic Data

Single Index Quantile Regression for Heteroscedastic Data Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University SMAC, November 6, 2015 E. Christou, M. G. Akritas (PSU) SIQR

More information

Hierarchical Modeling for Univariate Spatial Data

Hierarchical Modeling for Univariate Spatial Data Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This

More information

Nonparametric Bayesian modeling for dynamic ordinal regression relationships

Nonparametric Bayesian modeling for dynamic ordinal regression relationships Nonparametric Bayesian modeling for dynamic ordinal regression relationships Athanasios Kottas Department of Applied Mathematics and Statistics, University of California, Santa Cruz Joint work with Maria

More information

Simulating Random Variables

Simulating Random Variables Simulating Random Variables Timothy Hanson Department of Statistics, University of South Carolina Stat 740: Statistical Computing 1 / 23 R has many built-in random number generators... Beta, gamma (also

More information

STAT 518 Intro Student Presentation

STAT 518 Intro Student Presentation STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible

More information

Introduction to Gaussian Processes

Introduction to Gaussian Processes Introduction to Gaussian Processes Neil D. Lawrence GPSS 10th June 2013 Book Rasmussen and Williams (2006) Outline The Gaussian Density Covariance from Basis Functions Basis Function Representations Constructing

More information

Two Useful Bounds for Variational Inference

Two Useful Bounds for Variational Inference Two Useful Bounds for Variational Inference John Paisley Department of Computer Science Princeton University, Princeton, NJ jpaisley@princeton.edu Abstract We review and derive two lower bounds on the

More information

Stat 451 Lecture Notes Numerical Integration

Stat 451 Lecture Notes Numerical Integration Stat 451 Lecture Notes 03 12 Numerical Integration Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 5 in Givens & Hoeting, and Chapters 4 & 18 of Lange 2 Updated: February 11, 2016 1 / 29

More information

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined

More information

Chapter 5 continued. Chapter 5 sections

Chapter 5 continued. Chapter 5 sections Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Multivariate Normal & Wishart

Multivariate Normal & Wishart Multivariate Normal & Wishart Hoff Chapter 7 October 21, 2010 Reading Comprehesion Example Twenty-two children are given a reading comprehsion test before and after receiving a particular instruction method.

More information

Distance-Based Probability Distribution for Set Partitions with Applications to Bayesian Nonparametrics

Distance-Based Probability Distribution for Set Partitions with Applications to Bayesian Nonparametrics Distance-Based Probability Distribution for Set Partitions with Applications to Bayesian Nonparametrics David B. Dahl August 5, 2008 Abstract Integration of several types of data is a burgeoning field.

More information

Sparse Bayesian Nonparametric Regression

Sparse Bayesian Nonparametric Regression François Caron caronfr@cs.ubc.ca Arnaud Doucet arnaud@cs.ubc.ca Departments of Computer Science and Statistics, University of British Columbia, Vancouver, Canada Abstract One of the most common problems

More information