Compound Random Measures
|
|
- Lucy Francis
- 5 years ago
- Views:
Transcription
1 Compound Random Measures Jim Griffin (joint work with Fabrizio Leisen) University of Kent
2 Introduction: Two clinical studies 3 CALGB CALGB β 1 1 β β β
3 Infinite mixture models This data could be analysed using two infinite mixture models CALGB8881: f 1 (y) = j=1 w (1) j k(y θ j ) and CALGB916: f 2 (y) = j=1 w (2) j k(y θ j ) where w (k) j > for j = 1, 2,... and j=1 w (k) j = 1 for k = 1, 2. k(y θ) is a p.d.f. for y with parameter θ. We need to put a prior on the w (1), w (2) and θ (random probability measure).
4 Some dependent random probability measures: stick-breaking θ are i.i.d. and w (k) j = V (k) j ( i<j 1 V (k) i ) Hierarchical Dirichlet ( Process (Teh et al, 26): Be (α β j, α 1 )) j l=1 β l, V (k) j β j Be(1, γ), j 1 β j = β j (1 β l ), ( ) Probit stick-breaking processes, etc.: V (1) j, V (2) j are ( ) correlated and independent of V (1) i, V (2) i for i j. l=1
5 Completely random measures µ is a completely random measure (CRM) on Θ if, for any disjoint subsets A 1,..., A n, µ(a 1 )..., µ(a n ) are mutually independent. We concentrate on completely random measures (CRM s) which can be represented in terms of jump sizes J i and jump locations θ i as µ = J i δ θi where δ is Dirac s delta function and have Lévy-Khintchine representation E [e ] f (θ) µ(dθ) = e [1 e sf (θ) ]α(dθ)ρ(ds) i=1 where α and ρ are measures for which α(dθ) <.
6 J Completely random measures Poisson process with intensity α(dθ)ρ(ds)
7 Examples of CRM s Many processes that we use in Bayesian nonparametrics are CRM s Gamma process - ρ(ds) = s 1 exp{ s} ds. Beta process - ρ(ds) = βs 1 (1 s) β 1 ds. or can be derived from CRM s Normalizing a Gamma process, i.e. taking p = µ/ µ(θ), leads to a Dirichlet process. A beta process prior for p 1, p 2,... can be used to define an Indian buffet process.
8 Vectors of CRMs It is useful to define d related CRM s. Suppose that µ 1,..., µ d are CRM s on Θ with marginal Lévy intensities ν j (ds, dθ) = ν j (ds)α(dθ) Then µ 1,..., µ d are a vector of CRM s if there is a Lévy-Khintchine representation of the form [ ] E e µ(f 1) µ d (f d ) = e ψ ρ,d (f 1,...,f d ) where ψρ,d(f 1,..., f d ) = and (R + ) d [ 1 e s 1f 1 (θ) s d f d (θ) ] α(dθ)ρ d (ds 1,... ds d ) ν j (ds) = ρ d (ds 1,..., ds d ).
9 Compound Random Measures: Definition A compound random measure (CoRM) is a vector of CRM s with intensity ( ρ d (ds 1,..., ds d ) = z d s1 h z,..., s ) d ds 1... ds d ν (dz) z where s 1,..., s d are called scores. H is a score distribution with density h. ν is the Lévy intensity of a directing Lévy process. which satisfies the condition min(1, s )z d h ( s 1 z,..., s ) d z ν (dz) < where s is the Euclidean norm of the vector s = (s 1,..., s d ).
10 A representation of a CoRM Realizations of a CoRM can be expressed as µ j = m j,i J i δ θi i=1 where m 1,i,..., m d,i i.i.d. H η = i=1 J i δ θi is a CRM with Lévy intensity ν (ds) α(dθ).
11 CoRMs with independent gamma scores We will concentrate on the class of CoRMs for which h(s 1 /z,..., s d /z) = d f (s j /z) where f is the p.d.f. of a gamma distribution with shape φ, f (x) = 1 Γ(φ) x φ 1 exp{ x}. j=1
12 Properties of CoRMs with independent score distributions The Lévy copula can be expressed as a univariate integral. Let Mz(t) f = e ts z 1 f (s/z)ds be the moment generating function of z 1 f (s/z) then [1 ] ψ ρ,d (λ 1,..., λ d ) = e s 1 λ 1 s d λ d ρd (ds 1,... ds d ) (R + ) d = ψ ρ,d (λ 1,..., λ d ) = 1 d Mz( λ f j ) ν (z)dz This expression can be used to calculate quantities such as Corr( µ k (A), µ m (A)). j=1
13 CoRMs with gamma distributed scores Consider a CoRM process with independent Ga(φ, 1) distributed scores. If the CoRM process has gamma process marginals then ρ d (s 1,..., s d ) = ( d j=1 s j) φ 1 [Γ(φ)] d 1 s dφ+1 2 e s 2 W (d 2)φ+1 ( s ), dφ 2 2 (1) where s = s s d and W is the Whittaker function. If the CoRM process has σ-stable process marginals then ρ d (s 1,..., s d ) = ( d j=1 s j) φ 1 σγ(σ + dφ) [Γ(φ)] d 1 Γ(σ + φ)γ(1 σ) s σ dφ. (2)
14 CoRMs with exponentially distributed scores Consider a CoRM process with independent exponentially distributed scores. If the CoRM has gamma process marginals we recover the multivariate Lévy intensity of Leisen et al (213), ρ d (s 1,..., s d ) = d 1 j= (d 1)! (d 1 j)! s j 1 e s. Otherwise, if σ-stable marginals are considered then we recover the multivariate vector introduced in Leisen and Lijoi (211) and Zhu and Leisen (214), ρ d (s 1,..., s d ) = (σ) d Γ(1 σ) s σ d.
15 CoRMs with independent gamma scores: specific marginals The Lévy intensity of µ j ν j (ds) = z 1 f (s/z)ds ν (dz) = ν(ds). If we have independent gamma scores, the directing Lévy intensity ν is linked to the marginal Lévy intensity by ( ) ( ) 1 Γ(φ) ν = t 2 φ L 1 ν(s) (t) t sφ 1 where L 1 is the inverse Laplace transform.
16 CoRMs with independent gamma scores: specific marginals The intensity of the directing Lévy process is ν (z) = z 1 (1 z) φ 1, < z < 1 leads to a marginal gamma process for which ν(s) = s 1 exp{ s}, s > Remarks ν is the the Lévy intensity of a beta process. If ν is the Lévy intensity of a Stable-Beta process (Teh and Görür, 29), the marginal process is a generalized gamma process.
17 NCoRM: Gamma marginal, φ = 1 1 DLP Dim. 1 Dim. 2 Dim
18 NCoRM: Gamma marginal, φ = 1.1 DLP Dim. 1 Dim. 2 Dim
19 NCoRM: Gamma marginal, φ = 5 8 x 1 3 DLP Dim. 1 Dim. 2 Dim
20 Marginal beta process A CoRM with beta process marginals (ν(s) = βs 1 (1 s) β 1 ) can be constructed using A beta score distribution with parameters α and 1 A directing Lévy intensity ν (z) = βz 1 (1 z) β 1 + β(β 1) (1 z) β 2 α i.e. a superposition of a beta process and a compound Poisson process with beta jump distribution.
21 Links to other processes Other processes can be expressed as CoRM s: Superpositions/Thinning: e.g. Griffin et al (213), Chen et al (214), Lijoi and Nipoti (214), Lijoi et al (214a, b) using mixture score distributions h(s) = πδ s= + (1 π)h (s). Lévy copulae: e.g. Leisen and Lijoi (211), Leisen et al (213), Zhu and Leisen (214).
22 Normalized Compound Random Measures (NCoRM) A vector of random probability measures can be defined by normalizing each dimension of the CoRM so that p k = µ k µ k (Θ) = w (k) j δ θj. j=1
23 CoRMs on more general spaces For a more general space X, we define µ( ; x) to be a completely random measure for x X. The collection { µ( ; x) x X} can be given a CoRM prior with µ( ; x) = m j (x) J j δ θj j=1 where m k (x) is a realisation of a random process on X. Example X = R p, m k (x) = exp{r k (x)} where r k (x) is given a zero-mean Gaussian process prior (see Ranganath and Blei, 215).
24 Inference with NCoRM s We assume that the data are (x 1, y 1 ),..., (x n, y n ) and are modelled as y i ζ i ind. k(y i ζ i ), ζ i p( ; x i ) = µ( ; x i) µ(θ; x i ), i = 1, 2,..., n where k(y θ) is a probability density function for y with parameter θ and {p( ; x) x X} is given an NCoRM prior.
25 MCMC inference for infinite mixture models Introducing allocation variables c 1,..., c n, the posterior is proportional [ n ] J ci m ci (x i ) p(y, c m, J, θ) = k (y i θ ci ) l=1 J. l m l (x i ) i=1 This form is not tractable due to the infinite sum in the denominator of each term. This can be addressed using the identity { 1 } l=1 J l m l (x i ) = exp v i J l m l (x i ) dv i l=1
26 MCMC inference for infinite mixture models Introducing latent variables v i leads to a suitable form of augmented posterior for MCMC p(y, c, v m, J, θ) [ { n = k (y i θ ci ) J ci m ci (x i ) exp v i = i=1 K j=1 {i c i =j} k ( y i θ j ) J a j j {i c i =j} l=1 J l m l (x i ) m j (x i ) exp { }] l=1 J l } n v i m l (x i ) i=1 where there are K distinct values of c i and a j = n i=1 I(c i = j).
27 MCMC inference for infinite mixture models: Finite X, independent scores In this case, we can define a marginal sampler (e.g. Favaro and Teh, 213) by integrating over J and m. J a ν (J) dj is typical for marginal samplers of normalized random measure mixtures. Integrals of {i c i =j} m j(x i ) will be a product of moments of the scored distribution. E[exp { l=1 J l n i=1 v i m l (x i ) } ] can be evaluated either exactly or as a univariate integral.
28 MCMC inference for infinite mixture models: General X Pseudo-marginal methods (Andrieu and Roberts, 29) are useful for a target density of the form π(θ) f (θ) g(θ) where g(θ) cannot be directly evaluated. Samples from the target density ˆπ(θ) f (θ) ĝ(θ) where E[ĝ(θ)] = g(θ) will have the distribution π. In our target, the problem is evaluating E[exp { l=1 J l n i=1 v i m l (x i ) } ] = exp{ ψ(v)}
29 Unbiased estimation of the Laplace transform The Poisson estimator (see Papaspiliopoulos, 211) of L φ = exp { D φ(x) dx} is ˆL φ = K i=1 ( 1 φ(x ) i) a C κ(x i ) where κ is a p.d.f. on D, C > φ(x) κ(x) for x D, a > 1, i.i.d. K Pn(a C) and x i κ. Then, { } E[ˆL φ ] = exp φ(x) dx D and V[ˆL φ ] = L 2 φ ( { 1 φ(x) 2 } ) exp a C D κ(x) dx 1 <.
30 Unbiased estimation of exp{ ψ(v)} Assuming that x 1, x 2,..., x n are distinct, mi = m(x i ) and m = (m1,..., m n), exp{ ψ ρ,d (v)} can be re-expressed as { ( { }) } n n exp 1 exp z v i mi h(m ) ν (z) dz dm = (R + ) n where { L k = exp (R + ) n i=1 v k m k h(m ) exp { t k=1 L k } } n v i mi T ν (t) dt dm i=1 and T ν (t) = t ν (z) dz (tail mass function).
31 Unbiased estimation of the Laplace transform L k can be estimated using the Poisson estimator with x = (z, m k ), D = (, ) (R+ ) n and φ(z, m k ) = v k m k h(m k) exp { t } n v i mk T ν (t) <. i=1 A suitable approximating density is κ(z, m k ) = κ ν(z) m k h(m k ) E[m k ] where κ ν (z) > T ν (z) for all z R +.
32 A sampler for more general processes A pseudo-marginal sampler is used with exp{ ψ ρ,d (v)} estimated by the Poisson estimator. The jumps are not integrated out and values for empty clusters are proposed from h(m, J) h(m 1 /z,..., m K /z)z exp{ vz}ν (z). An interweaving scheme for m and z (Yu and Meng, 211).
33 Two clinical studies 3 CALGB CALGB β 1 1 β β β
34 Two clinical studies: Posterior mean densities Results using a CoRM with independent gamma scores. CALGB8881 CALGB β 1 1 β β β
35 Example: Nonparametric regression We consider the classic motorcycle data which records head acceleration at different times after impact. where w j = f (y) = exp{r k (x)}j k m=1 exp{rm(x)}jm w j (x)n(y µ j, σj 2 ) j=1 r m (x) are given independent Gaussian process prior with squared exponential covariance function. J 1, J 2,... follow a Gamma process with Lévy intensity M x 1 exp{ x}.
36
37 12 M 12? 4 L
38 Example: Nonparametric variable selection The classic Boston housing data record the median value of owner-occupied homes in 56 areas of Boston and the values of 14 attributes that are thought to effect house prices. The covariance function k(x, x ) = exp{ p i=1 w j(x j x j )2 } and p(w j ) (1 + w j ) 1.
39 CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSTAT Posterior median and 95% credible intervals for w j
40 Summary CoRM processes are a unifying framework for a wide-range of proposed vectors of CRMs. CoRM process are vectors of CRM s which are constructed in terms of a (univariate) CRM and a distribution (which defines the dependence). Several MCMC methods for NCoRM mixture models are developed. These include methods which depend on the availability of analytical forms for some integrals with respect to the score distribution and methods which do not. Modelling dependence through distributions allows a wide-range of dependent nonparametric models to be developed (e.g. regression, time series, etc.).
41 References C. Andrieu and G. O. Roberts (29). The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Statist., 37, S. Favaro and Y. W. Teh (213). MCMC for Normalized Random Measure Mixture Models. Statistical Science, 28, J. E. Griffin, M. Kolossiatis and M. F. J. Steel (213). Comparing Distributions By Using Dependent Normalized Random-Measure Mixtures. Journal of the Royal Statistical Society, Series B, 75, F. Leisen and A. Lijoi (211). Vectors of Poisson-Dirichlet processes. J. Multivariate Anal., 12, F. Leisen, A. Lijoi and D. Spano (213). A Vector of Dirichlet processes. Electronic Journal of Statistics 7, A. Lijoi, and B. Nipoti (214), A class of hazard rate mixtures for combining survival data from different experiments. Journal of the American Statistical Association, 19, A. Lijoi, B. Nipoti and I. Prünster (214a), Bayesian inference with dependent normalized completely random measures, Bernoulli, 2, A. Lijoi, B. Nipoti and I. Prünster (214b), Dependent mixture models: clustering and borrowing information, Computational Statistics and Data Analysis, 71,
42 References O. Papaspiliopoulos (211). A methodological framework for Monte Carlo probabilistic inference for diffusion processes. In Bayesian Time Series Models (D. Barber, A. Taylan Cemgil and S. Chippia, Eds), Cambridge University Press. R. Ranganath and D. M. Blei (215). Correlated Random Measures. arxiv: Y. W. Teh and D. Görür (29). Indian Buffet Proceses with Power-law Behavior. In Advances in Neural Information Processing Systems 22 (Y. Bengio, D. Schuurmans, J. D. Lafferty, C. K. I. Williams and A. Culotta, Eds.), Y. W. Teh, M. I. Jordan, M. J. Beal and D. M. Blei (26). Hierarchical Dirichlet processes. Journal of the American Statistical Association, 11, Y. Yu and X.-L. Meng (211). To Center or Not to Center: That is Not the Question An Ancillarity-Sufficiency Interweaving Strategy (ASIS) for Boosting MCMC Efficiency. Journal of Computational and Graphical Statistics, 2, W. Zhu and F. Leisen (214). A multivariate extension of a vector of Poisson-Dirichlet processes. To appear in the Journal of Nonparametric Statistics.
Modelling and computation using NCoRM mixtures for. density regression
Modelling and computation using NCoRM mixtures for density regression arxiv:168.874v3 [stat.me] 31 Aug 217 Jim E. Griffin and Fabrizio Leisen University of Kent Abstract Normalized compound random measures
More informationTruncation error of a superposed gamma process in a decreasing order representation
Truncation error of a superposed gamma process in a decreasing order representation B julyan.arbel@inria.fr Í www.julyanarbel.com Inria, Mistis, Grenoble, France Joint work with Igor Pru nster (Bocconi
More informationDependent hierarchical processes for multi armed bandits
Dependent hierarchical processes for multi armed bandits Federico Camerlenghi University of Bologna, BIDSA & Collegio Carlo Alberto First Italian meeting on Probability and Mathematical Statistics, Torino
More informationOn the posterior structure of NRMI
On the posterior structure of NRMI Igor Prünster University of Turin, Collegio Carlo Alberto and ICER Joint work with L.F. James and A. Lijoi Isaac Newton Institute, BNR Programme, 8th August 2007 Outline
More informationA Brief Overview of Nonparametric Bayesian Models
A Brief Overview of Nonparametric Bayesian Models Eurandom Zoubin Ghahramani Department of Engineering University of Cambridge, UK zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin Also at Machine
More informationBayesian nonparametric models for bipartite graphs
Bayesian nonparametric models for bipartite graphs François Caron Department of Statistics, Oxford Statistics Colloquium, Harvard University November 11, 2013 F. Caron 1 / 27 Bipartite networks Readers/Customers
More informationTruncation error of a superposed gamma process in a decreasing order representation
Truncation error of a superposed gamma process in a decreasing order representation Julyan Arbel Inria Grenoble, Université Grenoble Alpes julyan.arbel@inria.fr Igor Prünster Bocconi University, Milan
More informationNormalized kernel-weighted random measures
Normalized kernel-weighted random measures Jim Griffin University of Kent 1 August 27 Outline 1 Introduction 2 Ornstein-Uhlenbeck DP 3 Generalisations Bayesian Density Regression We observe data (x 1,
More informationBayesian nonparametric latent feature models
Bayesian nonparametric latent feature models Indian Buffet process, beta process, and related models François Caron Department of Statistics, Oxford Applied Bayesian Statistics Summer School Como, Italy
More informationA marginal sampler for σ-stable Poisson-Kingman mixture models
A marginal sampler for σ-stable Poisson-Kingman mixture models joint work with Yee Whye Teh and Stefano Favaro María Lomelí Gatsby Unit, University College London Talk at the BNP 10 Raleigh, North Carolina
More informationDependent Random Measures and Prediction
Dependent Random Measures and Prediction Igor Prünster University of Torino & Collegio Carlo Alberto 10th Bayesian Nonparametric Conference Raleigh, June 26, 2015 Joint wor with: Federico Camerlenghi,
More informationOn the Truncation Error of a Superposed Gamma Process
On the Truncation Error of a Superposed Gamma Process Julyan Arbel and Igor Prünster Abstract Completely random measures (CRMs) form a key ingredient of a wealth of stochastic models, in particular in
More informationSlice Sampling Mixture Models
Slice Sampling Mixture Models Maria Kalli, Jim E. Griffin & Stephen G. Walker Centre for Health Services Studies, University of Kent Institute of Mathematics, Statistics & Actuarial Science, University
More informationBayesian Nonparametric Autoregressive Models via Latent Variable Representation
Bayesian Nonparametric Autoregressive Models via Latent Variable Representation Maria De Iorio Yale-NUS College Dept of Statistical Science, University College London Collaborators: Lifeng Ye (UCL, London,
More informationOutline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution
Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model
More informationBayesian Nonparametrics: Dirichlet Process
Bayesian Nonparametrics: Dirichlet Process Yee Whye Teh Gatsby Computational Neuroscience Unit, UCL http://www.gatsby.ucl.ac.uk/~ywteh/teaching/npbayes2012 Dirichlet Process Cornerstone of modern Bayesian
More informationBayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother
Bayesian Nonparametric Modelling with the Dirichlet Process Regression Smoother J. E. Griffin and M. F. J. Steel University of Warwick Bayesian Nonparametric Modelling with the Dirichlet Process Regression
More informationVariational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures
17th Europ. Conf. on Machine Learning, Berlin, Germany, 2006. Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures Shipeng Yu 1,2, Kai Yu 2, Volker Tresp 2, and Hans-Peter
More informationBayesian Modeling of Conditional Distributions
Bayesian Modeling of Conditional Distributions John Geweke University of Iowa Indiana University Department of Economics February 27, 2007 Outline Motivation Model description Methods of inference Earnings
More informationSlice sampling σ stable Poisson Kingman mixture models
ISSN 2279-9362 Slice sampling σ stable Poisson Kingman mixture models Stefano Favaro S.G. Walker No. 324 December 2013 www.carloalberto.org/research/working-papers 2013 by Stefano Favaro and S.G. Walker.
More informationNon-Parametric Bayes
Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian
More informationImage segmentation combining Markov Random Fields and Dirichlet Processes
Image segmentation combining Markov Random Fields and Dirichlet Processes Jessica SODJO IMS, Groupe Signal Image, Talence Encadrants : A. Giremus, J.-F. Giovannelli, F. Caron, N. Dobigeon Jessica SODJO
More informationBayesian non-parametric model to longitudinally predict churn
Bayesian non-parametric model to longitudinally predict churn Bruno Scarpa Università di Padova Conference of European Statistics Stakeholders Methodologists, Producers and Users of European Statistics
More informationSTAT Advanced Bayesian Inference
1 / 32 STAT 625 - Advanced Bayesian Inference Meng Li Department of Statistics Jan 23, 218 The Dirichlet distribution 2 / 32 θ Dirichlet(a 1,...,a k ) with density p(θ 1,θ 2,...,θ k ) = k j=1 Γ(a j) Γ(
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationUnsupervised Learning
Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College
More informationOn Reparametrization and the Gibbs Sampler
On Reparametrization and the Gibbs Sampler Jorge Carlos Román Department of Mathematics Vanderbilt University James P. Hobert Department of Statistics University of Florida March 2014 Brett Presnell Department
More informationBayesian Nonparametric Models for Ranking Data
Bayesian Nonparametric Models for Ranking Data François Caron 1, Yee Whye Teh 1 and Brendan Murphy 2 1 Dept of Statistics, University of Oxford, UK 2 School of Mathematical Sciences, University College
More informationBayesian nonparametric models for bipartite graphs
Bayesian nonparametric models for bipartite graphs François Caron INRIA IMB - University of Bordeaux Talence, France Francois.Caron@inria.fr Abstract We develop a novel Bayesian nonparametric model for
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationGaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012
Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationDependent mixture models: clustering and borrowing information
ISSN 2279-9362 Dependent mixture models: clustering and borrowing information Antonio Lijoi Bernardo Nipoti Igor Pruenster No. 32 June 213 www.carloalberto.org/research/working-papers 213 by Antonio Lijoi,
More informationAsymptotics for posterior hazards
Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and
More informationProbabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Infinite Feature Models: The Indian Buffet Process Eric Xing Lecture 21, April 2, 214 Acknowledgement: slides first drafted by Sinead Williamson
More informationBayesian nonparametric models of sparse and exchangeable random graphs
Bayesian nonparametric models of sparse and exchangeable random graphs F. Caron & E. Fox Technical Report Discussion led by Esther Salazar Duke University May 16, 2014 (Reading group) May 16, 2014 1 /
More informationNonparametric Bayesian Methods - Lecture I
Nonparametric Bayesian Methods - Lecture I Harry van Zanten Korteweg-de Vries Institute for Mathematics CRiSM Masterclass, April 4-6, 2016 Overview of the lectures I Intro to nonparametric Bayesian statistics
More informationLecture 3a: Dirichlet processes
Lecture 3a: Dirichlet processes Cédric Archambeau Centre for Computational Statistics and Machine Learning Department of Computer Science University College London c.archambeau@cs.ucl.ac.uk Advanced Topics
More informationGAUSSIAN PROCESS REGRESSION
GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The
More informationA Nonparametric Model for Stationary Time Series
A Nonparametric Model for Stationary Time Series Isadora Antoniano-Villalobos Bocconi University, Milan, Italy. isadora.antoniano@unibocconi.it Stephen G. Walker University of Texas at Austin, USA. s.g.walker@math.utexas.edu
More informationBayesian Nonparametric Models
Bayesian Nonparametric Models David M. Blei Columbia University December 15, 2015 Introduction We have been looking at models that posit latent structure in high dimensional data. We use the posterior
More informationDirichlet Processes: Tutorial and Practical Course
Dirichlet Processes: Tutorial and Practical Course (updated) Yee Whye Teh Gatsby Computational Neuroscience Unit University College London August 2007 / MLSS Yee Whye Teh (Gatsby) DP August 2007 / MLSS
More informationA Review of Pseudo-Marginal Markov Chain Monte Carlo
A Review of Pseudo-Marginal Markov Chain Monte Carlo Discussed by: Yizhe Zhang October 21, 2016 Outline 1 Overview 2 Paper review 3 experiment 4 conclusion Motivation & overview Notation: θ denotes the
More informationFlexible Bayesian Nonparametric Priors and Bayesian Computational Methods
Universidad Carlos III de Madrid Repositorio institucional e-archivo Tesis http://e-archivo.uc3m.es Tesis Doctorales 2016-02-29 Flexible Bayesian Nonparametric Priors and Bayesian Computational Methods
More informationIntegrated Non-Factorized Variational Inference
Integrated Non-Factorized Variational Inference Shaobo Han, Xuejun Liao and Lawrence Carin Duke University February 27, 2014 S. Han et al. Integrated Non-Factorized Variational Inference February 27, 2014
More informationIntroduction to Probabilistic Machine Learning
Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning
More informationVariational Scoring of Graphical Model Structures
Variational Scoring of Graphical Model Structures Matthew J. Beal Work with Zoubin Ghahramani & Carl Rasmussen, Toronto. 15th September 2003 Overview Bayesian model selection Approximations using Variational
More informationCS Lecture 19. Exponential Families & Expectation Propagation
CS 6347 Lecture 19 Exponential Families & Expectation Propagation Discrete State Spaces We have been focusing on the case of MRFs over discrete state spaces Probability distributions over discrete spaces
More informationMCMC for big data. Geir Storvik. BigInsight lunch - May Geir Storvik MCMC for big data BigInsight lunch - May / 17
MCMC for big data Geir Storvik BigInsight lunch - May 2 2018 Geir Storvik MCMC for big data BigInsight lunch - May 2 2018 1 / 17 Outline Why ordinary MCMC is not scalable Different approaches for making
More informationBayesian Hidden Markov Models and Extensions
Bayesian Hidden Markov Models and Extensions Zoubin Ghahramani Department of Engineering University of Cambridge joint work with Matt Beal, Jurgen van Gael, Yunus Saatci, Tom Stepleton, Yee Whye Teh Modeling
More informationA Fully Nonparametric Modeling Approach to. BNP Binary Regression
A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation
More informationDensity Estimation. Seungjin Choi
Density Estimation Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr http://mlg.postech.ac.kr/
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationDirichlet Enhanced Latent Semantic Analysis
Dirichlet Enhanced Latent Semantic Analysis Kai Yu Siemens Corporate Technology D-81730 Munich, Germany Kai.Yu@siemens.com Shipeng Yu Institute for Computer Science University of Munich D-80538 Munich,
More informationThe Variational Gaussian Approximation Revisited
The Variational Gaussian Approximation Revisited Manfred Opper Cédric Archambeau March 16, 2009 Abstract The variational approximation of posterior distributions by multivariate Gaussians has been much
More information19 : Bayesian Nonparametrics: The Indian Buffet Process. 1 Latent Variable Models and the Indian Buffet Process
10-708: Probabilistic Graphical Models, Spring 2015 19 : Bayesian Nonparametrics: The Indian Buffet Process Lecturer: Avinava Dubey Scribes: Rishav Das, Adam Brodie, and Hemank Lamba 1 Latent Variable
More informationTopic Modelling and Latent Dirichlet Allocation
Topic Modelling and Latent Dirichlet Allocation Stephen Clark (with thanks to Mark Gales for some of the slides) Lent 2013 Machine Learning for Language Processing: Lecture 7 MPhil in Advanced Computer
More informationDiscussion of On simulation and properties of the stable law by L. Devroye and L. James
Stat Methods Appl (2014) 23:371 377 DOI 10.1007/s10260-014-0269-4 Discussion of On simulation and properties of the stable law by L. Devroye and L. James Antonio Lijoi Igor Prünster Accepted: 16 May 2014
More informationBayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference
1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the Federal Reserve Board of Governors or the Federal Reserve System. Bayesian Estimation of DSGE
More informationMarkov Chain Monte Carlo (MCMC)
Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can
More informationConstruction of Dependent Dirichlet Processes based on Poisson Processes
1 / 31 Construction of Dependent Dirichlet Processes based on Poisson Processes Dahua Lin Eric Grimson John Fisher CSAIL MIT NIPS 2010 Outstanding Student Paper Award Presented by Shouyuan Chen Outline
More informationNon-parametric Clustering with Dirichlet Processes
Non-parametric Clustering with Dirichlet Processes Timothy Burns SUNY at Buffalo Mar. 31 2009 T. Burns (SUNY at Buffalo) Non-parametric Clustering with Dirichlet Processes Mar. 31 2009 1 / 24 Introduction
More informationModeling conditional distributions with mixture models: Applications in finance and financial decision-making
Modeling conditional distributions with mixture models: Applications in finance and financial decision-making John Geweke University of Iowa, USA Journal of Applied Econometrics Invited Lecture Università
More information13: Variational inference II
10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational
More informationPractical Bayesian Optimization of Machine Learning. Learning Algorithms
Practical Bayesian Optimization of Machine Learning Algorithms CS 294 University of California, Berkeley Tuesday, April 20, 2016 Motivation Machine Learning Algorithms (MLA s) have hyperparameters that
More informationBayes methods for categorical data. April 25, 2017
Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,
More informationStat 542: Item Response Theory Modeling Using The Extended Rank Likelihood
Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood Jonathan Gruhl March 18, 2010 1 Introduction Researchers commonly apply item response theory (IRT) models to binary and ordinal
More informationAsymptotics for posterior hazards
Asymptotics for posterior hazards Igor Prünster University of Turin, Collegio Carlo Alberto and ICER Joint work with P. Di Biasi and G. Peccati Workshop on Limit Theorems and Applications Paris, 16th January
More informationBayesian Nonparametrics for Speech and Signal Processing
Bayesian Nonparametrics for Speech and Signal Processing Michael I. Jordan University of California, Berkeley June 28, 2011 Acknowledgments: Emily Fox, Erik Sudderth, Yee Whye Teh, and Romain Thibaux Computer
More informationPriors for Random Count Matrices with Random or Fixed Row Sums
Priors for Random Count Matrices with Random or Fixed Row Sums Mingyuan Zhou Joint work with Oscar Madrid and James Scott IROM Department, McCombs School of Business Department of Statistics and Data Sciences
More informationDistance dependent Chinese restaurant processes
David M. Blei Department of Computer Science, Princeton University 35 Olden St., Princeton, NJ 08540 Peter Frazier Department of Operations Research and Information Engineering, Cornell University 232
More informationNon-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources
th International Conference on Information Fusion Chicago, Illinois, USA, July -8, Non-parametric Bayesian Modeling and Fusion of Spatio-temporal Information Sources Priyadip Ray Department of Electrical
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationLecture: Gaussian Process Regression. STAT 6474 Instructor: Hongxiao Zhu
Lecture: Gaussian Process Regression STAT 6474 Instructor: Hongxiao Zhu Motivation Reference: Marc Deisenroth s tutorial on Robot Learning. 2 Fast Learning for Autonomous Robots with Gaussian Processes
More informationMarkov Chain Monte Carlo methods
Markov Chain Monte Carlo methods Tomas McKelvey and Lennart Svensson Signal Processing Group Department of Signals and Systems Chalmers University of Technology, Sweden November 26, 2012 Today s learning
More informationBayesian Nonparametric Learning of Complex Dynamical Phenomena
Duke University Department of Statistical Science Bayesian Nonparametric Learning of Complex Dynamical Phenomena Emily Fox Joint work with Erik Sudderth (Brown University), Michael Jordan (UC Berkeley),
More informationStable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence
Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham NC 778-5 - Revised April,
More informationarxiv: v2 [math.st] 22 Apr 2016
arxiv:1410.6843v2 [math.st] 22 Apr 2016 Posteriors, conjugacy, and exponential families for completely random measures Tamara Broderick Ashia C. Wilson Michael I. Jordan April 25, 2016 Abstract We demonstrate
More informationLinear Classification
Linear Classification Lili MOU moull12@sei.pku.edu.cn http://sei.pku.edu.cn/ moull12 23 April 2015 Outline Introduction Discriminant Functions Probabilistic Generative Models Probabilistic Discriminative
More informationInfinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix
Infinite-State Markov-switching for Dynamic Volatility Models : Web Appendix Arnaud Dufays 1 Centre de Recherche en Economie et Statistique March 19, 2014 1 Comparison of the two MS-GARCH approximations
More informationPseudo-marginal MCMC methods for inference in latent variable models
Pseudo-marginal MCMC methods for inference in latent variable models Arnaud Doucet Department of Statistics, Oxford University Joint work with George Deligiannidis (Oxford) & Mike Pitt (Kings) MCQMC, 19/08/2016
More informationarxiv: v1 [stat.ml] 20 Nov 2012
A survey of non-exchangeable priors for Bayesian nonparametric models arxiv:1211.4798v1 [stat.ml] 20 Nov 2012 Nicholas J. Foti 1 and Sinead Williamson 2 1 Department of Computer Science, Dartmouth College
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationOn Simulations form the Two-Parameter. Poisson-Dirichlet Process and the Normalized. Inverse-Gaussian Process
On Simulations form the Two-Parameter arxiv:1209.5359v1 [stat.co] 24 Sep 2012 Poisson-Dirichlet Process and the Normalized Inverse-Gaussian Process Luai Al Labadi and Mahmoud Zarepour May 8, 2018 ABSTRACT
More informationBayesian Nonparametrics
Bayesian Nonparametrics Peter Orbanz Columbia University PARAMETERS AND PATTERNS Parameters P(X θ) = Probability[data pattern] 3 2 1 0 1 2 3 5 0 5 Inference idea data = underlying pattern + independent
More informationStatistical Approaches to Learning and Discovery
Statistical Approaches to Learning and Discovery Bayesian Model Selection Zoubin Ghahramani & Teddy Seidenfeld zoubin@cs.cmu.edu & teddy@stat.cmu.edu CALD / CS / Statistics / Philosophy Carnegie Mellon
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationSingle Index Quantile Regression for Heteroscedastic Data
Single Index Quantile Regression for Heteroscedastic Data E. Christou M. G. Akritas Department of Statistics The Pennsylvania State University SMAC, November 6, 2015 E. Christou, M. G. Akritas (PSU) SIQR
More informationHierarchical Modeling for Univariate Spatial Data
Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This
More informationNonparametric Bayesian modeling for dynamic ordinal regression relationships
Nonparametric Bayesian modeling for dynamic ordinal regression relationships Athanasios Kottas Department of Applied Mathematics and Statistics, University of California, Santa Cruz Joint work with Maria
More informationSimulating Random Variables
Simulating Random Variables Timothy Hanson Department of Statistics, University of South Carolina Stat 740: Statistical Computing 1 / 23 R has many built-in random number generators... Beta, gamma (also
More informationSTAT 518 Intro Student Presentation
STAT 518 Intro Student Presentation Wen Wei Loh April 11, 2013 Title of paper Radford M. Neal [1999] Bayesian Statistics, 6: 475-501, 1999 What the paper is about Regression and Classification Flexible
More informationIntroduction to Gaussian Processes
Introduction to Gaussian Processes Neil D. Lawrence GPSS 10th June 2013 Book Rasmussen and Williams (2006) Outline The Gaussian Density Covariance from Basis Functions Basis Function Representations Constructing
More informationTwo Useful Bounds for Variational Inference
Two Useful Bounds for Variational Inference John Paisley Department of Computer Science Princeton University, Princeton, NJ jpaisley@princeton.edu Abstract We review and derive two lower bounds on the
More informationStat 451 Lecture Notes Numerical Integration
Stat 451 Lecture Notes 03 12 Numerical Integration Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 5 in Givens & Hoeting, and Chapters 4 & 18 of Lange 2 Updated: February 11, 2016 1 / 29
More informationMotivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University
Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined
More informationChapter 5 continued. Chapter 5 sections
Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationMultivariate Normal & Wishart
Multivariate Normal & Wishart Hoff Chapter 7 October 21, 2010 Reading Comprehesion Example Twenty-two children are given a reading comprehsion test before and after receiving a particular instruction method.
More informationDistance-Based Probability Distribution for Set Partitions with Applications to Bayesian Nonparametrics
Distance-Based Probability Distribution for Set Partitions with Applications to Bayesian Nonparametrics David B. Dahl August 5, 2008 Abstract Integration of several types of data is a burgeoning field.
More informationSparse Bayesian Nonparametric Regression
François Caron caronfr@cs.ubc.ca Arnaud Doucet arnaud@cs.ubc.ca Departments of Computer Science and Statistics, University of British Columbia, Vancouver, Canada Abstract One of the most common problems
More information