Particle Learning for General Mixtures


1 Particle Learning for General Mixtures. Hedibert Freitas Lopes, Booth School of Business, University of Chicago, and Dipartimento di Scienze delle Decisioni, Università Bocconi, Milano. Joint work with Nicholas Polson, Carlos Carvalho and Matt Taddy.

2 Contributions. 1. Efficiently and sequentially learn from general mixture models f(z) = ∫ k(z; θ) dG(θ), where G is almost surely discrete. 2. By general we mean: finite mixture models, Dirichlet process mixture models, Indian buffet processes, and probit stick-breaking models. 3. An alternative to MCMC: on-line model fitting and marginal likelihood estimation, posterior cluster allocation, and handling of high-dimensional data sets.

3 Particle Learning (PL). 1. General framework for sequential parameter learning. 2. Practical alternative to MCMC (Chen and Liu, 1999). 3. Sequential Monte Carlo model assessment. 4. Resample-sample is key (Pitt and Shephard, 1999). 5. The essential state vector generalizes sufficient statistics (Storvik, 2002; Fearnhead, 2002). 6. Connection to Rao-Blackwellization (Kong, Liu and Wong, 1994). 7. PL is not sequential importance sampling. 8. Smoothing is done offline.

4 The general PL algorithm.
Posterior at time t: Φ_t = {(x_t, θ)^(i)}_{i=1}^N ~ p(x_t, θ | y^t).
Compute, for i = 1, ..., N, the weights w^(i)_{t+1} ∝ p(y_{t+1} | x_t^(i), θ^(i)).
Resample (x_t, θ)^(i) from Φ_t with weights w_{t+1}; then, for each resampled particle:
Propagate states: x^(i)_{t+1} ~ p(x_{t+1} | x_t^(i), θ^(i), y_{t+1}).
Update sufficient statistics: s^(i)_{t+1} = S(s_t^(i), x^(i)_{t+1}, y_{t+1}).
Sample parameters: θ^(i) ~ p(θ | s^(i)_{t+1}).
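A minimal Python/NumPy sketch of this resample-propagate cycle follows. The callables predictive, propagate_state, update_suffstats and sample_theta are hypothetical placeholders for the model-specific ingredients p(y_{t+1} | x_t, θ), p(x_{t+1} | x_t, θ, y_{t+1}), the recursion S(·), and p(θ | s_{t+1}); only the generic bookkeeping is spelled out.

```python
# A minimal sketch of one generic PL update, assuming the model-specific
# callables listed in the text are supplied by the user.
import numpy as np

def pl_step(particles, y_next, predictive, propagate_state,
            update_suffstats, sample_theta, rng=np.random.default_rng()):
    """One resample-propagate update of N particles (x_t, s_t, theta)."""
    N = len(particles)
    # 1. Resample with weights proportional to the predictive p(y_{t+1} | x_t, theta)
    w = np.array([predictive(y_next, p) for p in particles])
    idx = rng.choice(N, size=N, p=w / w.sum())
    resampled = [particles[i] for i in idx]
    # 2. Propagate states, update sufficient statistics, replenish theta
    new_particles = []
    for x_t, s_t, theta in resampled:
        x_new = propagate_state(x_t, theta, y_next)   # x_{t+1} ~ p(. | x_t, theta, y_{t+1})
        s_new = update_suffstats(s_t, x_new, y_next)  # s_{t+1} = S(s_t, x_{t+1}, y_{t+1})
        theta_new = sample_theta(s_new)               # theta ~ p(theta | s_{t+1})
        new_particles.append((x_new, s_new, theta_new))
    return new_particles
```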

5 Example: Nonlinear, non-normal dynamic model. Heavy-tailed observation errors can be introduced as follows: y_{t+1} = x_{t+1} + σ √λ_{t+1} ε_{t+1} and x_{t+1} = β x_t/(1 + x_t^2) + τ u_{t+1}, where λ_{t+1} ~ IG(ν/2, ν/2), ε_{t+1} and u_{t+1} are N(0, 1), and ν is known. The observation error term is non-normal: √λ_{t+1} ε_{t+1} ~ t_ν.
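As a quick illustration, the sketch below simulates data from this model; the parameter values (beta, sigma, tau, nu) are illustrative assumptions, not taken from the talk.

```python
# A simulation sketch of the heavy-tailed nonlinear model; parameter values are assumptions.
import numpy as np

def simulate(T=200, beta=0.9, sigma=0.5, tau=0.3, nu=5.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(T + 1)
    y = np.zeros(T + 1)
    for t in range(T):
        lam = 1.0 / rng.gamma(nu / 2.0, 2.0 / nu)   # lambda_{t+1} ~ IG(nu/2, nu/2)
        x[t + 1] = beta * x[t] / (1.0 + x[t] ** 2) + tau * rng.standard_normal()
        y[t + 1] = x[t + 1] + sigma * np.sqrt(lam) * rng.standard_normal()
    return y[1:], x[1:]
```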

6 Sequential inference

7 Example: Dynamic multinomial logit model. Let us study the multinomial logit model P(y_{t+1} = 1 | β_{t+1}) = exp(F_{t+1} β_{t+1}) / (1 + exp(F_{t+1} β_{t+1})), with state evolution β_{t+1} = φ β_t + σ_x ε^β_{t+1} and β_0 ~ N(0, σ_x^2/(1 − φ^2)). Scott's (2007) data augmentation structure leads to a mixture Kalman filter model: y_{t+1} = I(z_{t+1} ≥ 0) and z_{t+1} = F_{t+1} β_{t+1} + ε_{t+1}, where ε_{t+1} ~ −ln E(1). Here ε_t follows an extreme value distribution of type 1, and E(1) denotes an exponential with mean one. The key is that it is easy to simulate from p(z_{t+1} | β, y_{t+1}) using z_{t+1} = −ln( (−ln U)/(1 + e^{F_{t+1} β}) + ((−ln V)/e^{F_{t+1} β}) I(y_{t+1} = 0) ), with U and V independent uniforms.
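The latent-utility draw in the last display takes only a few lines of code. The sketch below assumes eta = F_{t+1} β_{t+1} is the current linear predictor and uses two Exp(1) draws in place of −ln U and −ln V.

```python
# A sketch of the latent-utility draw z_{t+1} ~ p(z | beta, y_{t+1}) described above.
import numpy as np

def draw_latent_utility(y, eta, rng=np.random.default_rng()):
    """z = -log( E1/(1 + e^eta) + I(y = 0) * E2/e^eta ), with E1, E2 ~ Exp(1)."""
    lam = np.exp(eta)
    w = rng.exponential() / (1.0 + lam)     # -log(U)/(1 + e^eta)
    if y == 0:
        w += rng.exponential() / lam        # add -log(V)/e^eta only when y = 0
    return -np.log(w)
```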

8 10-component mixture of normals. Frühwirth-Schnatter and Frühwirth (2007) approximate the type-1 extreme value density with a 10-component mixture of normals: p(ε_t) = e^{−ε_t} e^{−e^{−ε_t}} ≈ Σ_{j=1}^{10} w_j N(µ_j, s_j^2). Hence, conditional on an indicator λ_t, we can analyze y_t = I(z_t ≥ 0) and z_t = µ_{λ_t} + F_t β_t + s_{λ_t} ε_t, where ε_t ~ N(0, 1) and Pr(λ_t = j) = w_j. Also, s^β_{t+1} = K(s^β_t, z_{t+1}, λ_{t+1}, θ, y_{t+1}), where K(·) denotes the Kalman filter recursion for the conditional sufficient statistics, and p(y_{t+1} | s^β_t, θ) = Σ_{λ_{t+1}} p(y_{t+1} | s^β_t, λ_{t+1}, θ) for re-sampling. Propagation now requires λ_{t+1} ~ p(λ_{t+1} | (s^β_t, θ)^{k(i)}, y_{t+1}), z_{t+1} ~ p(z_{t+1} | (s^β_t, θ)^{k(i)}, λ_{t+1}, y_{t+1}), and β_{t+1} ~ p(β_{t+1} | (s^β_t, θ)^{k(i)}, λ_{t+1}, z_{t+1}).

9 Simulated exercise. Figure: time series of y_t and sequential posterior summaries for β_t, α and σ over time. PL based on 30,000 particles.

10 Example: Sequential Bayesian Lasso. We develop a sequential version of the Bayesian Lasso (Hans, 2008) for a simple signal detection problem. The model takes the form (y_t | θ_t) ~ N(θ_t, 1) and p(θ_t | τ) = (2τ)^{-1} exp(−|θ_t|/τ), for t = 1, ..., n, with τ^2 ~ IG(a_0, b_0). Data augmentation: it is easy to see that p(θ_t | τ) = ∫ p(θ_t | τ, λ_t) p(λ_t) dλ_t, where λ_t ~ Exp(2) and θ_t | τ, λ_t ~ N(0, τ^2 λ_t).

11 Data augmentation. The natural set of latent variables is given by the augmentation variable λ_{n+1} and the conditional sufficient statistics, leading to Z_n = (λ_{n+1}, a_n, b_n). The λ_{n+1} are i.i.d. and so can be propagated directly from p(λ_{n+1}). The conditional sufficient statistics (a_{n+1}, b_{n+1}) are deterministic functions of the parameters (θ_{n+1}, λ_{n+1}) and the previous values (a_n, b_n).

12 PL algorithm.
1. After n observations: {(Z_n, τ)^(i)}_{i=1}^N.
2. Draw λ^(i)_{n+1} ~ Exp(2).
3. Resample the old particles, together with the λ^(i)_{n+1}, with weights w^(i)_{n+1} ∝ p_N(y_{n+1}; 0, 1 + τ^{2(i)} λ^(i)_{n+1}).
4. Sample θ^(i)_{n+1} ~ N(m^(i)_n, C^(i)_n), where (C^(i)_n)^{-1} = 1 + (τ^{2(i)} λ^(i)_{n+1})^{-1} and m^(i)_n = C^(i)_n y_{n+1}.
5. Sufficient statistics: a^(i)_{n+1} = a^(i)_n + 1/2 and b^(i)_{n+1} = b^(i)_n + θ^{2(i)}_{n+1}/(2 λ^(i)_{n+1}).
6. Sample (offline) τ^{2(i)} ~ IG(a^(i)_{n+1}, b^(i)_{n+1}).
7. Let Z^(i)_{n+1} = (λ^(i)_{n+1}, a^(i)_{n+1}, b^(i)_{n+1}).
8. After n + 1 observations: {(Z_{n+1}, τ)^(i)}_{i=1}^N.
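A sketch of one such update is below. Each particle is stored as a dict with fields lam, a, b and tau2; Exp(2) is read here as the mean-2 exponential, the choice under which the scale mixture over N(0, τ²λ) recovers the double exponential prior, which is an interpretive assumption of the sketch.

```python
# A sketch of one PL update for the sequential Bayesian Lasso (steps 2-7 above).
import numpy as np

def lasso_pl_step(particles, y_new, rng=np.random.default_rng()):
    """One resample-propagate update; each particle is a dict (lam, a, b, tau2)."""
    N = len(particles)
    lam_new = rng.exponential(scale=2.0, size=N)      # lambda_{n+1}: mean-2 exponential (assumption)
    tau2 = np.array([p["tau2"] for p in particles])
    var = 1.0 + tau2 * lam_new                        # predictive variance of y_{n+1}
    w = np.exp(-0.5 * y_new ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    idx = rng.choice(N, size=N, p=w / w.sum())        # resample with predictive weights
    out = []
    for i in idx:
        p, lam = particles[i], lam_new[i]
        C = 1.0 / (1.0 + 1.0 / (p["tau2"] * lam))     # posterior variance of theta_{n+1}
        theta = rng.normal(C * y_new, np.sqrt(C))     # theta_{n+1} ~ N(C y_{n+1}, C)
        a = p["a"] + 0.5
        b = p["b"] + theta ** 2 / (2.0 * lam)
        tau2_new = 1.0 / rng.gamma(a, 1.0 / b)        # replenish tau^2 ~ IG(a_{n+1}, b_{n+1})
        out.append({"lam": lam, "a": a, "b": b, "tau2": tau2_new})
    return out
```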

13 Sequential Bayes factor. As the Lasso is a model for sparsity, we would expect the evidence for it to increase when we observe values of y_t close to zero. We can sequentially estimate p(y_{n+1} | y^n, lasso) via p(y_{n+1} | y^n, lasso) ≈ (1/N) Σ_{i=1}^N p(y_{n+1} | (λ_{n+1}, τ)^(i)), with predictive p(y_{n+1} | λ_{n+1}, τ) ~ N(0, τ^2 λ_{n+1} + 1). This leads to the sequential Bayes factor BF_{n+1} = p(y^{n+1} | lasso) / p(y^{n+1} | normal).

14 Simulated data. Data based on θ = (0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1), with priors τ^2 ~ IG(2, 1) for the double exponential case and τ^2 ~ IG(2, 3) for the normal case, reflecting the ratio of variances between those two distributions. Figure: sequential Bayes factor against the number of observations.

15 PL for finite mixture models.
1. Resample: generate an index ζ ~ MN(ω, N), where ω^(i) = p(y_{t+1} | (s_t, n_t)^(i)) / Σ_{i=1}^N p(y_{t+1} | (s_t, n_t)^(i)).
2. Propagate: k_{t+1} ~ p(k_{t+1} | (s_t, n_t)^{ζ(i)}, y_{t+1}); s_{t+1} = S(s_t^{ζ(i)}, k_{t+1}, y_{t+1}); n_{t+1,k_{t+1}} = n^{ζ(i)}_{t,k_{t+1}} + 1 and n_{t+1,j} = n^{ζ(i)}_{t,j} for j ≠ k_{t+1}.
3. Learn: p(p, θ | y^t) ≈ (1/N) Σ_{i=1}^N p(p, θ | (s_t, n_t)^(i)).

16 Finite mixture of Poisson. Model: an m-component mixture of Poisson densities, p(y_t) = Σ_{j=1}^m p_j Po(y_t; θ_j). Prior: π(θ_j) = Ga(α_j, β_j) for j = 1, ..., m, and π(p) = Dir(γ). The conditional posterior given y^t and the latent allocations k^t is completely defined by n_t, the number of observations in each component, and the sufficient statistics s_t = (s_{t,1}, ..., s_{t,m}), where s_{t,j} = Σ_{r=1}^t y_r 1[k_r = j].

17 Resample-propagate.
Resample: p(y_{t+1} | s_t, n_t) = Σ_{j=1}^m ∫∫ p_j p(y_{t+1} | θ_j) p(θ, p | s_t, n_t) d(θ, p) = Σ_{j=1}^m [Γ(s_{t,j} + y_{t+1} + α_j) / Γ(s_{t,j} + α_j)] [(β_j + n_{t,j})^{s_{t,j}+α_j} / (β_j + n_{t,j} + 1)^{s_{t,j}+y_{t+1}+α_j}] (1/y_{t+1}!) (γ_j + n_{t,j}) / Σ_{i=1}^m (γ_i + n_{t,i}).
Propagate: p(k_{t+1} = j | s_t, n_t, y_{t+1}) ∝ [Γ(s_{t,j} + y_{t+1} + α_j) / Γ(s_{t,j} + α_j)] [(β_j + n_{t,j})^{s_{t,j}+α_j} / (β_j + n_{t,j} + 1)^{s_{t,j}+y_{t+1}+α_j}] (γ_j + n_{t,j}) / Σ_{i=1}^m (γ_i + n_{t,i}).
Given k_{t+1}: s_{t+1,j} = s_{t,j} + y_{t+1} 1{k_{t+1} = j} and n_{t+1,j} = n_{t,j} + 1{k_{t+1} = j}, for j = 1, ..., m.
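The sketch below implements this resample-propagate step for the Poisson mixture, evaluating each term of the predictive on the log scale (a negative-binomial kernel times the prior-mean weight). The particle storage as (s, n) arrays and the helper names are assumptions of the sketch.

```python
# A sketch of the resample-propagate step for the finite Poisson mixture above.
import numpy as np
from math import lgamma, log

def log_pred_term(y, s_j, n_j, alpha_j, beta_j, gamma_j, gamma_plus_n):
    """log of the j-th term of p(y_{t+1} | s_t, n_t)."""
    return (lgamma(s_j + y + alpha_j) - lgamma(s_j + alpha_j) - lgamma(y + 1)
            + (s_j + alpha_j) * log(beta_j + n_j)
            - (s_j + y + alpha_j) * log(beta_j + n_j + 1.0)
            + log(gamma_j + n_j) - log(gamma_plus_n))

def pl_step_poisson(particles, y, alpha, beta, gamma, rng=np.random.default_rng()):
    m = len(alpha)
    def log_terms(s, n):
        tot = np.sum(gamma) + np.sum(n)
        return [log_pred_term(y, s[j], n[j], alpha[j], beta[j], gamma[j], tot) for j in range(m)]
    # Resample: weights are the particle predictives p(y_{t+1} | (s_t, n_t)^(i))
    logw = np.array([np.logaddexp.reduce(log_terms(s, n)) for s, n in particles])
    w = np.exp(logw - logw.max())
    idx = rng.choice(len(particles), size=len(particles), p=w / w.sum())
    # Propagate: draw k_{t+1} and update (s, n) deterministically
    new = []
    for i in idx:
        s, n = particles[i]
        lp = np.array(log_terms(s, n))
        p = np.exp(lp - lp.max()); p /= p.sum()
        k = rng.choice(m, p=p)
        s2, n2 = s.copy(), n.copy()
        s2[k] += y; n2[k] += 1
        new.append((s2, n2))
    return new
```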

18 PL and MCMC. Left: data from an m = 4 mixture of Poissons. Center: PL (red), MCMC (blue), true (green). Right: Monte Carlo study of the Bayes factor BF(m = 3; m = 4). Figure 1: Poisson mixture example. Data, densities, and Bayes factors.

19 Dirichlet Process (DP) mixtures. The DP mixture is the most commonly used nonparametric prior for random mixture models. Constructive definition: a random distribution G generated from DP(α, G_0(ψ)) is almost surely of the form dG(·) = Σ_{l=1}^∞ p_l δ_{ϑ_l}(·), (1) with ϑ_l iid ~ G_0(ϑ_l; ψ), p_l = (1 − Σ_{j=1}^{l−1} p_j) v_l and v_l iid ~ Beta(1, α), for l = 1, 2, ..., centering distribution G_0(ϑ; ψ), and independent sequences {ϑ_l}_{l=1}^∞ and {v_l}_{l=1}^∞. The discreteness of DP realizations is explicit in this definition.

20 PL for DP mixture models.
1. Resample: generate an index ζ ~ MN(ω, N), where ω^(i) = p(y_{t+1} | (s_t, n_t, m_t)^(i)) / Σ_{i=1}^N p(y_{t+1} | (s_t, n_t, m_t)^(i)).
2. Propagate: k_{t+1} ~ p(k_{t+1} | (s_t, n_t, m_t)^{ζ(i)}, y_{t+1}). For j ≠ k_{t+1}, n_{t+1,j} = n_{t,j}. If k_{t+1} ≤ m_t, then n_{t+1,k_{t+1}} = n_{t,k_{t+1}} + 1 and m_{t+1} = m_t. Otherwise, m_{t+1} = m_t + 1 and n_{t+1,m_{t+1}} = 1.
3. Estimation: E[f(y; G) | y^t] ≈ (1/N) Σ_{i=1}^N p(y | (s_t, n_t, m_t)^(i)).

21 DP mixture of multivariate normals. The d-dimensional DP multivariate normal mixture (DP-MVN) model has density function f(y_t; G) = ∫ N(y_t | µ_t, Σ_t) dG(µ_t, Σ_t), with G ~ DP(α, G_0(µ, Σ)) and conjugate centering distribution G_0 = N(µ; λ, Σ/κ) W(Σ^{-1}; ν, Ω), where W(Σ^{-1}; ν, Ω) denotes a Wishart distribution such that E[Σ^{-1}] = ν Ω^{-1} and E[Σ] = (ν − (d + 1)/2)^{-1} Ω.

22 Conditional sufficient statistics. For each unique mixture component, the sufficient statistics s_{t,j} are ȳ_{t,j} = Σ_{r: k_r = j} y_r / n_{t,j} and S_{t,j} = Σ_{r: k_r = j} (y_r − ȳ_{t,j})(y_r − ȳ_{t,j})' = Σ_{r: k_r = j} y_r y_r' − n_{t,j} ȳ_{t,j} ȳ_{t,j}'. The initial sufficient statistics are n_1 = 1 and s_1 = {y_1, 0}, so that the algorithm is populated with N identical particles. Conditional on existing particles {(n_t, s_t)^(i)}_{i=1}^N, uncertainty is updated through the familiar resample/propagate approach.

23 Resample. The predictive probability function for resampling is p(y_{t+1} | s_t, n_t, m_t) = [α/(α + t)] St(y_{t+1}; a_0, B_0, c_0) + Σ_{j=1}^{m_t} [n_{t,j}/(α + t)] St(y_{t+1}; a_{t,j}, B_{t,j}, c_{t,j}), where the Student's t distributions are parametrized by a_0 = λ, B_0 = [2(κ + 1)/(κ c_0)] Ω, c_0 = 2ν − d + 1, a_{t,j} = (κλ + n_{t,j} ȳ_{t,j})/(κ + n_{t,j}), B_{t,j} = [2(κ + n_{t,j} + 1)/((κ + n_{t,j}) c_{t,j})] (Ω + D_{t,j}/2), c_{t,j} = 2ν + n_{t,j} − d + 1, and D_{t,j} = S_{t,j} + [κ n_{t,j}/(κ + n_{t,j})] (λ − ȳ_{t,j})(λ − ȳ_{t,j})'.

24 Propagate. Sample the component state k_{t+1} such that p(k_{t+1} = j) ∝ [n_{t,j}/(α + t)] St(y_{t+1}; a_{t,j}, B_{t,j}, c_{t,j}) for j = 1, ..., m_t, and p(k_{t+1} = m_t + 1) ∝ [α/(α + t)] St(y_{t+1}; a_0, B_0, c_0). If k_{t+1} = m_t + 1, the new sufficient statistics are defined by m_{t+1} = m_t + 1 and s_{t+1,m_{t+1}} = [y_{t+1}, 0]. If k_{t+1} = j ≤ m_t, then n_{t+1,j} = n_{t,j} + 1 and we update s_{t+1,j} such that ȳ_{t+1,j} = (n_{t,j} ȳ_{t,j} + y_{t+1})/n_{t+1,j} and S_{t+1,j} = S_{t,j} + y_{t+1} y_{t+1}' + n_{t,j} ȳ_{t,j} ȳ_{t,j}' − n_{t+1,j} ȳ_{t+1,j} ȳ_{t+1,j}'.
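A sketch of this propagate step for a single particle is given below. The particle layout (a list of per-component dicts holding n, ȳ and S) is an assumption of the sketch, and mvt_logpdf is a standard multivariate Student-t log density assumed to match the location/scale/degrees-of-freedom parametrization of the slide.

```python
# A sketch of the DP-MVN propagate step: draw k_{t+1} and update sufficient statistics.
import numpy as np
from numpy.linalg import slogdet, solve
from math import lgamma, log, pi

def mvt_logpdf(y, a, B, c):
    """Multivariate Student-t log density with location a, scale matrix B, df c."""
    d = len(y)
    r = y - a
    q = r @ solve(B, r)
    return (lgamma((c + d) / 2) - lgamma(c / 2) - 0.5 * d * log(c * pi)
            - 0.5 * slogdet(B)[1] - 0.5 * (c + d) * log(1.0 + q / c))

def propagate_dpmvn(y, particle, t, alpha, lam, kappa, nu, Omega,
                    rng=np.random.default_rng()):
    d = len(y)
    comps = particle["components"]          # list of dicts with keys n, ybar, S
    logw = []
    for cmp in comps:                       # existing components: n_{t,j}/(alpha+t) * St(...)
        n, ybar, S = cmp["n"], cmp["ybar"], cmp["S"]
        a = (kappa * lam + n * ybar) / (kappa + n)
        c = 2 * nu + n - d + 1
        D = S + (kappa * n / (kappa + n)) * np.outer(lam - ybar, lam - ybar)
        B = 2 * (kappa + n + 1) / ((kappa + n) * c) * (Omega + 0.5 * D)
        logw.append(log(n / (alpha + t)) + mvt_logpdf(y, a, B, c))
    c0 = 2 * nu - d + 1                     # new component: alpha/(alpha+t) * St(y; a0, B0, c0)
    B0 = 2 * (kappa + 1) / (kappa * c0) * Omega
    logw.append(log(alpha / (alpha + t)) + mvt_logpdf(y, lam, B0, c0))
    p = np.exp(np.array(logw) - max(logw)); p /= p.sum()
    k = rng.choice(len(p), p=p)
    if k == len(comps):                     # open a new component
        comps.append({"n": 1, "ybar": y.copy(), "S": np.zeros((d, d))})
    else:                                   # update component k's sufficient statistics
        cmp = comps[k]
        n_new = cmp["n"] + 1
        ybar_new = (cmp["n"] * cmp["ybar"] + y) / n_new
        cmp["S"] = (cmp["S"] + np.outer(y, y) + cmp["n"] * np.outer(cmp["ybar"], cmp["ybar"])
                    - n_new * np.outer(ybar_new, ybar_new))
        cmp["n"], cmp["ybar"] = n_new, ybar_new
    return particle
```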

25 Parameter update. Assuming a W(γ_Ω, Ψ_Ω^{-1}) prior for Ω and a N(γ_λ, Ψ_λ) prior for λ, the sample at time t is augmented with draws for the auxiliary variables {µ_j, Σ_j}, for j = 1, ..., m_t, from their posterior full conditionals, p(µ_j, Σ_j | s_t, n_t) = N(µ_j; a_{t,j}, Σ_j/(κ + n_{t,j})) W(Σ_j^{-1}; ν + n_{t,j}, Ω + D_{t,j}). The parameter updates are then λ ~ N(R(Ψ_λ^{-1} γ_λ + κ Σ_{j=1}^{m_t} Σ_j^{-1} µ_j), R), with R^{-1} = Ψ_λ^{-1} + κ Σ_{j=1}^{m_t} Σ_j^{-1}, and Ω ~ W(γ_Ω + m_t ν, R_Ω^{-1}), where R_Ω = Σ_{j=1}^{m_t} Σ_j^{-1} + Ψ_Ω^{-1}. Similarly, if α is assigned the usual gamma hyperprior, it can be updated for each particle using the auxiliary variable method from Escobar and West (1995).

26 Illustration Four datasets: d = 2, 5, 10 and d = 25. Sample size: t = 1,..., T = 500d. The d-dimensional y t was generated from a N(µ t, AR(0.9)) density, where AR(0.9) denotes the correlation matrix implied by an autoregressive process of lag one and correlation 0.9. µ t ind G µ, where G µ is the realization of a DP(4, N(0, 4I)) process. Thus the simulated data is clustered around a set of distinct means, and highly correlated within each cluster.

27 Dimension d = 2. Data and density estimates for a PL fit with 1000 particles (left) and each of ten PL fits with 500 particles (right), under a random ordering of the 1000 observations.

28 PL and Gibbs sampler. Left: average log posterior predictive score for validation sets of 100 observations. Right: posterior averages for m_T, the total number of allocated mixture components. Boxplots show the distribution over ten repetitions of the algorithm; red boxplots correspond to PL, and blue to Gibbs.

29 Dimension d = 25 Data and marginal density estimates. Curves are posterior means of ten PL fits, 500 particles, random ordering of the data

30 Complete class of mixture models. Mixture models: likelihood p(y_{t+1} | k_{t+1}, θ); transition equation p(k_{t+1} | k^t, θ); parameter prior p(θ). Here the states k_t refer to a latent allocation of observations to mixture components, and k^t = {k_1, ..., k_t}. State-space representation: y_{t+1} = f(k_{t+1}, θ) (2) and k_{t+1} = g(k^t, θ) (3), where (2) is the observation equation and (3) is the evolution for the states k_{t+1}.

31 General class of hidden Markov models. This structure establishes a direct link to the general class of hidden Markov models, which encompasses a vast number of widely used models. Density estimation: y_t ~ ∫ k(y_t; θ) dG(θ); the allocation states break the mixture such that y_t ~ k(y; θ_{k_t}) and θ_{k_t} ~ G(θ). Latent feature models: a multivariate k_t allows allocation of a single observation to multiple mixture components.

32 Essential state vector. In order to describe the PL algorithm, we begin by defining Z_t as an essential state vector that will be tracked in time. We also assume that Z_t is sufficient for sequential inference, that is, it allows for the computation of: (a) the posterior predictive p(y_{t+1} | Z_t), (b) the posterior updating rule p(Z_{t+1} | Z_t, y_{t+1}), and (c) parameter learning via p(θ | Z_{t+1}).

33 Particle learning (PL). The posterior density p(Z_t | y^t) is approximated by the equally weighted particle set {Z_t^(i)}_{i=1}^N = {Z_t^(1), ..., Z_t^(N)}. Then, given {Z_t^(i)}_{i=1}^N, the generic particle learning update for a new observation y_{t+1} proceeds in two steps. Step 1 (resample): resample Z_t^(i) with weights proportional to p(y_{t+1} | Z_t^(i)). Step 2 (propagate): Z_{t+1}^(i) ~ p(Z_{t+1} | Z_t^(i), y_{t+1}).

34 Bayes theorem. This process can be understood by re-writing Bayes' theorem as p(Z_t | y^{t+1}) ∝ p(y_{t+1} | Z_t) p(Z_t | y^t) (4) and p(Z_{t+1} | y^{t+1}) = ∫ p(Z_{t+1} | Z_t, y_{t+1}) dP(Z_t | y^{t+1}), (5) where P(·) refers to the appropriate measure. After resampling the initial particles with weights proportional to p(y_{t+1} | Z_t), we have particles from p(Z_t | y^{t+1}). These particles are then propagated through p(Z_{t+1} | Z_t, y_{t+1}), leading to particles {Z_{t+1}^(i)}_{i=1}^N approximating p(Z_{t+1} | y^{t+1}).

35 PL algorithm for general mixture models. For t = 1, ..., T:
Resample: draw indexes ζ(1), ..., ζ(N) ~ Multinomial(ω_t, N), with unnormalized weights ω_t = {p(y_{t+1} | Z_t^(1)), ..., p(y_{t+1} | Z_t^(N))}.
Propagate: Z_{t+1}^(j) ~ p(Z_{t+1} | Z_t^{ζ(j)}, y_{t+1}), for j = 1, ..., N.
Learn θ: θ^(j) ~ p(θ | Z_{t+1}^(j)), for j = 1, ..., N.

36 Nature of the essential state vector Z t In general, Z t does not need to include k t in order to be sufficient for the distributions listed in (a) to (c). This, along with moving the propagation step to the end, is what makes PL distinct from the present state of the art in particle filtering for mixtures. However, it is straightforward to obtain smoothed samples of k t from the full posterior, through an adaptation of the particle smoothing algorithm of Godsill, Doucet and West (2004).

37 Allocation. PL provides a vehicle for drawing from the full posterior distribution of the allocation vector, p(k^t | y^t), through the backwards uncertainty update equation p(k^t | y^t) = ∫ p(k^t | Z_t, y^t) dP(Z_t | y^t) = ∫ Π_{r=1}^t p(k_r | Z_t, y_r) dP(Z_t | y^t). We can directly approximate p(k^t | y^t) by sampling, for each particle Z_t^(i) and for r = t, ..., 1, k_r with probability p(k_r = j | Z_t, y_r) ∝ p(y_r | k_r = j, Z_t) p(k_r = j | Z_t), leading to an O(N) algorithm for full posterior allocation.
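A minimal sketch of this backward allocation draw, conditional on a single particle, is shown below; lik and prior are hypothetical callables returning p(y_r | k_r = j, Z_t) and p(k_r = j | Z_t).

```python
# A sketch of the allocation draw for one particle Z_t, run for r = t, ..., 1.
import numpy as np

def sample_allocations(ys, particle_Z, lik, prior, m, rng=np.random.default_rng()):
    """Draw k_r for each observation y_r conditional on the particle Z_t."""
    ks = []
    for y in reversed(ys):
        w = np.array([lik(y, j, particle_Z) * prior(j, particle_Z) for j in range(m)])
        ks.append(rng.choice(m, p=w / w.sum()))
    return list(reversed(ks))
```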

38 Marginal likelihood. Marginal likelihoods are of key importance in Bayesian model assessment, particularly when computing Bayes factors. MCMC: it is well known that marginal likelihoods are hard to compute via MCMC schemes, mainly because most approximations are based on one of the following identities (or extensions thereof): p(y^n) = ∫ p(y^n | θ) p(θ) dθ = p(y^n | θ) p(θ) / p(θ | y^n). SMC: particle learning, on the other hand, can directly approximate the product rule of probabilities, i.e. p^N(y^n) = Π_{t=1}^n p^N(y_t | y^{t-1}), with p^N(y_t | y^{t-1}) = (1/N) Σ_{i=1}^N p(y_t | Z_{t-1}^(i)). (6)
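In code, the estimate in (6) amounts to accumulating the log of the particle-averaged one-step-ahead predictive inside the filtering loop, as in the sketch below; predictive and pl_step are hypothetical model-specific callables in the spirit of the earlier sketches.

```python
# A sketch of the sequential (log) marginal likelihood estimate in (6).
import numpy as np

def pl_with_log_marginal(ys, particles, predictive, pl_step):
    log_ml = 0.0
    for y in ys:
        pred = np.mean([predictive(y, p) for p in particles])   # p_N(y_t | y^{t-1})
        log_ml += np.log(pred)
        particles = pl_step(particles, y)                        # resample-propagate update
    return particles, log_ml
```

The same loop run under two competing models gives the sequential Bayes factor as the difference of the accumulated log marginal likelihoods, as in the Lasso-versus-normal comparison above.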

39 Density Estimation. We consider density functions of the form f(y; G) = ∫ k(y; θ) dG(θ). There are many possibilities for the prior on G: finite mixture models (finite dimensional models); the Dirichlet process and its stick-breaking representation (Ferguson, 1973); beta two-parameter processes (Ishwaran and Zarepour, 2000); kernel stick-breaking processes (Dunson and Park, 2008). See Walker, Damien, Laud and Smith (1999) or Müller and Quintana (2004) for more complete overviews of the major modeling frameworks.

40 Collapsed state-space model. An informal formulation of the collapsed state-space model is E[f(y_{t+1}; G) | Z_t] = ∫ k(y_{t+1}; θ) dE[G(θ) | Z_t] (7) and E[dG(θ) | Z_t] = ∫ dG(θ) dP(dG(θ) | Z_t). (8) With t observations allocated to m_t mixture components, E[dG(θ) | Z_t] = p_0 dG_0(θ) + Σ_{j=1}^{m_t} p_j E[δ_{θ*_j} | Z_t] and p(y_{t+1} | Z_t) = p_0 ∫ k(y_{t+1}; θ) dG_0(θ) + Σ_{j=1}^{m_t} p_j ∫ k(y_{t+1}; θ*_j) dP(θ*_j | Z_t), with θ*_j the parameters for each of the m_t components.

41 Inference about the random mixing distribution All of the inference so far is based on the marginal posterior predictive, thus avoiding direct simulation of the infinite dimensional random mixing distribution. In some situations, however, it is necessary to obtain inference about the actual posterior for the random density f (y; G), and hence about G itself, rather than about E[f (y; G)]. For example, functionals of the conditional density f (x, y; G)/f (x; G) are the objects of inference in implied conditional regression (e.g., Taddy and Kottas, 2009).

42 Truncation. The standard approach to sampling G is to apply a truncated version of the constructive definition to draw from DP(α + t, G_0^t(θ; n_t, θ^t)), the conjugate posterior for G given θ^t and n_t (Gelfand and Kottas, 2002). The approximate posterior draw G_L = {p_l, ϑ_l}_{l=1}^L is built from i.i.d. point-mass locations ϑ_l ~ G_0^t(ϑ_l; n_t, θ^t) and the probability vector p = (p_1, ..., p_L) from the finite stick-breaking process p_l = v_l (1 − Σ_{j=1}^{l−1} p_j) for l = 1, ..., L, with v_l ~ Beta(1, α + t) and v_L = 1.
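A sketch of this truncated draw is below; draw_theta is a hypothetical callable that samples one atom from the conjugate posterior centering distribution G_0^t.

```python
# A sketch of one truncated posterior draw G_L = {p_l, theta_l}_{l=1}^L.
import numpy as np

def draw_GL(L, alpha_plus_t, draw_theta, rng=np.random.default_rng()):
    v = rng.beta(1.0, alpha_plus_t, size=L)
    v[-1] = 1.0                                     # v_L = 1 closes the stick
    p = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    theta = [draw_theta(rng) for _ in range(L)]     # i.i.d. atoms from G_0^t
    return p, theta
```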

43 Illustration. Data were simulated from y_t ~ N(x_t sin(2.7 x_t) + 1.1(1 + x_t^2)^{-1}, σ_t^2), where x_t ~ N(0, 1) and σ_t = 0.5 with probability Φ(x_t), σ_t = 0.25 with probability 1 − Φ(x_t). The joint distribution of x and y is modeled as arising from the DP-MVN with parameter learning and α = 2, ν = 3, κ = 0.1, γ_λ = 0, Ψ_λ = 1.5I, γ_Ω = 3, and Ψ_Ω = 0.1I.

44 Conditional densities. After applying PL with N = 1000 particles to filter the posterior, truncated approximations G_L with L = 300 were drawn (Taddy and Kottas, 2009). In particular, the conditional density is available at any location (x, y) as Σ_{l=1}^L p_l N(x, y; µ_l, Σ_l) / Σ_{l=1}^L p_l N(x; µ_l^x, σ_l^x), and the conditional mean at x is E[Y | x; G_L] = Σ_{l=1}^L p_l N(x; µ_l^x, σ_l^x) [µ_l^y + ρ_l^{xy} (σ_l^x)^{-1} (x − µ_l^x)] / Σ_{l=1}^L p_l N(x; µ_l^x, σ_l^x).
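For a bivariate G_L, the conditional mean above reduces to a weighted average of component-wise regression lines, as in the sketch below; p, mus and Sigmas hold the truncated mixture weights, mean vectors and covariance matrices, an assumed storage layout.

```python
# A sketch of E[Y | x; G_L] for a bivariate Gaussian mixture {(p_l, mu_l, Sigma_l)}.
import numpy as np

def conditional_mean(x, p, mus, Sigmas):
    num, den = 0.0, 0.0
    for pl, mu, S in zip(p, mus, Sigmas):
        var_x = S[0, 0]
        nx = np.exp(-0.5 * (x - mu[0]) ** 2 / var_x) / np.sqrt(2 * np.pi * var_x)
        cond_mean = mu[1] + S[0, 1] / var_x * (x - mu[0])   # mu_y + cov_xy/var_x * (x - mu_x)
        num += pl * nx * cond_mean
        den += pl * nx
    return num / den
```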

45 Regression. Left: filtered posterior mean estimate for the conditional density f(x, y; G_L)/f(x; G_L). Right: posterior mean and 90% interval for the conditional mean E[y | x; G_L].

46 Conclusion. We have proposed a new estimation method for general mixture models. The approach is easy to understand, simple to implement, and computationally fast. A vast body of empirical and theoretical evidence of the robust behavior of the resample/propagate PL procedure in state-space models appears in Carvalho, Johannes, Lopes and Polson (2009). Conditioning on sufficient statistics for states and parameters whenever possible creates a Rao-Blackwellized filter with more uniformly distributed resampling weights.

47 PL does not attempt to approximate the ever-increasing joint posterior distribution of the allocations k^t. It is self-evident that any importance sampling approximation to the entire vector of allocations will eventually fail, due to the curse of dimensionality, as t grows. But we show that this is an irrelevant target, since the allocation problem can be effectively solved after filtering the relevant sufficient information. Finally, we include an efficient framework for marginal likelihood estimation, providing a valuable tool for real-time sequential model selection.

48 The framework is especially appealing in the large class of nonparametric mixture priors where the predictive probability function is either available analytically or possible to approximate. To aid understanding, we have focused on a limited set of concrete models, while pointing to the more general applicability available with little change to the algorithm. It is thus hoped that this article will facilitate a wider adoption of sequential particle methods in nonparametric mixture model applications.
