The Polya-Gamma Gibbs Sampler for Bayesian Logistic Regression is Uniformly Ergodic


Hee Min Choi and James P. Hobert
Department of Statistics, University of Florida

August 2013

Abstract

One of the most widely used data augmentation algorithms is Albert and Chib's (1993) algorithm for Bayesian probit regression. Polson, Scott and Windle (2013) recently introduced an analogous algorithm for Bayesian logistic regression. The main difference between the two is that Albert and Chib's (1993) truncated normals are replaced by so-called Polya-Gamma random variables. In this note, we establish that the Markov chain underlying Polson et al.'s (2013) algorithm is uniformly ergodic. This theoretical result has important practical benefits. In particular, it guarantees the existence of central limit theorems that can be used to make an informed decision about how long the simulation should be run.

AMS 2000 subject classifications: Primary 60J27; secondary 62F15.
Abbreviated title: Gibbs sampler for Bayesian logistic regression.
Key words and phrases: Polya-Gamma distribution, data augmentation algorithm, minorization condition, Markov chain, Monte Carlo.

1 Introduction

Consider a binary regression set-up in which $Y_1, \dots, Y_n$ are independent Bernoulli random variables such that $\Pr(Y_i = 1 \mid \beta) = F(x_i^\top \beta)$, where $x_i$ is a $p \times 1$ vector of known covariates associated with $Y_i$, $\beta$ is

a $p \times 1$ vector of unknown regression coefficients, and $F : \mathbb{R} \to (0,1)$ is a distribution function. Two important special cases are probit regression, where $F$ is the standard normal distribution function, and logistic regression, where $F$ is the standard logistic distribution function, that is, $F(t) = e^t/(1+e^t)$. In general, the joint mass function of $Y_1, \dots, Y_n$ is given by
\[
\prod_{i=1}^n \Pr(Y_i = y_i \mid \beta) = \prod_{i=1}^n \big[F(x_i^\top \beta)\big]^{y_i} \big[1 - F(x_i^\top \beta)\big]^{1-y_i} I_{\{0,1\}}(y_i).
\]
A Bayesian version of the model requires a prior distribution for the unknown regression parameter, $\beta$. If $\pi(\beta)$ is the prior density for $\beta$, then the posterior density of $\beta$ given the data, $y = (y_1, \dots, y_n)^\top$, is defined as
\[
\pi(\beta \mid y) = \frac{\pi(\beta)}{c(y)} \prod_{i=1}^n \big[F(x_i^\top \beta)\big]^{y_i} \big[1 - F(x_i^\top \beta)\big]^{1-y_i},
\]
where $c(y)$ is the normalizing constant, that is,
\[
c(y) := \int_{\mathbb{R}^p} \pi(\beta) \prod_{i=1}^n \big[F(x_i^\top \beta)\big]^{y_i} \big[1 - F(x_i^\top \beta)\big]^{1-y_i} \, d\beta.
\]
Regardless of the choice of $F$, the posterior density is intractable in the sense that expectations with respect to $\pi(\beta \mid y)$, which are required for Bayesian inference, cannot be computed in closed form. Moreover, classical Monte Carlo methods based on independent and identically distributed (iid) samples from $\pi(\beta \mid y)$ are problematic when the dimension, $p$, is large. These difficulties have spurred the development of many Markov chain Monte Carlo methods for exploring $\pi(\beta \mid y)$. One of the most popular is a simple data augmentation (DA) algorithm for the probit regression case that was developed by Albert and Chib (1993). Each iteration of this algorithm has two steps: the first entails the simulation of $n$ independent univariate truncated normals, and the second requires a draw from a $p$-variate normal distribution. A convergence rate analysis of the underlying Markov chain can be found in Roy and Hobert (2007). Since the publication of Albert and Chib (1993), the search has been on for an analogous algorithm for Bayesian logistic regression. Attempts to develop such an analogue by mimicking the missing data argument of Albert and Chib (1993) in the logistic case have not been entirely successful. In particular, the resulting algorithms are much more complicated than that of Albert and Chib (1993), and simplified versions
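For concreteness, the target posterior in the logistic case can be evaluated up to the normalizing constant $c(y)$. The following sketch computes the unnormalized log posterior under a $\mathrm{N}_p(b, B)$ prior; the function name and the toy data are my own, not from the paper.

```python
import numpy as np

def log_unnormalized_posterior(beta, X, y, b, B_inv):
    """log pi(beta) + log-likelihood (up to constants), for logistic
    F(t) = e^t/(1+e^t) and a N_p(b, B) prior with precision B_inv."""
    t = X @ beta                                  # t_i = x_i' beta
    # log { F(t)^y [1-F(t)]^(1-y) } = y*t - log(1 + e^t), computed stably
    loglik = np.sum(y * t - np.logaddexp(0.0, t))
    logprior = -0.5 * (beta - b) @ B_inv @ (beta - b)
    return loglik + logprior

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
y = (rng.random(20) < 0.5).astype(float)
b, B_inv = np.zeros(3), np.eye(3)
val = log_unnormalized_posterior(np.zeros(3), X, y, b, B_inv)
# at beta = 0 every Bernoulli term is log(1/2) and the prior term is 0
print(val)  # -20*log(2), about -13.863
```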

of them are inexact (see, e.g., Frühwirth-Schnatter and Frühwirth, 2010; Holmes and Held, 2006; Marchev, 2011). Using a different approach, Polson et al. (2013) (hereafter PS&W) have developed a real analogue of Albert and Chib's (1993) algorithm for Bayesian logistic regression. The main difference between the two algorithms is that the truncated normals in Albert and Chib's (1993) algorithm are replaced by Polya-Gamma random variables in PS&W's algorithm. We now describe the new algorithm.

In the remainder of this paper, we restrict attention to the logistic regression model with a proper $\mathrm{N}_p(b, B)$ prior on $\beta$. As usual, let $X$ denote the $n \times p$ matrix whose $i$th row is $x_i^\top$, and let $\mathbb{R}_+ = (0, \infty)$. For fixed $w \in \mathbb{R}_+^n$, define
\[
\Sigma(w) = \big(X^\top \Omega(w) X + B^{-1}\big)^{-1} \quad \text{and} \quad m(w) = \Sigma(w)\Big(X^\top \big(y - \tfrac{1}{2} 1_n\big) + B^{-1} b\Big),
\]
where $\Omega(w)$ is the $n \times n$ diagonal matrix whose $i$th diagonal element is $w_i$, and $1_n$ is an $n \times 1$ vector of 1s. Finally, let $\mathrm{PG}(1, c)$ denote the Polya-Gamma distribution with parameters 1 and $c$, which is carefully defined in Section 2. The dynamics of PS&W's Markov chain are defined (implicitly) through the following two-step procedure for moving from the current state, $\beta^{(m)} = \beta'$, to $\beta^{(m+1)}$.

Iteration $m + 1$ of PS&W's DA algorithm:

1. Draw $W_1, \dots, W_n$ independently with $W_i \sim \mathrm{PG}(1, x_i^\top \beta')$, and call the observed value $w = (w_1, \dots, w_n)^\top$.
2. Draw $\beta^{(m+1)} \sim \mathrm{N}_p\big(m(w), \Sigma(w)\big)$.

Highly efficient methods of simulating Polya-Gamma random variables are provided by PS&W. In this paper, we prove that PS&W's Markov chain is uniformly ergodic. This theoretical convergence rate result is extremely important when it comes to using PS&W's algorithm in practice to estimate intractable posterior expectations. Indeed, uniform ergodicity guarantees the existence of central limit theorems for averages such as $m^{-1} \sum_{i=0}^{m-1} g\big(\beta^{(i)}\big)$, where $g : \mathbb{R}^p \to \mathbb{R}$ is square integrable with respect to
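The two-step procedure above can be sketched directly in code. This is a minimal illustration, not PS&W's efficient sampler: `sample_pg1` approximates a $\mathrm{PG}(1, c)$ draw by truncating the weighted infinite sum of exponentials that defines the distribution, and all function names and the toy data are my own.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_pg1(c, K=200):
    """Approximate PG(1, c) draw via a truncated weighted sum of iid
    Exp(1) variables: (1/(2*pi^2)) * sum_k g_k / ((k-1/2)^2 + c^2/(4*pi^2))."""
    k = np.arange(1, K + 1)
    g = rng.exponential(size=K)
    return np.sum(g / ((k - 0.5) ** 2 + (c / (2 * np.pi)) ** 2)) / (2 * np.pi ** 2)

def da_step(beta, X, y, b, B_inv):
    """One iteration of the two-step DA update (sketch)."""
    # Step 1: w_i ~ PG(1, x_i' beta'), independently for i = 1..n.
    w = np.array([sample_pg1(c) for c in X @ beta])
    # Step 2: beta ~ N_p(m(w), Sigma(w)).
    Sigma = np.linalg.inv(X.T @ (w[:, None] * X) + B_inv)
    m = Sigma @ (X.T @ (y - 0.5) + B_inv @ b)
    return rng.multivariate_normal(m, Sigma)

# toy run of the chain
n, p = 30, 2
X = rng.standard_normal((n, p))
y = (rng.random(n) < 0.5).astype(float)
b, B_inv = np.zeros(p), np.eye(p)
beta = np.zeros(p)
for _ in range(50):
    beta = da_step(beta, X, y, b, B_inv)
print(beta.shape)  # (2,)
```

In practice one would use PS&W's exact rejection sampler for the Polya-Gamma draws; the truncated series is used here only to keep the sketch self-contained.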

the target posterior, $\pi(\beta \mid y)$ (see, e.g., Roberts and Rosenthal, 2004; Tierney, 1994). It also allows for the computation of a consistent estimator of the associated asymptotic variance, which can be used to make an informed decision about how long the simulation should be run (see, e.g., Flegal, Haran and Jones, 2008). We also establish the existence of the moment generating function (mgf) of the posterior distribution. It follows that, if $g(\beta_1, \dots, \beta_p) = \beta_i^a$ for some $i \in \{1, 2, \dots, p\}$ and $a > 0$, then $g$ is square integrable with respect to the posterior.

The remainder of this paper is organized as follows. Section 2 contains a brief, but careful development of PS&W's DA algorithm. Section 3 contains the proof that the Markov chain underlying this algorithm is uniformly ergodic, as well as the proof that the posterior density, $\pi(\beta \mid y)$, has an mgf.

2 Polson, Scott and Windle's Algorithm

PS&W's DA algorithm is based on a latent data representation of the posterior distribution. To describe it, we need to introduce what PS&W call the Polya-Gamma distribution. Let $\{E_k\}_{k=1}^\infty$ be a sequence of iid $\mathrm{Exp}(1)$ random variables and define
\[
W = \frac{1}{2\pi^2} \sum_{k=1}^\infty \frac{E_k}{(k - 1/2)^2}.
\]
It is well known (see, e.g., Biane, Pitman and Yor, 2001) that the random variable $W$ has density
\[
g(w) = \sum_{k=0}^\infty (-1)^k \frac{2k+1}{\sqrt{2\pi w^3}} \, e^{-\frac{(2k+1)^2}{8w}} \, I_{(0,\infty)}(w), \qquad (1)
\]
and that its Laplace transform is given by $\mathrm{E}\big[e^{-tW}\big] = \cosh^{-1}\big(\sqrt{t/2}\big)$. (Recall that $\cosh(z) = (e^z + e^{-z})/2$.) PS&W create the Polya-Gamma family of densities through an exponential tilting of the density $g$. Indeed, consider a parametric family of densities, indexed by $c \ge 0$, that takes the form
\[
f(x; c) = \cosh(c/2) \, e^{-\frac{c^2 x}{2}} \, g(x).
\]
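The infinite-sum definition of $W$ and its Laplace transform can be checked by simulation. The sketch below truncates the sum at $K$ terms (a choice of mine) and compares the Monte Carlo estimates against $\mathrm{E}[W] = 1/4$ and $\mathrm{E}[e^{-tW}] = \cosh^{-1}(\sqrt{t/2})$.

```python
import numpy as np

rng = np.random.default_rng(2)

# W = (1/(2*pi^2)) * sum_{k>=1} E_k / (k - 1/2)^2, truncated at K terms
K, N = 200, 20000
k = np.arange(1, K + 1)
E = rng.exponential(size=(N, K))
W = (E / (k - 0.5) ** 2).sum(axis=1) / (2 * np.pi ** 2)

# E[W] = (1/(2*pi^2)) * sum_k 1/(k-1/2)^2 = (1/(2*pi^2)) * (pi^2/2) = 1/4
print(W.mean())  # approximately 0.25

# Laplace transform: E[e^{-tW}] = 1/cosh(sqrt(t/2))
t = 3.0
print(np.exp(-t * W).mean(), 1 / np.cosh(np.sqrt(t / 2)))  # approximately equal
```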

Of course, when $c = 0$, we recover the original density. A random variable with density $f(x; c)$ is said to have a $\mathrm{PG}(1, c)$ distribution.

We now describe a latent data formulation that leads to PS&W's DA algorithm. Our development is different from, and we believe somewhat more transparent than, that given by PS&W. Conditional on $\beta$, let $\{(Y_i, W_i)\}_{i=1}^n$ be independent random pairs such that $Y_i$ and $W_i$ are also independent, with $Y_i \sim \mathrm{Bernoulli}\big(e^{x_i^\top \beta}/(1 + e^{x_i^\top \beta})\big)$ and $W_i \sim \mathrm{PG}(1, x_i^\top \beta)$. Let $W = (W_1, \dots, W_n)^\top$ and denote its density by $f(w \mid \beta)$. Combining this latent data model with the prior, $\pi(\beta)$, yields the augmented posterior density defined as
\[
\pi(\beta, w \mid y) = \Big[\prod_{i=1}^n \Pr(Y_i = y_i \mid \beta)\Big] \frac{f(w \mid \beta) \, \pi(\beta)}{c(y)}.
\]
Clearly, $\int_{\mathbb{R}_+^n} \pi(\beta, w \mid y) \, dw = \pi(\beta \mid y)$, which is our target posterior density. PS&W's DA algorithm alternates between draws from $\pi(\beta \mid w, y)$ and $\pi(w \mid \beta, y)$. The conditional independence of $Y_i$ and $W_i$ implies that $\pi(w \mid \beta, y) = f(w \mid \beta)$. Thus, we can draw from $\pi(w \mid \beta, y)$ by making $n$ independent draws from the Polya-Gamma distribution (as in the first step of the two-step procedure described in the Introduction). The other conditional density is multivariate normal. To see this, note that
\[
\begin{aligned}
\pi(\beta \mid w, y) &\propto \Big[\prod_{i=1}^n \Pr(Y_i = y_i \mid \beta)\Big] f(w \mid \beta) \, \pi(\beta) \\
&= \prod_{i=1}^n \Bigg[\frac{\big(e^{x_i^\top \beta}\big)^{y_i}}{1 + e^{x_i^\top \beta}}\Bigg] \pi(\beta) \prod_{i=1}^n \Big[\cosh\big(x_i^\top \beta/2\big) \, e^{-\frac{(x_i^\top \beta)^2 w_i}{2}} \, g(w_i)\Big] \\
&\propto \pi(\beta) \prod_{i=1}^n \Bigg[\frac{\big(e^{x_i^\top \beta}\big)^{y_i}}{1 + e^{x_i^\top \beta}} \cosh\big(x_i^\top \beta/2\big) \, e^{-\frac{(x_i^\top \beta)^2 w_i}{2}}\Bigg] \\
&\propto \pi(\beta) \prod_{i=1}^n \exp\Big\{ y_i \, x_i^\top \beta - \frac{x_i^\top \beta}{2} - \frac{(x_i^\top \beta)^2 w_i}{2} \Big\},
\end{aligned}
\]
where the last step follows from the fact that $\cosh(z/2) = (1 + e^z)/(2 e^{z/2})$. A routine Bayesian regression-type calculation then reveals that $\beta \mid w, y \sim \mathrm{N}_p\big(m(w), \Sigma(w)\big)$, where $m(w)$ and $\Sigma(w)$ are defined in the Introduction.
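The collapse of the Bernoulli and Polya-Gamma factors into a single Gaussian-type exponent rests on the identity $\cosh(z/2) = (1+e^z)/(2e^{z/2})$. The following brief check verifies the resulting per-observation identity numerically over random inputs (a sketch; the setup is my own).

```python
import numpy as np

rng = np.random.default_rng(3)
for _ in range(100):
    z = rng.normal()           # z plays the role of x_i' beta
    w = rng.exponential()      # w plays the role of w_i > 0
    y = rng.integers(0, 2)     # y in {0, 1}
    # Bernoulli term times the tilting factor of the PG density...
    lhs = (np.exp(z) ** y / (1 + np.exp(z))) * np.cosh(z / 2) * np.exp(-z**2 * w / 2)
    # ...equals (1/2) * exp{ y z - z/2 - z^2 w / 2 }
    rhs = 0.5 * np.exp(y * z - z / 2 - z**2 * w / 2)
    assert np.isclose(lhs, rhs)
print("identity verified")
```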

The Markov transition density (Mtd) of the DA Markov chain, $\Phi = \{\beta^{(m)}\}_{m=0}^\infty$, is
\[
k\big(\beta \mid \beta'\big) = \int_{\mathbb{R}_+^n} \pi(\beta \mid w, y) \, \pi(w \mid \beta', y) \, dw. \qquad (2)
\]
Of course, we do not have to perform the integration in (2), but we do need to be able to simulate random vectors with density $k(\cdot \mid \beta')$, and this is exactly what the two-step procedure described in the Introduction does. Note that $k : \mathbb{R}^p \times \mathbb{R}^p \to (0, \infty)$; that is, $k$ is strictly positive. It follows immediately that the Markov chain $\Phi$ is irreducible, aperiodic and Harris recurrent (see, e.g., Hobert, 2011).

3 Uniform Ergodicity

For $m \in \{1, 2, 3, \dots\}$, let $k^m : \mathbb{R}^p \times \mathbb{R}^p \to (0, \infty)$ denote the $m$-step Mtd of PS&W's Markov chain, where $k^1 \equiv k$. These densities are defined inductively as follows:
\[
k^m\big(\beta \mid \beta'\big) = \int_{\mathbb{R}^p} k^{m-1}\big(\beta \mid \tilde{\beta}\big) \, k\big(\tilde{\beta} \mid \beta'\big) \, d\tilde{\beta}.
\]
The conditional density of $\beta^{(m)}$ given $\beta^{(0)} = \beta'$ is precisely $k^m(\cdot \mid \beta')$. The Markov chain $\Phi$ is geometrically ergodic if there exist $M : \mathbb{R}^p \to [0, \infty)$ and $\rho \in [0, 1)$ such that, for all $m$,
\[
\int_{\mathbb{R}^p} \big| k^m\big(\beta \mid \beta'\big) - \pi(\beta \mid y) \big| \, d\beta \le M(\beta') \, \rho^m. \qquad (3)
\]
The quantity on the left-hand side of (3) is the total variation distance between the posterior distribution and the distribution of $\beta^{(m)}$ (given $\beta^{(0)} = \beta'$). If the function $M(\cdot)$ is bounded above, then the chain is uniformly ergodic. One method of establishing uniform ergodicity is to construct a minorization condition that holds on the entire state space. In particular, if we can find a $\delta > 0$ and a density function $h : \mathbb{R}^p \to [0, \infty)$ such that, for all $\beta, \beta' \in \mathbb{R}^p$,
\[
k\big(\beta \mid \beta'\big) \ge \delta \, h(\beta), \qquad (4)
\]
then the chain is uniformly ergodic. In fact, if (4) holds, then (3) holds with $M \equiv 1$ and $\rho = 1 - \delta$ (see, e.g., Jones and Hobert, 2001, Section 3.2).
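A practical consequence of (4): once a value of $\delta$ is in hand, the bound $(1-\delta)^m$ in (3) with $M \equiv 1$ gives the number of iterations needed to drive the total variation distance below any tolerance $\varepsilon$. A one-line sketch (function name mine):

```python
import math

def iterations_for_tv(delta, eps):
    """Smallest m with (1 - delta)^m <= eps, from the uniform bound
    ||K^m(beta', .) - pi||_TV <= (1 - delta)^m implied by (3)-(4)."""
    return math.ceil(math.log(eps) / math.log(1.0 - delta))

# e.g., delta = 0.01 and a tolerance of 0.01:
print(iterations_for_tv(0.01, 0.01))  # 459
```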

In order to state our main result, we need a bit more notation. Let $\phi(z; \mu, V)$ denote the multivariate normal density with mean $\mu$ and covariance matrix $V$ evaluated at the point $z$. Recall that the prior on $\beta$ is $\mathrm{N}_p(b, B)$, and define
\[
s = B^{1/2} \Big( X^\top \big(y - \tfrac{1}{2} 1_n\big) + B^{-1} b \Big).
\]
Here is our main result.

Proposition 1. The Mtd of $\Phi$ satisfies the following minorization condition:
\[
k\big(\beta \mid \beta'\big) \ge \delta \, \phi(\beta; m^*, \Sigma^*),
\]
where $m^* = \big(\tfrac{1}{2} X^\top X + B^{-1}\big)^{-1} B^{-1/2} s$, $\Sigma^* = \big(\tfrac{1}{2} X^\top X + B^{-1}\big)^{-1}$, and
\[
\delta = \frac{|\Sigma^*|^{1/2}}{|B|^{1/2}} \exp\Big\{ \frac{1}{2} m^{*\top} \Sigma^{*-1} m^* \Big\} \, e^{-\frac{n}{4}} \, 2^{-n} \exp\Big\{ -\frac{1}{2} s^\top s \Big\}.
\]
Hence, PS&W's Markov chain is uniformly ergodic.

Remark 1. As discussed in the Introduction, uniform ergodicity guarantees the existence of central limit theorems and consistent estimators of the associated asymptotic variances, and these can be used to make an informed decision about how long the simulation should be run. While it is true that, in many cases, $\delta$ will be so close to zero that the bound (3) (with $M \equiv 1$ and $\rho = 1 - \delta$) will not be useful for getting a handle on the total variation distance, this does not detract from the usefulness of the result. Moreover, it is certainly possible that a different analysis could yield a different minorization condition with a larger $\delta$.

The following easily established lemmas will be used in the proof of Proposition 1.

Lemma 1. If $A$ is a symmetric nonnegative definite matrix, then all of the eigenvalues of $(I + A)^{-1}$ are in $(0, 1]$, and $I - (I + A)^{-1}$ is also nonnegative definite.

Lemma 2. For $a, b \in \mathbb{R}$, $\cosh(a + b) \le 2 \cosh(a) \cosh(b)$.

Proof of Proposition 1. Recall that $\Sigma = \Sigma(w) = \big(X^\top \Omega(w) X + B^{-1}\big)^{-1}$ and $m = m(w) = \Sigma(w)\big(X^\top(y - \tfrac{1}{2} 1_n) + B^{-1} b\big)$. We begin by showing that $|\Sigma|^{-1/2} \ge |B|^{-1/2}$. Indeed,
\[
\Sigma = \big(X^\top \Omega X + B^{-1}\big)^{-1} = \Big( B^{-1/2} \big( B^{1/2} X^\top \Omega X B^{1/2} + I \big) B^{-1/2} \Big)^{-1} = B^{1/2} \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} B^{1/2},
\]
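The constant in Proposition 1 can be computed for a given data set. The sketch below works on the log scale for numerical stability and uses the simplification (my own algebra, worth double-checking) that with $v = X^\top(y - \tfrac{1}{2}1_n) + B^{-1}b$ one has $s^\top s = v^\top B v$ and $m^{*\top}\Sigma^{*-1}m^* = v^\top \Sigma^* v$, so no matrix square roots are needed.

```python
import numpy as np

def log_minorization_delta(X, y, b, B):
    """log(delta) from the minorization bound (sketch of the reconstructed
    formula): 0.5*log(|Sigma*|/|B|) + 0.5 v'Sigma*v - n/4 - n log 2 - 0.5 v'Bv."""
    n, p = X.shape
    B_inv = np.linalg.inv(B)
    Sigma_star = np.linalg.inv(0.5 * X.T @ X + B_inv)
    v = X.T @ (y - 0.5) + B_inv @ b
    _, logdet_S = np.linalg.slogdet(Sigma_star)
    _, logdet_B = np.linalg.slogdet(B)
    return (0.5 * (logdet_S - logdet_B) + 0.5 * v @ Sigma_star @ v
            - n / 4 - n * np.log(2.0) - 0.5 * v @ B @ v)

rng = np.random.default_rng(4)
n, p = 10, 2
X = rng.standard_normal((n, p))
y = (rng.random(n) < 0.5).astype(float)
ld = log_minorization_delta(X, y, np.zeros(p), np.eye(p))
print(ld < 0.0, np.isfinite(ld))  # True True (delta is a probability-scale constant < 1)
```

As Remark 1 anticipates, the resulting $\delta$ is typically tiny even for small $n$, so the value is mainly of qualitative interest.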

where $\tilde{X} = X B^{1/2}$. Now, since $\tilde{X}^\top \Omega \tilde{X}$ is nonnegative definite, Lemma 1 implies that $B - \Sigma$ is nonnegative definite, and the result follows.

Next, we show that $m^\top \Sigma^{-1} m \le s^\top s$. Letting $l = y - \tfrac{1}{2} 1_n$, we have
\[
\begin{aligned}
m^\top \Sigma^{-1} m &= \big(X^\top l + B^{-1} b\big)^\top \big(X^\top \Omega X + B^{-1}\big)^{-1} \big(X^\top l + B^{-1} b\big) \\
&= \big(X^\top l + B^{-1} b\big)^\top B^{1/2} \big(\tilde{X}^\top \Omega \tilde{X} + I\big)^{-1} B^{1/2} \big(X^\top l + B^{-1} b\big) \\
&= \big(\tilde{X}^\top l + B^{-1/2} b\big)^\top \big(\tilde{X}^\top \Omega \tilde{X} + I\big)^{-1} \big(\tilde{X}^\top l + B^{-1/2} b\big) \\
&\le \big(\tilde{X}^\top l + B^{-1/2} b\big)^\top \big(\tilde{X}^\top l + B^{-1/2} b\big) = s^\top s,
\end{aligned}
\]
where the inequality follows from Lemma 1. Using these two inequalities, we have
\[
\begin{aligned}
\pi(\beta \mid w, y) &= (2\pi)^{-\frac{p}{2}} |\Sigma|^{-\frac{1}{2}} \exp\Big\{ -\frac{1}{2} (\beta - m)^\top \Sigma^{-1} (\beta - m) \Big\} \\
&= (2\pi)^{-\frac{p}{2}} |\Sigma|^{-\frac{1}{2}} \exp\Big\{ -\frac{1}{2} \beta^\top \big(X^\top \Omega X\big) \beta - \frac{1}{2} \beta^\top B^{-1} \beta - \frac{1}{2} m^\top \Sigma^{-1} m + m^\top \Sigma^{-1} \beta \Big\} \\
&\ge (2\pi)^{-\frac{p}{2}} |B|^{-\frac{1}{2}} \exp\Big\{ -\frac{1}{2} \beta^\top \big(X^\top \Omega X\big) \beta - \frac{1}{2} \beta^\top B^{-1} \beta - \frac{1}{2} s^\top s + m^\top \Sigma^{-1} \beta \Big\} \\
&= (2\pi)^{-\frac{p}{2}} |B|^{-\frac{1}{2}} \exp\Big\{ -\frac{1}{2} \beta^\top B^{-1} \beta - \frac{1}{2} s^\top s + s^\top B^{-1/2} \beta \Big\} \exp\Big\{ -\sum_{i=1}^n \frac{(x_i^\top \beta)^2 w_i}{2} \Big\},
\end{aligned}
\]
where the last equality uses the fact that $m^\top \Sigma^{-1} \beta = s^\top B^{-1/2} \beta$. Now since
\[
\pi(w \mid \beta', y) = \prod_{i=1}^n \cosh\Big(\frac{x_i^\top \beta'}{2}\Big) \exp\Big\{ -\frac{(x_i^\top \beta')^2 w_i}{2} \Big\} g(w_i),
\]
it follows that
\[
\pi(\beta \mid w, y) \, \pi(w \mid \beta', y) \ge (2\pi)^{-\frac{p}{2}} |B|^{-\frac{1}{2}} \exp\Big\{ -\frac{1}{2} \beta^\top B^{-1} \beta - \frac{1}{2} s^\top s + s^\top B^{-1/2} \beta \Big\} \prod_{i=1}^n \Bigg[ \cosh\Big(\frac{x_i^\top \beta'}{2}\Big) \exp\Big\{ -\frac{(x_i^\top \beta)^2 + (x_i^\top \beta')^2}{2} w_i \Big\} g(w_i) \Bigg].
\]
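The two matrix inequalities established above, $\Sigma(w) \preceq B$ and $m^\top \Sigma^{-1} m \le s^\top s$, hold for every $w \in \mathbb{R}_+^n$. The following spot-check (toy data of my own) verifies both over random draws of $w$, using the same $v$-simplification as before ($s^\top s = v^\top B v$, $m^\top \Sigma^{-1} m = v^\top \Sigma v$ with $v = X^\top(y - \tfrac{1}{2}1_n) + B^{-1}b$).

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 15, 3
X = rng.standard_normal((n, p))
y = (rng.random(n) < 0.5).astype(float)
b, B = rng.standard_normal(p), 2.0 * np.eye(p)
B_inv = np.linalg.inv(B)
v = X.T @ (y - 0.5) + B_inv @ b    # s's = v'Bv and m'Sigma^{-1}m = v'Sigma v

for _ in range(100):
    w = rng.exponential(size=n)     # arbitrary w in R_+^n
    Sigma = np.linalg.inv(X.T @ (w[:, None] * X) + B_inv)
    # B - Sigma(w) nonnegative definite => all eigenvalues >= 0
    assert np.all(np.linalg.eigvalsh(B - Sigma) >= -1e-8)
    # m'Sigma^{-1}m = v'Sigma v <= s's = v'Bv
    assert v @ Sigma @ v <= v @ B @ v + 1e-8
print("both inequalities hold on random draws")
```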

Recall that $k(\beta \mid \beta') = \int_{\mathbb{R}_+^n} \pi(\beta \mid w, y) \, \pi(w \mid \beta', y) \, dw$. To this end, note that
\[
\begin{aligned}
\int_{\mathbb{R}_+} \exp\Big\{ -\frac{(x_i^\top \beta)^2 + (x_i^\top \beta')^2}{2} w_i \Big\} g(w_i) \, dw_i &= \cosh^{-1}\Bigg( \sqrt{\frac{(x_i^\top \beta)^2 + (x_i^\top \beta')^2}{4}} \Bigg) \\
&\ge \cosh^{-1}\Big( \frac{|x_i^\top \beta| + |x_i^\top \beta'|}{2} \Big) \\
&\ge \frac{1}{2} \cosh^{-1}\Big( \frac{x_i^\top \beta}{2} \Big) \cosh^{-1}\Big( \frac{x_i^\top \beta'}{2} \Big),
\end{aligned}
\]
where the first inequality is due to the fact that $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$ for nonnegative $a, b$, and the second inequality is by Lemma 2. It follows that
\[
\int_{\mathbb{R}_+^n} \prod_{i=1}^n \Bigg[ \cosh\Big(\frac{x_i^\top \beta'}{2}\Big) \exp\Big\{ -\frac{(x_i^\top \beta)^2 + (x_i^\top \beta')^2}{2} w_i \Big\} g(w_i) \Bigg] dw \ge 2^{-n} \prod_{i=1}^n \cosh^{-1}\Big( \frac{x_i^\top \beta}{2} \Big).
\]
Moreover,
\[
\prod_{i=1}^n \cosh^{-1}\Big( \frac{x_i^\top \beta}{2} \Big) \ge \prod_{i=1}^n \exp\Big\{ -\frac{|x_i^\top \beta|}{2} \Big\} \ge \prod_{i=1}^n \exp\Big\{ -\frac{(x_i^\top \beta)^2 + 1}{4} \Big\} = e^{-\frac{n}{4}} \exp\Big\{ -\frac{1}{4} \beta^\top X^\top X \beta \Big\},
\]
where the second inequality holds because $|a| \le (a^2 + 1)/2$ for any real $a$. Putting all of this together, we have
\[
\begin{aligned}
k\big(\beta \mid \beta'\big) &= \int_{\mathbb{R}_+^n} \pi(\beta \mid w, y) \, \pi(w \mid \beta', y) \, dw \\
&\ge (2\pi)^{-\frac{p}{2}} |B|^{-\frac{1}{2}} \exp\Big\{ -\frac{1}{2} \beta^\top B^{-1} \beta - \frac{1}{2} s^\top s + s^\top B^{-1/2} \beta \Big\} \, 2^{-n} e^{-\frac{n}{4}} \exp\Big\{ -\frac{1}{4} \beta^\top X^\top X \beta \Big\} \\
&= (2\pi)^{-\frac{p}{2}} |B|^{-\frac{1}{2}} \, 2^{-n} e^{-\frac{n}{4}} \exp\Big\{ -\frac{1}{2} s^\top s \Big\} \exp\Big\{ -\frac{1}{2} \beta^\top \Big( \frac{1}{2} X^\top X + B^{-1} \Big) \beta + s^\top B^{-1/2} \beta \Big\} \\
&= (2\pi)^{-\frac{p}{2}} |\Sigma^*|^{-\frac{1}{2}} \exp\Big\{ -\frac{1}{2} (\beta - m^*)^\top \Sigma^{*-1} (\beta - m^*) \Big\} \, \frac{|\Sigma^*|^{\frac{1}{2}}}{|B|^{\frac{1}{2}}} \, 2^{-n} e^{-\frac{n}{4}} \exp\Big\{ -\frac{1}{2} s^\top s + \frac{1}{2} m^{*\top} \Sigma^{*-1} m^* \Big\} \\
&= \delta \, \phi(\beta; m^*, \Sigma^*),
\end{aligned}
\]

and the proof is complete.

Remark 2. It is worth pointing out that our proof circumvents the need to deal directly with the unwieldy density of the $\mathrm{PG}(1, 0)$ distribution, which is given in (1).

The utility of the Polya-Gamma latent data strategy extends far beyond the basic logistic regression model. Indeed, PS&W use this technique to develop useful Gibbs samplers for a variety of Bayesian hierarchical models in which a logit link is employed. These include Bayesian logistic regression with random effects and negative-binomial regression. Of course, given the results herein, it is natural to ask whether these other Gibbs samplers are also uniformly ergodic, and to what extent our methods could be used to answer this question. We note, however, that the Mtd of the Polya-Gamma Gibbs sampler is relatively simple. For example, the Gibbs sampler for the mixed effects model has more than two steps, so its Mtd is significantly more complex than that of the Polya-Gamma Gibbs sampler. Thus, the development of (uniform) minorization conditions for the other Gibbs samplers would likely entail more than just a straightforward extension of the proof of Proposition 1. On the other hand, it is possible that some of the inequalities in that proof could be recycled.

Finally, we establish that the posterior distribution of $\beta$ given the data has a moment generating function.

Proposition 2. For any fixed $t \in \mathbb{R}^p$,
\[
\int_{\mathbb{R}^p} e^{\beta^\top t} \, \pi(\beta \mid y) \, d\beta < \infty.
\]
Hence, the moment generating function of the posterior distribution exists.

Remark 3. Chen and Shao (2000) develop results for Bayesian logistic regression with a flat (improper) prior on $\beta$. In particular, these authors provide conditions on $X$ and $y$ that guarantee the existence of the mgf of the posterior of $\beta$ given the data. Note that our result holds for all $X$ and $y$.

Proof. Recall that
\[
\int_{\mathbb{R}_+^n} \pi(\beta, w \mid y) \, dw = \pi(\beta \mid y),
\]

where
\[
\pi(\beta, w \mid y) = \Big[\prod_{i=1}^n \Pr(Y_i = y_i \mid \beta)\Big] \frac{f(w \mid \beta) \, \pi(\beta)}{c(y)}.
\]
Hence, it suffices to show that $\int_{\mathbb{R}_+^n} \int_{\mathbb{R}^p} e^{\beta^\top t} \, \pi(\beta, w \mid y) \, d\beta \, dw < \infty$. Straightforward calculations similar to those done in Section 2 show that
\[
\pi(\beta, w \mid y) = \phi(\beta; m, \Sigma) \, \frac{2^{-n}}{c(y)} \frac{|\Sigma|^{\frac{1}{2}}}{|B|^{\frac{1}{2}}} \exp\Big\{ -\frac{1}{2} b^\top B^{-1} b \Big\} \exp\Big\{ \frac{1}{2} m^\top \Sigma^{-1} m \Big\} \prod_{i=1}^n g(w_i).
\]
Then, since $|\Sigma| \le |B|$ and $m^\top \Sigma^{-1} m \le s^\top s$, we have
\[
\pi(\beta, w \mid y) \le \phi(\beta; m, \Sigma) \, \frac{2^{-n}}{c(y)} \exp\Big\{ -\frac{1}{2} b^\top B^{-1} b \Big\} \exp\Big\{ \frac{1}{2} s^\top s \Big\} \prod_{i=1}^n g(w_i).
\]
Now, using the formula for the multivariate normal mgf, it suffices to show that
\[
\int_{\mathbb{R}_+^n} \int_{\mathbb{R}^p} e^{\beta^\top t} \, \phi(\beta; m, \Sigma) \Big[\prod_{i=1}^n g(w_i)\Big] d\beta \, dw = \int_{\mathbb{R}_+^n} \exp\Big\{ m^\top t + \frac{1}{2} t^\top \Sigma t \Big\} \Big[\prod_{i=1}^n g(w_i)\Big] dw < \infty.
\]
We establish this by demonstrating that $m^\top t + \frac{1}{2} t^\top \Sigma t$ is uniformly bounded in $w$. For a matrix $A$, define $\|A\| = \sup_{\|x\|=1} \|Ax\|$. Of course, if $A$ is nonnegative definite, then $\|A\|$ is equal to the largest eigenvalue of $A$. Now, using Cauchy-Schwarz and properties of the norm, we have
\[
m^\top t \le \|m\| \, \|t\| = \big\| B^{1/2} B^{-1/2} m \big\| \, \|t\| \le \big\| B^{1/2} \big\| \, \big\| B^{-1/2} m \big\| \, \|t\|.
\]
Now since $B^{-1/2} m = \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} \big( \tilde{X}^\top (y - \tfrac{1}{2} 1_n) + B^{-1/2} b \big)$, we have
\[
\begin{aligned}
\big\| B^{-1/2} m \big\| &= \Big\| \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} \Big( \tilde{X}^\top \big(y - \tfrac{1}{2} 1_n\big) + B^{-1/2} b \Big) \Big\| \\
&\le \Big\| \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} \tilde{X}^\top \big(y - \tfrac{1}{2} 1_n\big) \Big\| + \Big\| \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} B^{-1/2} b \Big\| \\
&\le \Big\| \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} \Big\| \, \Big\| \tilde{X}^\top \big(y - \tfrac{1}{2} 1_n\big) \Big\| + \Big\| \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} \Big\| \, \big\| B^{-1/2} b \big\| \\
&\le \Big\| \tilde{X}^\top \big(y - \tfrac{1}{2} 1_n\big) \Big\| + \big\| B^{-1/2} b \big\|,
\end{aligned}
\]

where the first inequality is due to the fact that $\|a + b\| \le \|a\| + \|b\|$ for any vectors $a$ and $b$, and the third inequality is due to the fact that $\big\| \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} \big\| \le 1$ (Lemma 1). Hence, $m^\top t$ is uniformly bounded in $w$. Finally, another application of Lemma 1 yields
\[
t^\top \Sigma t = t^\top \big( X^\top \Omega X + B^{-1} \big)^{-1} t = t^\top B^{1/2} \big( \tilde{X}^\top \Omega \tilde{X} + I \big)^{-1} B^{1/2} t \le t^\top B t,
\]
so $t^\top \Sigma t$ is also uniformly bounded in $w$, and the result follows.

Acknowledgment. The second author was supported by NSF Grant DMS.

References

ALBERT, J. H. and CHIB, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association.

BIANE, P., PITMAN, J. and YOR, M. (2001). Probability laws related to the Jacobi theta and Riemann zeta functions, and Brownian excursions. Bulletin of the American Mathematical Society.

CHEN, M.-H. and SHAO, Q.-M. (2000). Propriety of posterior distribution for dichotomous quantal response models. Proceedings of the American Mathematical Society.

FLEGAL, J. M., HARAN, M. and JONES, G. L. (2008). Markov chain Monte Carlo: Can we trust the third significant figure? Statistical Science.

FRÜHWIRTH-SCHNATTER, S. and FRÜHWIRTH, R. (2010). Data augmentation and MCMC for binary and multinomial logit models. In Statistical Modelling and Regression Structures. Springer-Verlag.

HOBERT, J. P. (2011). The data augmentation algorithm: Theory and methodology. In Handbook of Markov Chain Monte Carlo (S. Brooks, A. Gelman, G. Jones and X.-L. Meng, eds.). Chapman & Hall/CRC Press.

HOLMES, C. and HELD, L. (2006). Bayesian auxiliary variable models for binary and multinomial regression. Bayesian Analysis.

JONES, G. L. and HOBERT, J. P. (2001). Honest exploration of intractable probability distributions via Markov chain Monte Carlo. Statistical Science.

MARCHEV, D. (2011). Markov chain Monte Carlo algorithms for the Bayesian logistic regression model. In Proceedings of the Annual International Conference on Operations Research and Statistics.

POLSON, N. G., SCOTT, J. G. and WINDLE, J. (2013). Bayesian inference for logistic models using Polya-Gamma latent variables. Journal of the American Statistical Association (to appear). arXiv: v3.

ROBERTS, G. O. and ROSENTHAL, J. S. (2004). General state space Markov chains and MCMC algorithms. Probability Surveys.

ROY, V. and HOBERT, J. P. (2007). Convergence rates and asymptotic standard errors for Markov chain Monte Carlo algorithms for Bayesian probit regression. Journal of the Royal Statistical Society, Series B.

TIERNEY, L. (1994). Markov chains for exploring posterior distributions (with discussion). The Annals of Statistics.


More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

Hamiltonian Monte Carlo

Hamiltonian Monte Carlo Chapter 7 Hamiltonian Monte Carlo As with the Metropolis Hastings algorithm, Hamiltonian (or hybrid) Monte Carlo (HMC) is an idea that has been knocking around in the physics literature since the 1980s

More information

Bayesian Methods with Monte Carlo Markov Chains II

Bayesian Methods with Monte Carlo Markov Chains II Bayesian Methods with Monte Carlo Markov Chains II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University hslu@stat.nctu.edu.tw http://tigpbp.iis.sinica.edu.tw/courses.htm 1 Part 3

More information

VARIABLE TRANSFORMATION TO OBTAIN GEOMETRIC ERGODICITY IN THE RANDOM-WALK METROPOLIS ALGORITHM

VARIABLE TRANSFORMATION TO OBTAIN GEOMETRIC ERGODICITY IN THE RANDOM-WALK METROPOLIS ALGORITHM Submitted to the Annals of Statistics VARIABLE TRANSFORMATION TO OBTAIN GEOMETRIC ERGODICITY IN THE RANDOM-WALK METROPOLIS ALGORITHM By Leif T. Johnson and Charles J. Geyer Google Inc. and University of

More information

Stat 535 C - Statistical Computing & Monte Carlo Methods. Lecture 15-7th March Arnaud Doucet

Stat 535 C - Statistical Computing & Monte Carlo Methods. Lecture 15-7th March Arnaud Doucet Stat 535 C - Statistical Computing & Monte Carlo Methods Lecture 15-7th March 2006 Arnaud Doucet Email: arnaud@cs.ubc.ca 1 1.1 Outline Mixture and composition of kernels. Hybrid algorithms. Examples Overview

More information

Gibbs Sampling in Linear Models #2

Gibbs Sampling in Linear Models #2 Gibbs Sampling in Linear Models #2 Econ 690 Purdue University Outline 1 Linear Regression Model with a Changepoint Example with Temperature Data 2 The Seemingly Unrelated Regressions Model 3 Gibbs sampling

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Markov Chain Monte Carlo: Can We Trust the Third Significant Figure?

Markov Chain Monte Carlo: Can We Trust the Third Significant Figure? Markov Chain Monte Carlo: Can We Trust the Third Significant Figure? James M. Flegal School of Statistics University of Minnesota jflegal@stat.umn.edu Murali Haran Department of Statistics The Pennsylvania

More information

Efficient Sampling Methods for Truncated Multivariate Normal and Student-t Distributions Subject to Linear Inequality Constraints

Efficient Sampling Methods for Truncated Multivariate Normal and Student-t Distributions Subject to Linear Inequality Constraints Efficient Sampling Methods for Truncated Multivariate Normal and Student-t Distributions Subject to Linear Inequality Constraints Yifang Li Department of Statistics, North Carolina State University 2311

More information

An ABC interpretation of the multiple auxiliary variable method

An ABC interpretation of the multiple auxiliary variable method School of Mathematical and Physical Sciences Department of Mathematics and Statistics Preprint MPS-2016-07 27 April 2016 An ABC interpretation of the multiple auxiliary variable method by Dennis Prangle

More information

Bayesian Learning and Inference in Recurrent Switching Linear Dynamical Systems

Bayesian Learning and Inference in Recurrent Switching Linear Dynamical Systems Bayesian Learning and Inference in Recurrent Switching Linear Dynamical Systems Scott W. Linderman Matthew J. Johnson Andrew C. Miller Columbia University Harvard and Google Brain Harvard University Ryan

More information

Research Division Federal Reserve Bank of St. Louis Working Paper Series

Research Division Federal Reserve Bank of St. Louis Working Paper Series Research Division Federal Reserve Bank of St Louis Working Paper Series Kalman Filtering with Truncated Normal State Variables for Bayesian Estimation of Macroeconomic Models Michael Dueker Working Paper

More information

The STS Surgeon Composite Technical Appendix

The STS Surgeon Composite Technical Appendix The STS Surgeon Composite Technical Appendix Overview Surgeon-specific risk-adjusted operative operative mortality and major complication rates were estimated using a bivariate random-effects logistic

More information

BAYESIAN ANALYSIS OF BINARY REGRESSION USING SYMMETRIC AND ASYMMETRIC LINKS

BAYESIAN ANALYSIS OF BINARY REGRESSION USING SYMMETRIC AND ASYMMETRIC LINKS Sankhyā : The Indian Journal of Statistics 2000, Volume 62, Series B, Pt. 3, pp. 372 387 BAYESIAN ANALYSIS OF BINARY REGRESSION USING SYMMETRIC AND ASYMMETRIC LINKS By SANJIB BASU Northern Illinis University,

More information

Stat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC

Stat 451 Lecture Notes Markov Chain Monte Carlo. Ryan Martin UIC Stat 451 Lecture Notes 07 12 Markov Chain Monte Carlo Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapters 8 9 in Givens & Hoeting, Chapters 25 27 in Lange 2 Updated: April 4, 2016 1 / 42 Outline

More information

The Bayesian Approach to Multi-equation Econometric Model Estimation

The Bayesian Approach to Multi-equation Econometric Model Estimation Journal of Statistical and Econometric Methods, vol.3, no.1, 2014, 85-96 ISSN: 2241-0384 (print), 2241-0376 (online) Scienpress Ltd, 2014 The Bayesian Approach to Multi-equation Econometric Model Estimation

More information

Or How to select variables Using Bayesian LASSO

Or How to select variables Using Bayesian LASSO Or How to select variables Using Bayesian LASSO x 1 x 2 x 3 x 4 Or How to select variables Using Bayesian LASSO x 1 x 2 x 3 x 4 Or How to select variables Using Bayesian LASSO On Bayesian Variable Selection

More information

On the Use of Cauchy Prior Distributions for Bayesian Logistic Regression

On the Use of Cauchy Prior Distributions for Bayesian Logistic Regression On the Use of Cauchy Prior Distributions for Bayesian Logistic Regression Joyee Ghosh, Yingbo Li, Robin Mitra arxiv:157.717v2 [stat.me] 9 Feb 217 Abstract In logistic regression, separation occurs when

More information

MCMC algorithms for fitting Bayesian models

MCMC algorithms for fitting Bayesian models MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models

More information

Stat 5421 Lecture Notes Proper Conjugate Priors for Exponential Families Charles J. Geyer March 28, 2016

Stat 5421 Lecture Notes Proper Conjugate Priors for Exponential Families Charles J. Geyer March 28, 2016 Stat 5421 Lecture Notes Proper Conjugate Priors for Exponential Families Charles J. Geyer March 28, 2016 1 Theory This section explains the theory of conjugate priors for exponential families of distributions,

More information

Bayes: All uncertainty is described using probability.

Bayes: All uncertainty is described using probability. Bayes: All uncertainty is described using probability. Let w be the data and θ be any unknown quantities. Likelihood. The probability model π(w θ) has θ fixed and w varying. The likelihood L(θ; w) is π(w

More information

Bayes methods for categorical data. April 25, 2017

Bayes methods for categorical data. April 25, 2017 Bayes methods for categorical data April 25, 2017 Motivation for joint probability models Increasing interest in high-dimensional data in broad applications Focus may be on prediction, variable selection,

More information

Exploring Monte Carlo Methods

Exploring Monte Carlo Methods Exploring Monte Carlo Methods William L Dunn J. Kenneth Shultis AMSTERDAM BOSTON HEIDELBERG LONDON NEW YORK OXFORD PARIS SAN DIEGO SAN FRANCISCO SINGAPORE SYDNEY TOKYO ELSEVIER Academic Press Is an imprint

More information

Control Variates for Markov Chain Monte Carlo

Control Variates for Markov Chain Monte Carlo Control Variates for Markov Chain Monte Carlo Dellaportas, P., Kontoyiannis, I., and Tsourti, Z. Dept of Statistics, AUEB Dept of Informatics, AUEB 1st Greek Stochastics Meeting Monte Carlo: Probability

More information

A Fully Nonparametric Modeling Approach to. BNP Binary Regression

A Fully Nonparametric Modeling Approach to. BNP Binary Regression A Fully Nonparametric Modeling Approach to Binary Regression Maria Department of Applied Mathematics and Statistics University of California, Santa Cruz SBIES, April 27-28, 2012 Outline 1 2 3 Simulation

More information

Coupled Hidden Markov Models: Computational Challenges

Coupled Hidden Markov Models: Computational Challenges .. Coupled Hidden Markov Models: Computational Challenges Louis J. M. Aslett and Chris C. Holmes i-like Research Group University of Oxford Warwick Algorithms Seminar 7 th March 2014 ... Hidden Markov

More information

ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS

ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS The Annals of Applied Probability 1998, Vol. 8, No. 4, 1291 1302 ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS By Gareth O. Roberts 1 and Jeffrey S. Rosenthal 2 University of Cambridge

More information

Multivariate Versus Multinomial Probit: When are Binary Decisions Made Separately also Jointly Optimal?

Multivariate Versus Multinomial Probit: When are Binary Decisions Made Separately also Jointly Optimal? Multivariate Versus Multinomial Probit: When are Binary Decisions Made Separately also Jointly Optimal? Dale J. Poirier and Deven Kapadia University of California, Irvine March 10, 2012 Abstract We provide

More information

Local consistency of Markov chain Monte Carlo methods

Local consistency of Markov chain Monte Carlo methods Ann Inst Stat Math (2014) 66:63 74 DOI 10.1007/s10463-013-0403-3 Local consistency of Markov chain Monte Carlo methods Kengo Kamatani Received: 12 January 2012 / Revised: 8 March 2013 / Published online:

More information

Marginal Specifications and a Gaussian Copula Estimation

Marginal Specifications and a Gaussian Copula Estimation Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required

More information

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice

Wageningen Summer School in Econometrics. The Bayesian Approach in Theory and Practice Wageningen Summer School in Econometrics The Bayesian Approach in Theory and Practice September 2008 Slides for Lecture on Qualitative and Limited Dependent Variable Models Gary Koop, University of Strathclyde

More information

Variable Selection for Multivariate Logistic Regression Models

Variable Selection for Multivariate Logistic Regression Models Variable Selection for Multivariate Logistic Regression Models Ming-Hui Chen and Dipak K. Dey Journal of Statistical Planning and Inference, 111, 37-55 Abstract In this paper, we use multivariate logistic

More information

A fast sampler for data simulation from spatial, and other, Markov random fields

A fast sampler for data simulation from spatial, and other, Markov random fields A fast sampler for data simulation from spatial, and other, Markov random fields Andee Kaplan Iowa State University ajkaplan@iastate.edu June 22, 2017 Slides available at http://bit.ly/kaplan-phd Joint

More information

The Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition.

The Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition. Christian P. Robert The Bayesian Choice From Decision-Theoretic Foundations to Computational Implementation Second Edition With 23 Illustrations ^Springer" Contents Preface to the Second Edition Preface

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Label Switching and Its Simple Solutions for Frequentist Mixture Models

Label Switching and Its Simple Solutions for Frequentist Mixture Models Label Switching and Its Simple Solutions for Frequentist Mixture Models Weixin Yao Department of Statistics, Kansas State University, Manhattan, Kansas 66506, U.S.A. wxyao@ksu.edu Abstract The label switching

More information

Bayesian GLMs and Metropolis-Hastings Algorithm

Bayesian GLMs and Metropolis-Hastings Algorithm Bayesian GLMs and Metropolis-Hastings Algorithm We have seen that with conjugate or semi-conjugate prior distributions the Gibbs sampler can be used to sample from the posterior distribution. In situations,

More information

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

More information

Geometric Ergodicity of a Random-Walk Metorpolis Algorithm via Variable Transformation and Computer Aided Reasoning in Statistics

Geometric Ergodicity of a Random-Walk Metorpolis Algorithm via Variable Transformation and Computer Aided Reasoning in Statistics Geometric Ergodicity of a Random-Walk Metorpolis Algorithm via Variable Transformation and Computer Aided Reasoning in Statistics a dissertation submitted to the faculty of the graduate school of the university

More information

Invariant HPD credible sets and MAP estimators

Invariant HPD credible sets and MAP estimators Bayesian Analysis (007), Number 4, pp. 681 69 Invariant HPD credible sets and MAP estimators Pierre Druilhet and Jean-Michel Marin Abstract. MAP estimators and HPD credible sets are often criticized in

More information

Stat 516, Homework 1

Stat 516, Homework 1 Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Madeleine B. Thompson Radford M. Neal Abstract The shrinking rank method is a variation of slice sampling that is efficient at

More information

Minicourse on: Markov Chain Monte Carlo: Simulation Techniques in Statistics

Minicourse on: Markov Chain Monte Carlo: Simulation Techniques in Statistics Minicourse on: Markov Chain Monte Carlo: Simulation Techniques in Statistics Eric Slud, Statistics Program Lecture 1: Metropolis-Hastings Algorithm, plus background in Simulation and Markov Chains. Lecture

More information

Gibbs Sampling in Endogenous Variables Models

Gibbs Sampling in Endogenous Variables Models Gibbs Sampling in Endogenous Variables Models Econ 690 Purdue University Outline 1 Motivation 2 Identification Issues 3 Posterior Simulation #1 4 Posterior Simulation #2 Motivation In this lecture we take

More information

Simulation of truncated normal variables. Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris

Simulation of truncated normal variables. Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris Simulation of truncated normal variables Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris Abstract arxiv:0907.4010v1 [stat.co] 23 Jul 2009 We provide in this paper simulation algorithms

More information

Improving the Data Augmentation algorithm in the two-block setup

Improving the Data Augmentation algorithm in the two-block setup Improving the Data Augmentation algorithm in the two-block setup Subhadip Pal, Kshitij Khare and James P. Hobert University of Florida Abstract The Data Augmentation (DA) approach to approximate sampling

More information

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model

Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model UNIVERSITY OF TEXAS AT SAN ANTONIO Hastings-within-Gibbs Algorithm: Introduction and Application on Hierarchical Model Liang Jing April 2010 1 1 ABSTRACT In this paper, common MCMC algorithms are introduced

More information

Markov chain Monte Carlo

Markov chain Monte Carlo 1 / 26 Markov chain Monte Carlo Timothy Hanson 1 and Alejandro Jara 2 1 Division of Biostatistics, University of Minnesota, USA 2 Department of Statistics, Universidad de Concepción, Chile IAP-Workshop

More information

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University

Motivation Scale Mixutres of Normals Finite Gaussian Mixtures Skew-Normal Models. Mixture Models. Econ 690. Purdue University Econ 690 Purdue University In virtually all of the previous lectures, our models have made use of normality assumptions. From a computational point of view, the reason for this assumption is clear: combined

More information

Kazuhiko Kakamu Department of Economics Finance, Institute for Advanced Studies. Abstract

Kazuhiko Kakamu Department of Economics Finance, Institute for Advanced Studies. Abstract Bayesian Estimation of A Distance Functional Weight Matrix Model Kazuhiko Kakamu Department of Economics Finance, Institute for Advanced Studies Abstract This paper considers the distance functional weight

More information

Gibbs Sampling in Latent Variable Models #1

Gibbs Sampling in Latent Variable Models #1 Gibbs Sampling in Latent Variable Models #1 Econ 690 Purdue University Outline 1 Data augmentation 2 Probit Model Probit Application A Panel Probit Panel Probit 3 The Tobit Model Example: Female Labor

More information

Markov Chain Monte Carlo for Linear Mixed Models

Markov Chain Monte Carlo for Linear Mixed Models Markov Chain Monte Carlo for Linear Mixed Models a dissertation submitted to the faculty of the graduate school of the university of minnesota by Felipe Humberto Acosta Archila in partial fulfillment of

More information

Non-Parametric Bayesian Inference for Controlled Branching Processes Through MCMC Methods

Non-Parametric Bayesian Inference for Controlled Branching Processes Through MCMC Methods Non-Parametric Bayesian Inference for Controlled Branching Processes Through MCMC Methods M. González, R. Martínez, I. del Puerto, A. Ramos Department of Mathematics. University of Extremadura Spanish

More information

SUPPLEMENT TO EXTENDING THE LATENT MULTINOMIAL MODEL WITH COMPLEX ERROR PROCESSES AND DYNAMIC MARKOV BASES

SUPPLEMENT TO EXTENDING THE LATENT MULTINOMIAL MODEL WITH COMPLEX ERROR PROCESSES AND DYNAMIC MARKOV BASES SUPPLEMENT TO EXTENDING THE LATENT MULTINOMIAL MODEL WITH COMPLEX ERROR PROCESSES AND DYNAMIC MARKOV BASES By Simon J Bonner, Matthew R Schofield, Patrik Noren Steven J Price University of Western Ontario,

More information

Kernel adaptive Sequential Monte Carlo

Kernel adaptive Sequential Monte Carlo Kernel adaptive Sequential Monte Carlo Ingmar Schuster (Paris Dauphine) Heiko Strathmann (University College London) Brooks Paige (Oxford) Dino Sejdinovic (Oxford) December 7, 2015 1 / 36 Section 1 Outline

More information

Regression and Statistical Inference

Regression and Statistical Inference Regression and Statistical Inference Walid Mnif wmnif@uwo.ca Department of Applied Mathematics The University of Western Ontario, London, Canada 1 Elements of Probability 2 Elements of Probability CDF&PDF

More information

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM

Spring 2017 Econ 574 Roger Koenker. Lecture 14 GEE-GMM University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 14 GEE-GMM Throughout the course we have emphasized methods of estimation and inference based on the principle

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate

More information