Geometric ergodicity of the Bayesian lasso


Kshitij Khare and James P. Hobert
Department of Statistics, University of Florida

June 2013

Abstract

Consider the standard linear model $y = X\beta + \sigma\epsilon$, where the components of $\epsilon$ are iid standard normal errors. Park and Casella [14] consider a Bayesian treatment of this model with a Laplace/Inverse-Gamma prior on $(\beta, \sigma^2)$. They introduce a Data Augmentation approach that can be used to explore the resulting intractable posterior density, and call it the Bayesian lasso algorithm. In this paper, the Markov chain underlying the Bayesian lasso algorithm is shown to be geometrically ergodic, for arbitrary values of the sample size $n$ and the number of variables $p$. This is important, as geometric ergodicity provides theoretical justification for the use of the Markov chain CLT, which can then be used to obtain asymptotic standard errors for Markov chain based estimates of posterior quantities. Kyung et al. [12] provide a proof of geometric ergodicity for the restricted case $n \ge p$, but as we explain in this paper, their proof is incorrect. Our approach is different and more direct, and enables us to establish geometric ergodicity for arbitrary $n$ and $p$.

Keywords and Phrases: Convergence rate, Geometric drift condition, Markov chain, Monte Carlo.

AMS Subject Classification: Primary 60J27; Secondary 62F15.

1 Introduction

Consider the standard linear model $y = X\beta + \sigma\epsilon$, where $y = (y_i)_{i=1}^n \in \mathbb{R}^n$ is the vector of observations, $X$ is the (known) $n \times p$ design matrix, $\beta \in \mathbb{R}^p$ is the (unknown) vector of regression coefficients, the components of $\epsilon$ are iid standard normal errors, and $\sigma^2$ is the (unknown) variance parameter. The objective is to estimate $(\beta, \sigma^2)$. In a variety of modern datasets from genetics, finance, and environmental sciences, the dimension $p$ of $\beta$ is much larger than the sample size $n$. An extremely popular approach to handle these kinds of datasets is the lasso, which was introduced in Tibshirani [19]. In this approach, the estimate of $\beta$ is obtained as

$$\hat\beta_{\mathrm{lasso}} = \operatorname*{argmin}_{\beta \in \mathbb{R}^p} \; (y - X\beta)^T (y - X\beta) + \lambda \sum_{j=1}^p |\beta_j|, \qquad (1)$$

where $\lambda$ is a user-specified tuning parameter. Note that the estimate in (1) can be regarded as the posterior mode of $\beta$ (conditional on $\sigma^2$) if one puts independent Laplace priors on the entries of $\beta$. Based on this observation, several authors proposed a Bayesian analysis using a Laplace-like prior for $\beta$ (see for example [3, 7, 20]). Park and Casella [14] construct the following hierarchical Bayesian model as a Bayesian parallel/alternative to the lasso:

$$y \mid \beta, \sigma^2, \tau \sim N_n(X\beta, \sigma^2 I_n) \qquad (2)$$
$$\beta \mid \sigma^2, \tau \sim N_p(0_p, \sigma^2 D_\tau), \text{ where } D_\tau = \mathrm{diag}(\tau_1^2, \tau_2^2, \dots, \tau_p^2) \qquad (3)$$
$$\sigma^2 \sim \text{Inverse-Gamma}(\alpha, \xi) \text{ (allow for impropriety via } \alpha = 0 \text{ or } \xi = 0) \qquad (4)$$
$$\tau_j^2 \overset{\text{i.i.d.}}{\sim} \text{Exponential}(\lambda^2/2) \text{ for } j = 1, \dots, p \qquad (5)$$

To see the connection with the lasso, note that the prior density of $\beta$ given $\sigma^2$ is given by

$$f(\beta \mid \sigma^2) = \prod_{j=1}^p \frac{\lambda}{2\sqrt{\sigma^2}} \, e^{-\lambda |\beta_j| / \sqrt{\sigma^2}}.$$
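The scale-mixture representation in (3) and (5) is exactly what makes a Gibbs sampler possible: integrating $\tau_j^2$ out of the normal prior recovers the conditional Laplace prior displayed above. The short simulation below is our own sketch of this fact (the variable names and the particular values of $\lambda$ and $\sigma$ are illustrative, not taken from the paper); it compares moments of the exponential-mixture-of-normals draws with the corresponding Laplace moments.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, sigma = 2.0, 1.5          # illustrative values, not from the paper
n_draws = 200_000

# tau_j^2 ~ Exponential with rate lam^2/2 (mean 2/lam^2), then
# beta_j | tau_j^2, sigma^2 ~ N(0, sigma^2 * tau_j^2).
tau2 = rng.exponential(scale=2.0 / lam**2, size=n_draws)
beta = rng.normal(loc=0.0, scale=sigma * np.sqrt(tau2))

# Marginally, beta_j | sigma^2 should be Laplace with density
# (lam / (2 sigma)) exp(-lam |beta| / sigma), i.e. scale sigma/lam.
print("E|beta|:   mixture %.4f  vs  Laplace %.4f" % (np.abs(beta).mean(), sigma / lam))
print("Var(beta): mixture %.4f  vs  Laplace %.4f" % (beta.var(), 2 * sigma**2 / lam**2))
```

With this many draws the two columns should agree to roughly two decimal places.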

Indeed, it is shown in [14] that the full conditional distributions of $\beta$, $\tau$ and $\sigma^2$ are given by

$$\beta \mid \tau, \sigma^2, y \sim N_p\big( (X^T X + D_\tau^{-1})^{-1} X^T y, \; \sigma^2 (X^T X + D_\tau^{-1})^{-1} \big) \qquad (6)$$
$$\frac{1}{\tau_j^2} \,\Big|\, \beta, \sigma^2, y \sim \text{Inverse-Gaussian}\left( \sqrt{\frac{\lambda^2 \sigma^2}{\beta_j^2}}, \; \lambda^2 \right), \text{ independently for } j = 1, 2, \dots, p \qquad (7)$$
$$\sigma^2 \mid \beta, \tau, y \sim \text{Inverse-Gamma}\left( \frac{n+p}{2} + \alpha, \; \frac{(y - X\beta)^T (y - X\beta)}{2} + \frac{\beta^T D_\tau^{-1} \beta}{2} + \xi \right). \qquad (8)$$

Note that the Inverse-Gaussian$(\mu, \lambda)$ density is given by

$$f(x) = \sqrt{\frac{\lambda}{2\pi x^3}} \, \exp\left( -\frac{\lambda (x - \mu)^2}{2 \mu^2 x} \right), \qquad x > 0.$$

Using the full conditional distributions specified above, one can construct a systematic scan Gibbs sampling algorithm to sample from the joint posterior density of $(\beta, \tau, \sigma^2)$ (and hence to obtain the desired sample from the posterior density of $(\beta, \sigma^2)$). Let $\{(\beta_m, \tau_m, \sigma^2_m)\}_{m=0}^\infty$ be a Markov chain (with state space $\mathbb{R}^p \times \mathbb{R}^p_+ \times \mathbb{R}_+$) whose dynamics are defined (implicitly) through the following three-step procedure for moving from the current state, $(\beta_m, \tau_m, \sigma^2_m)$, to $(\beta_{m+1}, \tau_{m+1}, \sigma^2_{m+1})$.

Iteration $m+1$ of Park and Casella's Bayesian lasso Gibbs sampler:

1. Draw $\sigma^2_{m+1}$ from the conditional distribution in (8) given $(\beta_m, \tau_m, y)$.
2. Draw $\tau_{m+1}$ from the conditional distribution in (7) given $(\beta_m, \sigma^2_{m+1}, y)$.
3. Draw $\beta_{m+1}$ from the conditional distribution in (6) given $(\tau_{m+1}, \sigma^2_{m+1}, y)$.
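For concreteness, here is a minimal sketch of one sweep of the three-step update just described, written in NumPy under the notation above. The function name, the default improper-prior choice $\alpha = \xi = 0$, and the use of `rng.wald` for the Inverse-Gaussian draws are our own implementation choices, not code from the paper.

```python
import numpy as np

def bayesian_lasso_sweep(beta, tau2, X, y, lam, alpha=0.0, xi=0.0, rng=None):
    """One systematic-scan sweep: draw sigma^2, then tau, then beta (Eqs. (8), (7), (6))."""
    rng = rng or np.random.default_rng()
    n, p = X.shape
    resid = y - X @ beta

    # Step 1: sigma^2 | beta, tau, y ~ Inverse-Gamma((n+p)/2 + alpha, rate).
    shape = (n + p) / 2.0 + alpha
    rate = 0.5 * resid @ resid + 0.5 * np.sum(beta**2 / tau2) + xi
    sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)

    # Step 2: 1/tau_j^2 | beta, sigma^2, y ~ Inverse-Gaussian(sqrt(lam^2 sigma^2 / beta_j^2), lam^2).
    # (Assumes beta_j != 0 for all j, which holds almost surely under the model.)
    mu = np.sqrt(lam**2 * sigma2 / beta**2)
    tau2 = 1.0 / rng.wald(mu, lam**2)

    # Step 3: beta | tau, sigma^2, y ~ N_p(A^{-1} X^T y, sigma^2 A^{-1}), A = X^T X + D_tau^{-1}.
    A = X.T @ X + np.diag(1.0 / tau2)
    L = np.linalg.cholesky(np.linalg.inv(A))      # small-p sketch; not optimized
    beta = np.linalg.solve(A, X.T @ y) + np.sqrt(sigma2) * (L @ rng.standard_normal(p))

    return beta, tau2, sigma2
```

Iterating this function (and discarding a burn-in) produces the draws whose ergodic averages are studied below.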

In Section 2, the Markov transition density (Mtd) of the resulting Gibbs Markov chain, $\{(\beta_m, \tau_m, \sigma^2_m)\}_{m=0}^\infty$, is defined and then used to establish that the chain is well behaved (i.e., Harris ergodic) and converges to the target posterior distribution. Thus, we can use this chain to construct strongly consistent estimators of intractable posterior expectations. To be specific, for $q > 0$, let $L_q(\pi)$ denote the set of functions $g : \mathbb{R}^p \times \mathbb{R}^p_+ \times \mathbb{R}_+ \to \mathbb{R}$ such that

$$E_\pi |g|^q := \int_{\mathbb{R}_+} \int_{\mathbb{R}^p_+} \int_{\mathbb{R}^p} |g(\beta, \tau, \sigma^2)|^q \, \pi(\beta, \tau, \sigma^2 \mid y) \, d\beta \, d\tau \, d\sigma^2 < \infty,$$

where $\pi(\beta, \tau, \sigma^2 \mid y)$ represents the joint posterior density evaluated at $(\beta, \tau, \sigma^2)$. Harris ergodicity implies that, if $g \in L_1(\pi)$, then the estimator $\bar g_m := m^{-1} \sum_{i=0}^{m-1} g(\beta_i, \tau_i, \sigma^2_i)$ is strongly consistent for $E_\pi g$, no matter how the chain is started. Of course, in practice, an estimator is only useful if it is possible to compute an associated standard error. All available methods of computing a valid asymptotic standard error for $\bar g_m$ are based on the existence of a central limit theorem (CLT) for $\bar g_m$; that is, we require that

$$\sqrt{m}\,\big( \bar g_m - E_\pi g \big) \overset{d}{\to} N(0, \sigma^2_g),$$

for some positive, finite $\sigma^2_g$. Unfortunately, even if $g \in L_q(\pi)$ for all $q > 0$, Harris ergodicity is not enough to guarantee the existence of such a CLT (see for example [16, 17]). The standard method of establishing the existence of CLTs is to prove that the underlying Markov chain converges at a geometric rate. Let $\mathcal{B}(\mathsf{X})$ denote the Borel sets in $\mathsf{X} := \mathbb{R}^p \times \mathbb{R}^p_+ \times \mathbb{R}_+$, and let $K^m : \mathsf{X} \times \mathcal{B}(\mathsf{X}) \to [0,1]$ denote the $m$-step Markov transition function of the Gibbs Markov chain. That is, $K^m\big((\beta, \tau, \sigma^2), A\big)$ is the probability that $(\beta_m, \tau_m, \sigma^2_m) \in A$, given that the chain is started at $(\beta, \tau, \sigma^2)$. Also, let $\Pi(\cdot)$ denote the joint posterior distribution. The chain is called geometrically ergodic if there exist a function $M : \mathsf{X} \to [0, \infty)$ and a constant $\rho \in [0, 1)$ such that, for all $(\beta, \tau, \sigma^2) \in \mathsf{X}$ and all $m = 1, 2, \dots$, we have

$$\big\| K^m\big((\beta, \tau, \sigma^2), \cdot\big) - \Pi(\cdot) \big\|_{TV} \le M(\beta, \tau, \sigma^2)\, \rho^m,$$

where $\|\cdot\|_{TV}$ denotes the total variation norm. The relationship between geometric convergence and CLTs is simple: if the chain is geometrically ergodic and $E_\pi |g|^{2+\delta} < \infty$ for some $\delta > 0$, then $\bar g_m$ satisfies a CLT. Moreover, because the Mtd is strictly positive on $\mathsf{X}$ (see Section 2), the same $2+\delta$ moment condition implies that the usual estimators of the asymptotic variance, $\sigma^2_g$, are consistent [2, 8, 9, 11]. Our main result, which is proven in Section 3 using a geometric drift condition and an associated minorization condition, is the following.

Proposition 1. The Bayesian lasso Gibbs Markov chain is geometrically ergodic for $n \ge 3$ and arbitrary $p$, $X$ and $\lambda$.

Hence, the Markov chain CLT holds and can be used to obtain asymptotic standard errors of posterior estimates.
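Once the CLT is in force, $\sigma^2_g$ can be estimated from the simulation output itself, for instance by the batch-means methods studied in [8, 9, 11]. The following sketch is our own illustration (not code from the paper or from those references) of the standard non-overlapping batch-means estimate of the Monte Carlo standard error of $\bar g_m$.

```python
import numpy as np

def batch_means_se(g_values, n_batches=30):
    """Batch-means estimate of the MCMC standard error of the sample mean of g_values.

    g_values: 1-D array of g evaluated along the Markov chain (after burn-in).
    The estimate is asymptotically valid when a Markov chain CLT holds,
    e.g. under geometric ergodicity plus a 2+delta moment condition.
    """
    g = np.asarray(g_values, dtype=float)
    m = (len(g) // n_batches) * n_batches           # drop remainder so batches are equal
    batch_means = g[:m].reshape(n_batches, -1).mean(axis=1)
    batch_size = m // n_batches
    # sigma^2_g estimate: batch_size times the sample variance of the batch means.
    sigma2_hat = batch_size * np.sum((batch_means - g[:m].mean())**2) / (n_batches - 1)
    return np.sqrt(sigma2_hat / m)
```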

In the restricted case when $n \ge p$, Kyung et al. [12] contains, among other results, a proof of geometric ergodicity of the Bayesian lasso Gibbs Markov chain. However, there are serious errors in this proof, which essentially arise from their assertion that the Bayesian lasso model in (2)-(5) can be obtained as a marginal model of a hierarchical random effects model. A detailed explanation of the problems with the proof is provided in the Appendix.

2 The Bayesian lasso Markov chain

We first provide expressions for the full posterior conditional densities of $\beta$, $\tau$ and $\sigma^2$, respectively. Let $f(\beta \mid \tau, \sigma^2, y)$ denote the full posterior conditional density of $\beta$ (on $\mathbb{R}^p$) given $\tau, \sigma^2, y$. By (6),

$$f(\beta \mid \tau, \sigma^2, y) = \frac{|X^T X + D_\tau^{-1}|^{1/2}}{(2\pi\sigma^2)^{p/2}} \exp\left( -\frac{1}{2\sigma^2} \big(\beta - (X^T X + D_\tau^{-1})^{-1} X^T y\big)^T (X^T X + D_\tau^{-1}) \big(\beta - (X^T X + D_\tau^{-1})^{-1} X^T y\big) \right).$$

Let $f(\tau \mid \beta, \sigma^2, y)$ denote the full posterior conditional density of $\tau$ (on $\mathbb{R}^p_+$) given $\beta, \sigma^2, y$. By (7),

$$f(\tau \mid \beta, \sigma^2, y) = \prod_{j=1}^p \sqrt{\frac{\lambda^2}{2\pi \tau_j^2}} \, \exp\left( \frac{\lambda |\beta_j|}{\sqrt{\sigma^2}} - \frac{\beta_j^2}{2\sigma^2 \tau_j^2} - \frac{\lambda^2 \tau_j^2}{2} \right).$$

Note that the reciprocals of the entries of $\tau$ (and not the entries themselves) have an Inverse-Gaussian distribution. Let $f(\sigma^2 \mid \beta, \tau, y)$ denote the full posterior conditional density of $\sigma^2$ (on $\mathbb{R}_+$) given $\beta, \tau, y$. By (8),

$$f(\sigma^2 \mid \beta, \tau, y) = \frac{\left( \frac{(y - X\beta)^T (y - X\beta)}{2} + \frac{\beta^T D_\tau^{-1} \beta}{2} + \xi \right)^{\frac{n+p}{2}+\alpha}}{\Gamma\!\left(\frac{n+p}{2}+\alpha\right)} \, (\sigma^2)^{-\left(\frac{n+p}{2}+\alpha+1\right)} \exp\left( -\frac{1}{\sigma^2} \left( \frac{(y - X\beta)^T (y - X\beta)}{2} + \frac{\beta^T D_\tau^{-1} \beta}{2} + \xi \right) \right).$$

Let $\mu$ denote Lebesgue measure on $\mathbb{R}^p \times \mathbb{R}^p_+ \times \mathbb{R}_+$. The Bayesian lasso Gibbs Markov chain has a Mtd (with respect to $\mu$) given by

$$k\big( (\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid (\beta, \tau, \sigma^2) \big) = f(\tilde\beta \mid \tilde\tau, \tilde\sigma^2, y)\, f(\tilde\tau \mid \beta, \tilde\sigma^2, y)\, f(\tilde\sigma^2 \mid \beta, \tau, y). \qquad (9)$$

It is well known, and can be verified by a straightforward calculation, that the joint posterior density of $(\beta, \tau, \sigma^2)$ is invariant for the Gibbs transition density $k$ defined above. The Mtd is strictly positive, which implies that the chain is aperiodic and $\psi$-irreducible [13, Page 87]. Moreover, the existence of an invariant probability density together with $\psi$-irreducibility implies that the chain is positive Harris recurrent (see for example [17]). Note also that $\Pi$ is equivalent to the maximal irreducibility measure. We now prove geometric ergodicity by establishing a geometric drift condition and an associated minorization condition for the Gibbs transition density $k$.

3 Proof of geometric ergodicity

3.1 Drift condition

Consider the function

$$V(\beta, \tau, \sigma^2) = (y - X\beta)^T (y - X\beta) + \beta^T D_\tau^{-1} \beta + \sum_{j=1}^p \tau_j^2.$$

Let $E_k[\,\cdot \mid \beta, \tau, \sigma^2]$ represent the expectation with respect to one step of the Markov chain with transition density $k$, starting at $(\beta, \tau, \sigma^2)$. The following proposition establishes a geometric drift condition for the transition density $k$.

Proposition 2. If $n \ge 3$, there exist constants $0 \le \rho < 1$ and $b > 0$ such that

$$E_k\big[ V(\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid \beta, \tau, \sigma^2 \big] \le \rho\, V(\beta, \tau, \sigma^2) + b, \qquad (10)$$

for every $(\beta, \tau, \sigma^2) \in \mathbb{R}^p \times \mathbb{R}^p_+ \times \mathbb{R}_+$.
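Before turning to the proof, it can be instructive to check (10) numerically at a particular starting state. The sketch below is our own sanity check, not part of the paper's argument: it reuses the hypothetical `bayesian_lasso_sweep` helper sketched in Section 1 and made-up data, and simply compares a Monte Carlo estimate of $E_k[V \mid \text{start}]$ with $V$ at the start (it does not, of course, identify $\rho$ or $b$).

```python
import numpy as np

def V(beta, tau2, X, y):
    """Drift function from Section 3.1; note it does not depend on sigma^2."""
    r = y - X @ beta
    return r @ r + np.sum(beta**2 / tau2) + np.sum(tau2)

# Illustrative check of E_k[V | state] at one (made-up) starting state.
rng = np.random.default_rng(1)
n, p, lam = 10, 4, 1.0
X, y = rng.standard_normal((n, p)), rng.standard_normal(n)
beta0, tau20 = rng.standard_normal(p), rng.exponential(1.0, p)

draws = [V(*bayesian_lasso_sweep(beta0, tau20, X, y, lam, rng=rng)[:2], X, y)
         for _ in range(20_000)]
print("estimated E_k[V | start]:", np.mean(draws), "   V at start:", V(beta0, tau20, X, y))
```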

Proof: It follows from the definition of the Markov transition density $k$ that

$$E_k\big[ V(\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid \beta, \tau, \sigma^2 \big] = E\Big[ E\big[ E\big[ V(\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid \tilde\tau, \tilde\sigma^2 \big] \,\big|\, \beta, \tilde\sigma^2 \big] \,\Big|\, \beta, \tau \Big]. \qquad (11)$$

We evaluate the three conditional expectations in (11) one step at a time. We start with the innermost conditional expectation. By (6),

$$E\big[ V(\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid \tilde\tau, \tilde\sigma^2 \big] = y^T y - 2 y^T X E[\tilde\beta \mid \tilde\tau, \tilde\sigma^2] + E\big[ \tilde\beta^T (X^T X + D_{\tilde\tau}^{-1}) \tilde\beta \mid \tilde\tau, \tilde\sigma^2 \big] + \sum_{j=1}^p \tilde\tau_j^2$$
$$= y^T y - 2 y^T X (X^T X + D_{\tilde\tau}^{-1})^{-1} X^T y + y^T X (X^T X + D_{\tilde\tau}^{-1})^{-1} X^T y + \mathrm{tr}\Big( (X^T X + D_{\tilde\tau}^{-1}) \, \tilde\sigma^2 (X^T X + D_{\tilde\tau}^{-1})^{-1} \Big) + \sum_{j=1}^p \tilde\tau_j^2$$
$$= y^T y - y^T X (X^T X + D_{\tilde\tau}^{-1})^{-1} X^T y + p \tilde\sigma^2 + \sum_{j=1}^p \tilde\tau_j^2 \;\le\; y^T y + p \tilde\sigma^2 + \sum_{j=1}^p \tilde\tau_j^2. \qquad (12)$$

For the next step, we analyze the terms related to the middle conditional expectation in (11). Note that if $Z \sim \text{Inverse-Gaussian}(\mu, \lambda)$, then $E[1/Z] = 1/\mu + 1/\lambda$. Hence, it follows from (7) and the arithmetic-geometric mean inequality (applied with the constant $1/(n+p+2\alpha)$) that

$$E[\tilde\tau_j^2 \mid \beta, \tilde\sigma^2] = \sqrt{\frac{\beta_j^2}{\lambda^2 \tilde\sigma^2}} + \frac{1}{\lambda^2} \;\le\; \frac{\beta_j^2}{2(n + p + 2\alpha)\, \tilde\sigma^2} + \frac{n + p + 2\alpha}{2\lambda^2} + \frac{1}{\lambda^2}. \qquad (13)$$

Finally, we analyze the terms related to the outermost conditional expectation in (11). By (8),

$$E[\tilde\sigma^2 \mid \beta, \tau, y] = \frac{(y - X\beta)^T (y - X\beta) + \beta^T D_\tau^{-1} \beta + 2\xi}{n + p + 2\alpha - 2}, \qquad (14)$$

and

$$E\!\left[ \frac{1}{\tilde\sigma^2} \,\Big|\, \beta, \tau, y \right] = \frac{n + p + 2\alpha}{(y - X\beta)^T (y - X\beta) + \beta^T D_\tau^{-1} \beta + 2\xi}. \qquad (15)$$

Combining (12)-(15), we get

$$E_k\big[ V(\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid \beta, \tau, \sigma^2 \big] \le y^T y + \frac{p\big[ (y - X\beta)^T (y - X\beta) + \beta^T D_\tau^{-1} \beta + 2\xi \big]}{n + p + 2\alpha - 2} + \frac{\sum_{j=1}^p \beta_j^2}{2\big[ (y - X\beta)^T (y - X\beta) + \beta^T D_\tau^{-1} \beta + 2\xi \big]} + \frac{p(n + p + 2\alpha)}{2\lambda^2} + \frac{p}{\lambda^2}$$
$$\le \frac{p}{n + p + 2\alpha - 2} \Big[ (y - X\beta)^T (y - X\beta) + \beta^T D_\tau^{-1} \beta \Big] + \frac{1}{2} \sum_{j=1}^p \tau_j^2 + y^T y + \frac{2p\xi}{n + p + 2\alpha - 2} + \frac{p(n + p + 2\alpha)}{2\lambda^2} + \frac{p}{\lambda^2}.$$

The last inequality follows from the fact that

$$\sum_{j=1}^p \beta_j^2 = \sum_{j=1}^p \frac{\beta_j^2}{\tau_j^2} \, \tau_j^2 \le \Big( \sum_{j=1}^p \frac{\beta_j^2}{\tau_j^2} \Big) \Big( \sum_{j=1}^p \tau_j^2 \Big) = \big( \beta^T D_\tau^{-1} \beta \big) \sum_{j=1}^p \tau_j^2 \le \big[ (y - X\beta)^T (y - X\beta) + \beta^T D_\tau^{-1} \beta + 2\xi \big] \sum_{j=1}^p \tau_j^2. \qquad (16)$$

It follows from (16) that

$$E_k\big[ V(\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid \beta, \tau, \sigma^2 \big] \le \rho\, V(\beta, \tau, \sigma^2) + b, \qquad (17)$$

where

$$\rho = \max\left\{ \frac{p}{n + p + 2\alpha - 2}, \; \frac{1}{2} \right\} \qquad (18)$$

and

$$b = y^T y + \frac{2p\xi}{n + p + 2\alpha - 2} + \frac{p(n + p + 2\alpha)}{2\lambda^2} + \frac{p}{\lambda^2}. \qquad (19)$$

Since $n \ge 3$ and $\alpha \ge 0$, we have $n + 2\alpha - 2 > 0$, so that $p < n + p + 2\alpha - 2$ and hence $\rho < 1$. Thus, the required geometric drift condition is established.

3.2 Minorization condition

Let us recall that $V(\beta, \tau, \sigma^2) = (y - X\beta)^T (y - X\beta) + \beta^T D_\tau^{-1} \beta + \sum_{j=1}^p \tau_j^2$. For every $d > 0$, let $B_{V,d} = \{ (\beta, \tau, \sigma^2) : V(\beta, \tau, \sigma^2) \le d \}$. The following proposition establishes a minorization condition associated with the geometric drift condition established in Proposition 2.

Proposition 3. There exist a constant $0 < \epsilon = \epsilon(V, d) \le 1$ and a probability density function $\tilde f$ on $\mathbb{R}^p \times \mathbb{R}^p_+ \times \mathbb{R}_+$ such that

$$k\big( (\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid (\beta, \tau, \sigma^2) \big) \ge \epsilon \, \tilde f(\tilde\beta, \tilde\tau, \tilde\sigma^2), \qquad (20)$$

for every $(\beta, \tau, \sigma^2) \in B_{V,d}$.

Proof: Writing out the density of $\tilde\tau_j^2$ implied by (7) (recall that it is the reciprocal of $\tilde\tau_j^2$ that is Inverse-Gaussian), we have

$$f(\tilde\tau_j^2 \mid \beta, \tilde\sigma^2, y) = \sqrt{\frac{\lambda^2}{2\pi \tilde\tau_j^2}} \, \exp\left( \frac{\lambda |\beta_j|}{\sqrt{\tilde\sigma^2}} \right) \exp\left( -\frac{\beta_j^2}{2 \tilde\sigma^2 \tilde\tau_j^2} - \frac{\lambda^2 \tilde\tau_j^2}{2} \right),

for each $j = 1, \dots, p$. If $(\beta, \tau, \sigma^2) \in B_{V,d}$, then $\beta^T D_\tau^{-1} \beta \le d$ and $\sum_{j=1}^p \tau_j^2 \le d$, so that for each $j$,

$$\beta_j^2 = \frac{\beta_j^2}{\tau_j^2} \, \tau_j^2 \le d \cdot d = d^2.$$

Hence, if $(\beta, \tau, \sigma^2) \in B_{V,d}$, then $|\beta_j| \le d$ and, since $\exp(\lambda |\beta_j| / \sqrt{\tilde\sigma^2}) \ge 1$,

$$f(\tilde\tau_j^2 \mid \beta, \tilde\sigma^2, y) \ge \sqrt{\frac{\lambda^2}{2\pi \tilde\tau_j^2}} \, \exp\left( -\frac{d^2}{2 \tilde\sigma^2 \tilde\tau_j^2} - \frac{\lambda^2 \tilde\tau_j^2}{2} \right) = e^{-\lambda d / \sqrt{\tilde\sigma^2}} \, g_1(\tilde\tau_j^2 \mid \tilde\sigma^2), \qquad (21)$$

where $g_1(\cdot \mid \tilde\sigma^2)$ is the density of the reciprocal of an Inverse-Gaussian random variable with parameters $\sqrt{\lambda^2 \tilde\sigma^2 / d^2}$ and $\lambda^2$. Taking the product over $j$ gives $f(\tilde\tau \mid \beta, \tilde\sigma^2, y) \ge e^{-p\lambda d / \sqrt{\tilde\sigma^2}} \prod_{j=1}^p g_1(\tilde\tau_j^2 \mid \tilde\sigma^2)$.

Let us now consider the full conditional density of $\tilde\sigma^2$ given $\beta, \tau, y$. Note that for $(\beta, \tau, \sigma^2) \in B_{V,d}$,

$$\frac{(y - X\beta)^T (y - X\beta)}{2} + \frac{\beta^T D_\tau^{-1} \beta}{2} + \xi \le \frac{d}{2} + \xi, \qquad (22)$$

and

$$\frac{(y - X\beta)^T (y - X\beta)}{2} + \frac{\beta^T D_\tau^{-1} \beta}{2} + \xi = \frac{y^T y - 2 y^T X \beta + \beta^T (X^T X + D_\tau^{-1}) \beta}{2} + \xi$$
$$= \frac{y^T y - y^T X (X^T X + D_\tau^{-1})^{-1} X^T y}{2} + \frac{\big( \beta - (X^T X + D_\tau^{-1})^{-1} X^T y \big)^T (X^T X + D_\tau^{-1}) \big( \beta - (X^T X + D_\tau^{-1})^{-1} X^T y \big)}{2} + \xi$$
$$\ge \frac{y^T y - y^T X (X^T X + D_\tau^{-1})^{-1} X^T y}{2} + \xi \;\ge\; \frac{y^T y - y^T X \big( X^T X + \tfrac{1}{d} I_p \big)^{-1} X^T y}{2} + \xi. \qquad (23)$$

The last inequality follows since $\tau_j^2 \le d$ for every $1 \le j \le p$. Note that $I_n - X ( X^T X + \tfrac{1}{d} I_p )^{-1} X^T$ is a positive definite matrix. Hence, for $(\beta, \tau, \sigma^2) \in B_{V,d}$,

$$y^T y - y^T X \big( X^T X + \tfrac{1}{d} I_p \big)^{-1} X^T y > 0.$$

It follows from (22) and (23), together with the inequality $1/\sqrt{\tilde\sigma^2} \le \tfrac{1}{2}(1/\tilde\sigma^2 + 1)$ used to absorb the factor $e^{-p\lambda d / \sqrt{\tilde\sigma^2}}$ from (21), that for $(\beta, \tau, \sigma^2) \in B_{V,d}$,

$$e^{-p\lambda d / \sqrt{\tilde\sigma^2}} \, f(\tilde\sigma^2 \mid \beta, \tau, y) \ge e^{-\frac{p\lambda d}{2}} \left( \frac{y^T y - y^T X \big( X^T X + \tfrac{1}{d} I_p \big)^{-1} X^T y + 2\xi}{d + 2\xi + p\lambda d} \right)^{\frac{n+p}{2}+\alpha} g_2(\tilde\sigma^2), \qquad (24)$$

where $g_2(\cdot)$ is the density of the Inverse-Gamma distribution with parameters $\frac{n+p}{2}+\alpha$ and $\frac{d}{2} + \xi + \frac{p\lambda d}{2}$. It follows from (9), (21) and (24) that, for all $(\beta, \tau, \sigma^2) \in B_{V,d}$,

$$k\big( (\tilde\beta, \tilde\tau, \tilde\sigma^2) \mid (\beta, \tau, \sigma^2) \big) = f(\tilde\beta \mid \tilde\tau, \tilde\sigma^2, y)\, f(\tilde\tau \mid \beta, \tilde\sigma^2, y)\, f(\tilde\sigma^2 \mid \beta, \tau, y) \ge \epsilon \, \tilde f(\tilde\beta, \tilde\tau, \tilde\sigma^2),$$

where

$$\epsilon = e^{-\frac{p\lambda d}{2}} \left( \frac{y^T y - y^T X \big( X^T X + \tfrac{1}{d} I_p \big)^{-1} X^T y + 2\xi}{d + 2\xi + p\lambda d} \right)^{\frac{n+p}{2}+\alpha}, \qquad (25)$$

and $\tilde f$ is a probability density on $\mathbb{R}^p \times \mathbb{R}^p_+ \times \mathbb{R}_+$ given by

$$\tilde f(\tilde\beta, \tilde\tau, \tilde\sigma^2) = f(\tilde\beta \mid \tilde\tau, \tilde\sigma^2, y) \left( \prod_{j=1}^p g_1(\tilde\tau_j^2 \mid \tilde\sigma^2) \right) g_2(\tilde\sigma^2).$$

Hence, the required minorization condition for $k$ has been established.

The drift and minorization conditions in Proposition 2 and Proposition 3 can be combined with Theorem 12 of Rosenthal [18] to establish geometric ergodicity of the Bayesian lasso Gibbs Markov chain.

Proposition 4. Let $n \ge 3$, and let $\rho$, $b$ and $\epsilon$ be as defined in (18), (19) and (25) respectively. Let $d > 2b/(1-\rho)$, and set

$$A = \frac{1 + d}{1 + 2b + \rho d} \quad \text{and} \quad U = 1 + 2(\rho d + b).$$

Then for any $(\beta, \tau, \sigma^2) \in \mathbb{R}^p \times \mathbb{R}^p_+ \times \mathbb{R}_+$, $m \in \mathbb{N}$, and $0 < r < 1$,

$$\big\| K^m\big((\beta, \tau, \sigma^2), \cdot\big) - \Pi(\cdot) \big\|_{TV} \le (1 - \epsilon)^{rm} + \big( U^r A^{-(1-r)} \big)^m \left( 1 + \frac{b}{1 - \rho} + V(\beta, \tau, \sigma^2) \right).$$

Proposition 1 is an immediate corollary of the above result.
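To see how a bound of the form in Proposition 4 would be used in practice, the sketch below evaluates the total variation bound as a function of $m$ for user-supplied values of $\rho$, $b$, $\epsilon$, $d$ and $V$ at the starting state. The numerical values in the example call are purely illustrative (not taken from the paper); in an application they would come from (18), (19) and (25). Such bounds are typically conservative, and small values of $r$ often work best.

```python
import numpy as np

def rosenthal_tv_bound(m, rho, b, eps, d, V0, r=0.01):
    """Evaluate the Proposition 4 style bound on ||K^m(x, .) - Pi||_TV."""
    assert 0 < rho < 1 and d > 2 * b / (1 - rho), "need rho < 1 and d > 2b/(1-rho)"
    A = (1 + d) / (1 + 2 * b + rho * d)
    U = 1 + 2 * (rho * d + b)
    geometric_factor = U**r * A**(-(1 - r))
    return (1 - eps)**(r * m) + geometric_factor**m * (1 + b / (1 - rho) + V0)

# Illustrative (made-up) constants.
for m in (1_000, 10_000, 50_000):
    print(m, rosenthal_tv_bound(m, rho=0.6, b=5.0, eps=0.05, d=30.0, V0=50.0))
```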

4 Discussion

The Bayesian lasso is a popular and widely used algorithm for sparse Bayesian estimation in linear regression. In this paper, we have established geometric ergodicity of the Markov chain corresponding to the Bayesian lasso algorithm. As discussed previously, this proves the existence of the Markov chain CLT in this context, thereby allowing the user to obtain valid asymptotic standard errors for Markov chain based estimates of posterior quantities. The Laplace prior for $\beta$ used in the Bayesian lasso algorithm is an example of the so-called continuous shrinkage priors for sparse Bayesian estimation. In recent years, many other continuous shrinkage priors have been introduced and studied in the literature (see [4, 5, 6, 10, 15] and the references therein), and sampling from the resulting posterior distribution is often achieved by using MCMC. To the best of our knowledge, geometric ergodicity of the corresponding Markov chains has not been investigated. Using the insights obtained from the analysis in this paper, we are currently investigating geometric ergodicity for Markov chains corresponding to some of the other continuous shrinkage priors proposed in the literature.

Appendix

We point out the errors in the proof of geometric ergodicity of the Bayesian lasso Gibbs Markov chain in Kyung et al. [12] (referred to henceforth as KGGC). The authors in KGGC only consider the case $n \ge p$. The authors assert that the Bayesian lasso model can be obtained by appropriately marginalizing a hierarchical random effects model. They then prove the geometric ergodicity of the Markov chain corresponding to this hierarchical random effects model, and argue that this implies geometric ergodicity of the Bayesian lasso model. We start by reproducing the hierarchical random effects model from KGGC (Page 46, appendix) below:

$$y \mid \eta, \sigma_e^2 \sim N_p(\eta, \sigma_e^2 I) \qquad (26)$$
$$\eta \mid \mu, \tau \sim N(\mu 1, \Sigma(\tau)) \qquad (27)$$
$$\mu \sim N(\mu_0, \sigma_\mu^2) \qquad (28)$$
$$\sigma_e^{-2} \sim \text{Gamma}(a, b) \qquad (29)$$
$$\tau_j^{-2} \overset{\text{i.i.d.}}{\sim} \text{Gamma}(a, b), \qquad (30)$$

where $\Sigma(\tau)$ is of the form

$$\Sigma(\tau) = \mathrm{diag}(\tau_1^2, \tau_2^2, \dots, \tau_p^2), \qquad \pi(\tau_1^2, \dots, \tau_p^2) = \prod_{i=1}^p \frac{\lambda^2}{2} \, e^{-\lambda^2 \tau_i^2 / 2}. \qquad (31)$$

We refer to this hierarchical random effects model as the RE model.

There is a contradiction between (30) and (31) regarding the prior distribution of $\tau^2$. In (30), $\tau_j^2$ is assumed to be Inverse-Gamma, whereas in (31), $\tau_j^2$ is assumed to be Exponential. The authors proceed to prove geometric ergodicity of the RE model under the Inverse-Gamma assumption in (30). Since the Bayesian lasso model assumes the prior for $\tau_j^2$ to be Exponential, it clearly cannot be obtained by marginalizing the RE model with the Inverse-Gamma assumption in (30). Hence, the proof of geometric ergodicity for the RE model with the Inverse-Gamma assumption has no bearing on the geometric ergodicity of the Bayesian lasso model.

Further, $y$ is assumed to be $p$-dimensional in (26). In the Bayesian lasso model, the data vector $y$ is $n$-dimensional. Since the authors are considering the case $n > p$, this clearly rules out the possibility of marginalizing the RE model to get the Bayesian lasso model. If this is a typo, and $y$ is $n$-dimensional, then $\eta$ is $n$-dimensional. Hence, $\Sigma(\tau)$ is an $n \times n$ matrix. However, $\Sigma(\tau)$ is assumed to be a $p \times p$ matrix in (31).

There is another error in the marginalizing argument provided in KGGC. On Page 38 (Section 4) the authors consider the model

$$y \mid \eta, \sigma^2 \sim N_n(\eta, \sigma^2 I_n) \quad \text{and} \quad \eta \sim N_n(\mu 1_n, \Sigma), \qquad (32)$$

where

$$\Sigma = \sigma_\mu^2 J_n + X D_\tau X^T + \hat X \hat X^T,$$

and $\eta = \mu 1_n + X\beta + \hat X \beta^*$ (with $1_n$, $X$, $\hat X$ mutually orthogonal). Here $1_n$ is the $n$-dimensional vector of all ones, and $J_n$ is the $n \times n$ matrix of all ones. They claim that marginalizing over $\eta$ in (32) gives the marginal model

$$y \mid \beta, \sigma^2 \sim N_n(\mu 1_n + X\beta, \sigma^2 I_n) \quad \text{and} \quad \beta \sim N_p(0_p, D_\tau). \qquad (33)$$

Firstly, there is a small issue in terms of $\mu$. Note that, by the mutual orthogonality of $1_n$, $X$, $\hat X$,

$$\frac{1_n^T \eta}{n} = \frac{1_n^T (\mu 1_n + X\beta + \hat X \beta^*)}{n} = \mu.$$

Since $\eta \sim N_n(\mu 1_n, \Sigma)$, this implies

$$\mu \sim N\!\left( \mu, \; \frac{1_n^T \Sigma 1_n}{n^2} \right),$$

which is paradoxical, as it implies that $\mu$ is a random variable with non-zero variance ($\Sigma$ is a full rank matrix), and also implies that $\mu$ is a constant. Let us get rid of this discrepancy by assuming $\mu = 0$ and

$$\Sigma = X D_\tau X^T + \hat X \hat X^T,$$

and choosing the transformation $\eta \to X\beta + \hat X \beta^*$, where $\beta^*$ is now $(n-p)$-dimensional and $X^T \hat X = 0$. Let us assume without loss of generality that $\hat X^T \hat X = I_{n-p}$ (the choice of $\hat X$ is completely in our hands anyway). Still, the marginal model after integrating out $\beta^*$ is not the same as (33). An immediate way to check this is that $V(y) = \sigma^2 I_n + \Sigma$ under (32), while $V(y) = \sigma^2 I_n + X D_\tau X^T$ under (33). We now derive the correct marginal model. Note that the joint density of $(y, \eta)$ is given by

$$f(y, \eta) = \frac{1}{(2\pi\sigma^2)^{n/2}} \, e^{-\frac{(y - \eta)^T (y - \eta)}{2\sigma^2}} \cdot \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \, e^{-\frac{\eta^T \Sigma^{-1} \eta}{2}}.$$

By making the linear transformation $(y, \eta) \to (y, \beta, \beta^*)$, we get that

$$f(y, \beta, \beta^*) = \text{const} \cdot e^{-\frac{1}{2} (X\beta + \hat X \beta^*)^T \left( X (X^T X)^{-1} D_\tau^{-1} (X^T X)^{-1} X^T + \hat X \hat X^T \right) (X\beta + \hat X \beta^*)} \, e^{-\frac{(y - X\beta - \hat X \beta^*)^T (y - X\beta - \hat X \beta^*)}{2\sigma^2}}$$
$$= \text{const} \cdot e^{-\frac{1}{2} \left( \beta^T D_\tau^{-1} \beta + \beta^{*T} \beta^* \right)} \, e^{-\frac{(y - X\beta - \hat X \beta^*)^T (y - X\beta - \hat X \beta^*)}{2\sigma^2}}.$$

After completing the square, and integrating out $\beta^*$, the marginal density of $(y, \beta)$ is obtained as

$$f(y, \beta) = \text{const} \cdot e^{\frac{y^T \hat X \hat X^T y}{2\sigma^2(\sigma^2 + 1)}} \, e^{-\frac{1}{2} \left( \frac{(y - X\beta)^T (y - X\beta)}{\sigma^2} + \beta^T D_\tau^{-1} \beta \right)}.$$

By completing the square again, and using $\hat X^T X = 0$, it follows that the marginal model is given by

$$y \mid \beta, \sigma^2 \sim N_n\big( X\beta, \; \sigma^2 I_n + \hat X \hat X^T \big) \quad \text{and} \quad \beta \sim N_p(0_p, D_\tau).$$

This is different from the Bayesian lasso model, as the conditional variance of $y$ given $\beta, \sigma^2$ is $\sigma^2 I_n + \hat X \hat X^T$ (as opposed to $\sigma^2 I_n$ in the Bayesian lasso model).

Acknowledgment. The first author was supported by NSF Grant DMS--684, and the second author by an NSF grant.

References

[1] Asmussen, S. and Glynn, P. W. (2011). A new proof of convergence of MCMC via the ergodic theorem. Statistics & Probability Letters 81.

[2] Bednorz, W. and Latuszynski, K. (2007). A few remarks on "Fixed-width output analysis for Markov chain Monte Carlo" by Jones et al. Journal of the American Statistical Association 102.

[3] Bae, K. and Mallick, B. K. (2004). Gene selection using a two-level hierarchical Bayesian model. Bioinformatics 20.

[4] Bhattacharya, A., Pati, D., Pillai, N. S. and Dunson, D. B. (2012). Bayesian shrinkage. arXiv preprint.

[5] Carvalho, C. M., Polson, N. G. and Scott, J. G. (2009). Handling sparsity via the horseshoe. Journal of Machine Learning Research W&CP 5.

[6] Carvalho, C. M., Polson, N. G. and Scott, J. G. (2010). The horseshoe estimator for sparse signals. Biometrika 97, 465-480.

[7] Figueiredo, M. A. T. (2003). Adaptive sparseness for supervised learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 25.

[8] Flegal, J. M., Haran, M. and Jones, G. L. (2008). Markov chain Monte Carlo: Can we trust the third significant figure? Statistical Science 23, 250-260.

[9] Flegal, J. M. and Jones, G. L. (2010). Batch means and spectral variance estimators in Markov chain Monte Carlo. Annals of Statistics 38, 1034-1070.

[10] Griffin, J. E. and Brown, P. J. (2010). Inference with normal-gamma prior distributions in regression problems. Bayesian Analysis 5, 171-188.

[11] Jones, G. L., Haran, M., Caffo, B. S. and Neath, R. (2006). Fixed-width output analysis for Markov chain Monte Carlo. Journal of the American Statistical Association 101.

[12] Kyung, M., Gill, J., Ghosh, M. and Casella, G. (2010). Penalized regression, standard errors and Bayesian lassos. Bayesian Analysis 5.

[13] Meyn, S. P. and Tweedie, R. L. (1993). Markov Chains and Stochastic Stability. Springer-Verlag, London.

[14] Park, T. and Casella, G. (2008). The Bayesian lasso. Journal of the American Statistical Association 103, 681-686.

[15] Polson, N. G. and Scott, J. G. (2010). Shrink globally, act locally: Sparse Bayesian regularization and prediction. In Bayesian Statistics 9 (J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith and M. West, eds.). Oxford University Press, New York.

[16] Roberts, G. O. and Rosenthal, J. S. (1998). Markov chain Monte Carlo: Some practical implications of theoretical results (with discussion). Canadian Journal of Statistics 26, 5-31.

[17] Roberts, G. O. and Rosenthal, J. S. (2004). General state space Markov chains and MCMC algorithms. Probability Surveys 1, 20-71.

[18] Rosenthal, J. S. (1995). Minorization conditions and convergence rates for Markov chain Monte Carlo. Journal of the American Statistical Association 90, 558-566.

[19] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58, 267-288.

[20] Yuan, M. and Lin, Y. (2005). Efficient empirical Bayes variable selection and estimation in linear models. Journal of the American Statistical Association 100, 1215-1225.
