arxiv: v1 [math.st] 22 Feb 2018

Size: px
Start display at page:

Download "arxiv: v1 [math.st] 22 Feb 2018"

Transcription

1 Bayesian Lasso : Concentration and MCMC Diagnosis arxiv:18.857v1 [math.st] Feb 18 D.Ounaissi 1, N.Rahmania 1 Laboratoire LAMA, UFR de Mathématiques, Université de Paris 1, France. Laboratoire Paul Painlevé, UFR de Mathématiques, Université de Lille, France. Abstract Using posterior distribution of Bayesian LASSO we construct a semi-norm on the parameter space. We show that the partition function depends on the ratio of the l 1 and l norms and present three regimes. We derive the concentration of Bayesian LASSO, and present MCMC convergence diagnosis. Keywords: LASSO, Bayes, MCMC, log-concave, geometry, incomplete Gamma function 1. Introduction Let p n be two positive integers, y R n and A be an n p matrix with real numbers entries. Bayesian LASSO c(x) = 1 ( Z exp Ax ) y x 1 (1) is a typically posterior distribution used in the linear regression Here ( Z = exp R p y = Ax + w. Ax y x 1 )dx () is the partition function, and 1 are respectively the Euclidean and the l 1 norms. The vector y R n are the observations, x R p is the addresses: daoud.ounaissi@u-pec.fr (D.Ounaissi), nadji.rahmania@univ-lille1.fr (N.Rahmania) 1 Corresponding Author Preprint submitted to Elsevier November 8, 18

2 unknown signal to recover, w R n is the standard Gaussian noise, and A is a known matrix which maps the signal domain R p into the observation domain R n. If we suppose that x is drawn from Laplace distribution i.e. the distribution proportional to exp( x 1 ), (3) then the posterior of x known y is drawn from the distribution c (1). The mode { Ax y } arg min + x 1 : x R p (4) of c was first introduced in [18] and called LASSO. It is also called Basis Pursuit De-Noising method [4]. In our work we select the term LASSO and keep it for the rest of the article. In general LASSO is not a singleton, i.e. the mode of the distribution c is not unique. In this case LASSO is a set and we will denote by lasso any element of this set. A large number of theoretical results has been provided for LASSO. See [5], [6], [9], [1], [15] and the references herein. The most popular algorithms to find LASSO are LARS algorithm [8], ISTA and FISTA algorithms see e.g. [] and the review article [14]. The aim of this work is to study geometry of bayesian LASSO and to derive MCMC convergence diagnosis.. Polar integration Using polar coordinates s = x S, r = x, the partition function () x Z p = Z p (s)ds, (5) S where denotes one of l or l 1 norms in R p, ds denotes the surface measure on the unit sphere S of the norm, and Z p (s) = exp{ f(rs)}r p 1 dr, (6) where f(x) := Ax y + x 1, x R p. We express the partition function (6) using the parabolic cylinder function. We also give an inequality of concentration and a geometric interpretation of the partition function Z p.

3 3. Parabolic cylinder function and partition function We extend the function s S Z p (s) x R p Z p (x) = exp{ f(rx)}r p 1 dr. (7) This extension is homogeneous of order p. If Ax =, then f(rx) = y + r x 1, and more if x, then Z p (x) = (p 1)! x p 1 exp( y ). If Ax, then we will express Z p (x) using the parabolic cylinder function. We recall that for b R, z the parabolic cylinder function is given by D b (z) = z b exp( z 4 )[1 + O(z )], (8) when z + [16]. We also recall the integral representation of Erdlyi [7] for the parabolic cylinder function exp( z 4 )Γ(ν)D ν(z) = exp( 1 r zr)r ν 1 dr, ν >, where Γ(ν) = exp( t)t ν 1 dt is the Γ function. Proposition 3.1. The variable ω lasso := x 1 Ax, y Ax (9) will play an important role. It depends only on s = x x 1 S p 1,1 and the function s S p 1,1 ω lasso (s) is bounded below by λ 1, := min{ 1 As,y As : s S p 1,1 }. Now we can announce the following result. Proposition 3.. We have for Ax Z p (x) = (p 1)! exp( y ) Ax p exp( ω lasso 4 )D p(ω lasso ). If Ax, then ω lasso + and Z p (x) = (p 1)! exp( y ) x p 1 [1 + O(ω lasso )]. 3 As

4 Corollary 3.3. If y = then ω lasso = 1 As is bounded below by 1 λ 1,, where λ 1, = max( As : s S p 1,1 ) is the norm of the operator A : (R p, 1 ) (R n, ). The partition fucntion Z p (s) = (p 1)!ω p lasso exp(ω lasso 4 )D p(ω lasso ) is As convex and decreasing. Proof 3.4. It suffices to remark that Z p (s) = exp{ As r r}r p 1 dr. 4. Geometric interpretation of the partition function First we represent f(rx) for Ax in the form f(rx) = y ω lasso + (r Ax + ω lasso ). (1) The function exp{ f(x)}, x R p is log-concav and integrable in R p. Observe that Z 1 p p [1] tells us that is a norm on the null space Ker(A) of A. A general result x R p Z 1 p p (x) := x c is a quasi-norm on R p. The unit ball of c is defined by B(A, y) := {x R p : x c 1} = {x R p : Z p (x) 1} = {x = rs R p : Z p (s) r p } = {x = rs R p : (p 1)! exp( y ) As p exp( ω lasso 4 )D p(ω lasso ) r p }. Its contour is equal to C(A, y) := {x R p : x c = 1} = {x R p : Z p (x) = 1} = {x = rs R p : Z p (s) = r p } = {x = rs R p : (p 1)! exp( y ) As p exp( ω lasso 4 )D p(ω lasso ) = r p }. We summarize our results in the following proposition. 4

5 Proposition ) For each s S p 1,1, the longest segment [, r]s contained in B(A, y) holds for r = r max (s) is solution of the equation r p = (p 1)! exp( y ) As p exp( ω lasso 4 )D p(ω lasso ). ) The ball B(A, y) = s S p 1,1 [, r max (s)]s, and its contour is equal to 3) The volume B(A, y) is C(A, y) = {r max (s)s : s S p 1,1 }. rmax(s) p S p 1,1 p ds = Z p p. 5. Necessary and sufficient condition to have lasso equal zero Now we can give the necessary and sufficient condition to have lasso = {} Proposition 5.1. The following assertions are equivalent. 1) = lasso. ) ω lasso (s) pour tout s S p 1,1. 3) A y Concentration around the lasso 6.1. The case lasso null The polar coordinate formula tells us that, we can draw a vector x from c(x)dx by drawing its angle s uniformly, and then simulate its distance r to the origin from c(r, s)dr = 1 Z p (s) exp{ f(rs)}rp 1 dr (11) 5

6 Now let s estimate for r > the probability P( x > r) = c(r, s)dr ds S, S r where S denotes the Lebesgue measure of S. We introduce for each pair a, b R p the function In the following g a,b,p (r) := g a,b (r) (p 1) ln(r), r >. (1) a = As, b = ω lasso. The function r g a,b (r) is increasing (because b := ω lasso ). The function r g a,b,p (r) f est convexe et atteint son minimum au point r(s) solution de l quation As (r As + ω lasso ) = p 1. r The positive root is given by On one hand r(s) = ω lasso + ωlasso + 4(p 1) (13) As exp{ g a,b,p (r)}dr exp{ g a,b (r(s))} r(s) r p 1 dr = exp{ g a,b,p (r(s))} r(s) p. On the other hand by using the convexity of r g a,b (r), we have for all r >, g a,b (r) g a,b (r(s)) + (p 1)(r r(s)), r(s) because r g a,b (r(s)) = p 1. We deduce for q >, r(s) qr(s) exp{ g a,b,p (r)}dr exp{ g a,b (r) + p 1} exp{ g a,b (r(s)) + p 1} qr(s) q(p 1) exp{ p 1 r(s) r}rp 1 dr exp( r)r p 1 r(s) dr (p 1) p exp{ g a,b (r(s)) + p 1} r(s) Γ(p, q(p 1)), (p 1) p 6

7 where Γ(ν, x) = exp( t)t ν 1 dt is the upper incomplete gamma function. x Finally we get the following result. Proposition 6.1. We have for all q >, P( x qr(s)) p exp(p 1) (p 1) p Γ(p, q(p 1)) := P (q, p). (14) Using the following estimate [? ] x a 1 exp( x) < Γ(a, x) < Bx a 1 exp( x), a > 1, B > 1, x > B (a 1), B 1 we get for q > 1, Therefore the quantity Γ(p, q(p 1)) q p 1 (p 1) p 1 exp( q(p 1)). P (q, p) pqp 1 exp{ (q 1)(p 1)}. (p 1) x balance sheet. If x is drawn from the density c, alors B r(θ) (, q) with a probability at least equal to 1 P (q, p). In the figure(1) we plot for p =, n = 1, A = ( 1 1 ) and y = the density c(r, s)dr for a fixed value of s. We notice that the mode c(r, s) =.6 is very close to the value of r( s) =.69 (13) for the same fixed s. 6.. The general case We take the vector l lasso. We will study the concentration of c around l. The variable of interest is u = x l. The law of u has for density c(u + l)du = 1 Z p exp{ f(u + l, A, y)}du. The change of variables formula gives for each norm c(u + l)du = 1 Z p exp{ f(rθ + l)}r p 1 drdθ, r >, θ S. 7

8 Figure 1: pour r [.1; 1] c(r, s). By definition for any vector x, the convex function r f(rs+l) reaches its minimum at the point r =. Therefore r f(rs + l) is increasing. The function f(rs + l, p) := f(rs + l) (p 1) ln(r), r >, (15) is strictly convex. Its critical point r l (s) is solution of the equation r f(rs + l) = p 1. r By a similar proof to that of propostion (6.1) we have the following result; Proposition 6.. If x is drawn from the density c, and s = x l, then for x l all q >, P( x l qr l (s)) 7. Applications p exp(p 1) (p 1) p Γ(p, q(p 1)) := P (q, p). (16) 7.1. The contour in the case p =, n = 1 Let A := (a 1, a ) a matrix of order 1. Its null-space Ker(A) = {(x 1, x ) : a 1 x 1 + a x = }. We have that B(a 1, a, y) contains Ker(A) B,1. 8

9 This intersection is a symmetric segment noted [(x 1 (a 1, a ), x (a 1, a )), (x 1 (a 1, a ), x (a 1, a ))]. To determine the other points of the set B(a 1, a, y), we will directly calculate Z (s). A simple calculation gives et As Z (s) = exp( y + ω lasso ) exp{ ( As r + ω lasso) Finally we have the following proposition. Proposition ) If As, then exp{ ( As r + ω lasso) }rdr, }rdr + ω lasso exp{ ( As r + ω lasso) }dr = 1. Z (s) = exp( y + ω lasso ) As 1 {1 ω lasso π(1 F (ωlasso ))}, Aω where F is the distibution function of the normal law. ) If As and y =, then Z (s) = ω lasso exp( ω lasso ){1 ω lasso π(1 F (ωlasso ))}. 3) Ifi s S 1,1, As and y =, then the function z z (b ) = 1 b exp( 1 b ){1 1 b π(1 F ( 1 b ))} defined on (, λ 1,] is convex and decreasing, where λ 1, = max s S1,1 As. 4) We have for s S 1,1 the ball Z (s) = z ( 1 ω lasso ), ω S 1,1. B(A, ) = {rs : Z (s) r }. is contained in the unit disk x 1 1 for the norm l 1. The contour is defined by the equation Z (s) = r. 9

10 The norm of the linear operator A : (R, 1 ) (R, ) is defined by λ 1, = sup As. s: s 1 =1 the function s Z (s) = z (λ 1,) is constant on est constante sur If A = (1, 1) then Ω 1, = {s : s 1 = 1, As = λ 1, }. Ω 1, = {s : s 1 = 1, As = λ 1, } If A = (a 1, a ) with a 1 < a, then In both case = [(1, ), (, 1)] [( 1, ), (, 1)]. Ω 1, = {s : s 1 = 1, As = λ 1, } = {(, sgn(a )), (, sgn(a ))}. {z (λ 1,)} 1 Ω1, is part of the contour. The other points of the contour are deduced from the equation z (b ) = a, b (, λ 1, ). Each pair (a, b) generate four points of B((a 1, a ), ) of the form as where s 1 + s = 1, a 1 s 1 + a s = b. We plot in the figure the contour of B(a 1, a, ) for different choices of the matrix (a 1, a ). We notice that the surface of B((a 1, a ), ) is decreasing function relatively the norm λ 1, of the matrix A. Remark 7.. The numerics show thatz(ω lasso ) exploses for the large values of ω lasso, it means that ω is closes to the null-space of A. to eleminate that explosion we need to estimate the tail of the gaussian density. Using the Gordon estimation [1] exp( x ) x + 1 π(1 F (x)) x we have the following approximation exp( x x ), x >, 1 b 1 b z (b ) 1 b 1, near to. (17) 1 + b3 1

11 Figure : Contours of B(a 1, a, ) for different matrices A, p = and n = MCMC diagnosis Here we take p = 7, n = 4, A B(± 1 n ) and for simplicity we consider y =. We sample from the distribution c (1) using Hastings-Metropolis algorithm (x (t) ) and propose the test x (t) qr(θ (t) ) as a criterion for the convergence. Here θ (t) := x(t) x (t). We recall that if x is drawn from the target distribution c, then x qr(θ) with the probability at least equal to P (q, p). Table gives the values of the probability P (q, p). Note that for q.5 the criterion x (t) qr(θ (t) ) is satisfied with a large probability. q P (q, p) Table 1: Values of the probability P (q, p) for p = Independent sampler (IS) The proposal distribution Q(x, x 1 ) = p(x ) = 1 p exp( x 1 ), x 1, x. 11

12 The ratio c(x) p(x) p Z, x. It s known that MCMC (x (t) ) with the target distribution c and the proposal distribution p is uniformly ergodic [13]: sup P(x (t) A x () ) c(x)dx (1 Z A B(R p ) p )t. Here Z.14 and then (1 Z p ) =.987. Figure 4(a) shows respectively the plot of t 5r(θ (t) ) and t x (t). 8.. Random-walk (RW) Metropolis algorithm We do not know if the target distribution c satisfies the curvature condition in [17] Section 6. Here we propose to analyse the convergence of the Random walk Metropolis algorithm (x (t) ) using the criterion x (t) qr(θ (t) ). Figure 4(b) shows respectively the plot of t 5r(θ (t) ) and t x (t). Figures 4 show that contrary to independent sampler algorithm, the random walk (RW) algorithm satisfies early the criterion x (t) 5r(θ). More precisely A 1) the independent sampler (IS) algorithm begins to satisfy the criterion x (t) 5r(θ (t) ) at t = iteration. ) The RW algorithm begins to satisfy the criterion x (t) 3.5r(θ (t) ) at t = iteration, but the IS algorithm never satisfies the criterion x (t) 3.5r(θ (t) ). We finally compare IS and RW algorithms using the fact that xc(x)dx = R p. The best algorithm will furnish the best approximation of the integral xc(x)dx. Table 3 gives the estimators 1 N R p N t=1 x(t) IS xc(x)dx and R p 1 N N t=1 x(t) RW xc(x)dx. It follows that 1 N R p N t=1 x(t) IS =.187 and 1 N N t=1 x(t) RW =.41. We conclude that the random walk algorithm wins for both criteria against independent sampler algorithm. 1

13 x 1 x x 3 x 4 x 5 x 6 x 7 ˆx IS ˆx RW Table : IS and RW estimators using N = 1 6 iterations. Figure 3: (a): Test of convergence of MCMC algorithm with proposal distribution p(x ). (b): Test of convergence of MCMC algorithm with N (,.5I p ) proposal distribution. N = 1 6 iterations. 9. Conclusion We studied the geometry of bayesian LASSO using polar coordinates and calculated the partition function. We obtained a concentration inequality and derived MCMC convergence diagnosis for the convergence of Hasting Metropolis algorithm. We showed that the random walk MCMC with the variance.5 wins again the independent sampler with Laplace proposal distribution. 13

14 References [1] K. Ball, Logarithmically concave functions and sections of convex sets in R n, Stud. Math. T. LXXXVIII (1988) [] A. Beck, M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problem, SIAM J. Imag. Sci. (9) [3] E.T. Copson, Asymptotic Expansions, Cambridge University Press, [4] S. Chen, D. L. Donoho, M. Saunders, Atomic decomposition by basis pursuit, SIAM J. Sci. Comput. (1) (1998) [5] I. Daubechies, M. Defrise, C. De Mol, An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Comm. Pure Appl. Math. 57 (11) (4) [6] A. Dermoune, D. Ounaissi, N. Rahmania, Oscillation of Metropolis- Hastings and simulated annealing algorithms around LASSO estimator, Math. Comput. Simulation (15). [7] A. Erdlyi, Higher transcendental functions, California institute of technology, bateman manuscript project, vol. p. 119 (1953). [8] B. Efron, T. Hastie, I. Johnstone, R. Tibshirani, Least angle regression, Ann. Statist. 37 (4) [9] G. Fort, S. Le Corff, E. Moulines, A. Schreck, A shrinkage-thresholding Metropolis adjusted Langevin algorithm for bayesian variable selection, arxiv: v3 [math.st] (15). [1] R.D. Gordon, Values of Mills ratio of area to bounding ordinate of the normal probability integral for large values of the argument, Annals of Mathematical Statistics, vol (1941). [11] B. Klartag, V.D. Milman, Geometry of log-concave functions and measures, Geom. Dedicata 11 (1) (5) [1] M. Mendoza, A. Allegra, T.P. Coleman, Bayesian Lasso Posterior Sampling Via Parallelized Measure Transport, arxiv:181.16v1 [stat.co] (18). 14

15 [13] K. Mengersen, R.L Tweedie, Rates of convergence of the Hastings and Metropolis algorithms, Ann. Statist. 4 (1994) [14] N. Parikh, S. Boyed, Proximal algorithms, Found. Trends Optim. 1 (3) (3) [15] M. Pereyra, Proximal Markov chain Monte Carlo algorithms, Stat. Comput. (15) [16] N.M. Temme, Numerical and asymptotic aspects of parabolic cylinder functions, ournal of Computational and Applied Mathematics (). [17] G.O. Roberts, A.L. Tweedie, Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms, Biometrika 83 (1) (1996) [18] R. Tibshirani, Regression shrinkage and selection via Lasso, J. R. Stat. Soc. Ser. B Stat. Methodol. 58 (1) (1996) [19] R. Tibshirani, The Lasso problem and uniqueness, Electron. J. Stat. 7 (13)

arxiv: v1 [math.st] 4 Dec 2015

arxiv: v1 [math.st] 4 Dec 2015 MCMC convergence diagnosis using geometry of Bayesian LASSO A. Dermoune, D.Ounaissi, N.Rahmania Abstract arxiv:151.01366v1 [math.st] 4 Dec 015 Using posterior distribution of Bayesian LASSO we construct

More information

Optimization methods

Optimization methods Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

More information

Geometric ergodicity of the Bayesian lasso

Geometric ergodicity of the Bayesian lasso Geometric ergodicity of the Bayesian lasso Kshiti Khare and James P. Hobert Department of Statistics University of Florida June 3 Abstract Consider the standard linear model y = X +, where the components

More information

ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS

ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS The Annals of Applied Probability 1998, Vol. 8, No. 4, 1291 1302 ON CONVERGENCE RATES OF GIBBS SAMPLERS FOR UNIFORM DISTRIBUTIONS By Gareth O. Roberts 1 and Jeffrey S. Rosenthal 2 University of Cambridge

More information

Optimization methods

Optimization methods Optimization methods Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda /8/016 Introduction Aim: Overview of optimization methods that Tend to

More information

University of Toronto Department of Statistics

University of Toronto Department of Statistics Norm Comparisons for Data Augmentation by James P. Hobert Department of Statistics University of Florida and Jeffrey S. Rosenthal Department of Statistics University of Toronto Technical Report No. 0704

More information

Quantitative Non-Geometric Convergence Bounds for Independence Samplers

Quantitative Non-Geometric Convergence Bounds for Independence Samplers Quantitative Non-Geometric Convergence Bounds for Independence Samplers by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September 28; revised July 29.) 1. Introduction. Markov chain Monte Carlo (MCMC)

More information

Sparsity Regularization

Sparsity Regularization Sparsity Regularization Bangti Jin Course Inverse Problems & Imaging 1 / 41 Outline 1 Motivation: sparsity? 2 Mathematical preliminaries 3 l 1 solvers 2 / 41 problem setup finite-dimensional formulation

More information

Lecture 8: The Metropolis-Hastings Algorithm

Lecture 8: The Metropolis-Hastings Algorithm 30.10.2008 What we have seen last time: Gibbs sampler Key idea: Generate a Markov chain by updating the component of (X 1,..., X p ) in turn by drawing from the full conditionals: X (t) j Two drawbacks:

More information

Monte Carlo methods for sampling-based Stochastic Optimization

Monte Carlo methods for sampling-based Stochastic Optimization Monte Carlo methods for sampling-based Stochastic Optimization Gersende FORT LTCI CNRS & Telecom ParisTech Paris, France Joint works with B. Jourdain, T. Lelièvre, G. Stoltz from ENPC and E. Kuhn from

More information

Perturbed Proximal Gradient Algorithm

Perturbed Proximal Gradient Algorithm Perturbed Proximal Gradient Algorithm Gersende FORT LTCI, CNRS, Telecom ParisTech Université Paris-Saclay, 75013, Paris, France Large-scale inverse problems and optimization Applications to image processing

More information

Inverse Brascamp-Lieb inequalities along the Heat equation

Inverse Brascamp-Lieb inequalities along the Heat equation Inverse Brascamp-Lieb inequalities along the Heat equation Franck Barthe and Dario Cordero-Erausquin October 8, 003 Abstract Adapting Borell s proof of Ehrhard s inequality for general sets, we provide

More information

Computational statistics

Computational statistics Computational statistics Markov Chain Monte Carlo methods Thierry Denœux March 2017 Thierry Denœux Computational statistics March 2017 1 / 71 Contents of this chapter When a target density f can be evaluated

More information

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Sparse Recovery using L1 minimization - algorithms Yuejie Chi Department of Electrical and Computer Engineering Spring

More information

Winter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo

Winter 2019 Math 106 Topics in Applied Mathematics. Lecture 9: Markov Chain Monte Carlo Winter 2019 Math 106 Topics in Applied Mathematics Data-driven Uncertainty Quantification Yoonsang Lee (yoonsang.lee@dartmouth.edu) Lecture 9: Markov Chain Monte Carlo 9.1 Markov Chain A Markov Chain Monte

More information

The Metropolis-Hastings Algorithm. June 8, 2012

The Metropolis-Hastings Algorithm. June 8, 2012 The Metropolis-Hastings Algorithm June 8, 22 The Plan. Understand what a simulated distribution is 2. Understand why the Metropolis-Hastings algorithm works 3. Learn how to apply the Metropolis-Hastings

More information

arxiv: v2 [math.st] 12 Feb 2008

arxiv: v2 [math.st] 12 Feb 2008 arxiv:080.460v2 [math.st] 2 Feb 2008 Electronic Journal of Statistics Vol. 2 2008 90 02 ISSN: 935-7524 DOI: 0.24/08-EJS77 Sup-norm convergence rate and sign concentration property of Lasso and Dantzig

More information

Introduction to Machine Learning CMU-10701

Introduction to Machine Learning CMU-10701 Introduction to Machine Learning CMU-10701 Markov Chain Monte Carlo Methods Barnabás Póczos & Aarti Singh Contents Markov Chain Monte Carlo Methods Goal & Motivation Sampling Rejection Importance Markov

More information

A Geometric Interpretation of the Metropolis Hastings Algorithm

A Geometric Interpretation of the Metropolis Hastings Algorithm Statistical Science 2, Vol. 6, No., 5 9 A Geometric Interpretation of the Metropolis Hastings Algorithm Louis J. Billera and Persi Diaconis Abstract. The Metropolis Hastings algorithm transforms a given

More information

Advances and Applications in Perfect Sampling

Advances and Applications in Perfect Sampling and Applications in Perfect Sampling Ph.D. Dissertation Defense Ulrike Schneider advisor: Jem Corcoran May 8, 2003 Department of Applied Mathematics University of Colorado Outline Introduction (1) MCMC

More information

The simple slice sampler is a specialised type of MCMC auxiliary variable method (Swendsen and Wang, 1987; Edwards and Sokal, 1988; Besag and Green, 1

The simple slice sampler is a specialised type of MCMC auxiliary variable method (Swendsen and Wang, 1987; Edwards and Sokal, 1988; Besag and Green, 1 Recent progress on computable bounds and the simple slice sampler by Gareth O. Roberts* and Jerey S. Rosenthal** (May, 1999.) This paper discusses general quantitative bounds on the convergence rates of

More information

6 Markov Chain Monte Carlo (MCMC)

6 Markov Chain Monte Carlo (MCMC) 6 Markov Chain Monte Carlo (MCMC) The underlying idea in MCMC is to replace the iid samples of basic MC methods, with dependent samples from an ergodic Markov chain, whose limiting (stationary) distribution

More information

Simulation of truncated normal variables. Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris

Simulation of truncated normal variables. Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris Simulation of truncated normal variables Christian P. Robert LSTA, Université Pierre et Marie Curie, Paris Abstract arxiv:0907.4010v1 [stat.co] 23 Jul 2009 We provide in this paper simulation algorithms

More information

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

Least Sparsity of p-norm based Optimization Problems with p > 1

Least Sparsity of p-norm based Optimization Problems with p > 1 Least Sparsity of p-norm based Optimization Problems with p > Jinglai Shen and Seyedahmad Mousavi Original version: July, 07; Revision: February, 08 Abstract Motivated by l p -optimization arising from

More information

Answers and expectations

Answers and expectations Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E

More information

Optimisation Combinatoire et Convexe.

Optimisation Combinatoire et Convexe. Optimisation Combinatoire et Convexe. Low complexity models, l 1 penalties. A. d Aspremont. M1 ENS. 1/36 Today Sparsity, low complexity models. l 1 -recovery results: three approaches. Extensions: matrix

More information

Variance Bounding Markov Chains

Variance Bounding Markov Chains Variance Bounding Markov Chains by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September 2006; revised April 2007.) Abstract. We introduce a new property of Markov chains, called variance bounding.

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 3 More Markov Chain Monte Carlo Methods The Metropolis algorithm isn t the only way to do MCMC. We ll

More information

Sparse regression. Optimization-Based Data Analysis. Carlos Fernandez-Granda

Sparse regression. Optimization-Based Data Analysis.   Carlos Fernandez-Granda Sparse regression Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda 3/28/2016 Regression Least-squares regression Example: Global warming Logistic

More information

Lecture 8: Bayesian Estimation of Parameters in State Space Models

Lecture 8: Bayesian Estimation of Parameters in State Space Models in State Space Models March 30, 2016 Contents 1 Bayesian estimation of parameters in state space models 2 Computational methods for parameter estimation 3 Practical parameter estimation in state space

More information

On Reparametrization and the Gibbs Sampler

On Reparametrization and the Gibbs Sampler On Reparametrization and the Gibbs Sampler Jorge Carlos Román Department of Mathematics Vanderbilt University James P. Hobert Department of Statistics University of Florida March 2014 Brett Presnell Department

More information

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011 LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS S. G. Bobkov and F. L. Nazarov September 25, 20 Abstract We study large deviations of linear functionals on an isotropic

More information

Near Ideal Behavior of a Modified Elastic Net Algorithm in Compressed Sensing

Near Ideal Behavior of a Modified Elastic Net Algorithm in Compressed Sensing Near Ideal Behavior of a Modified Elastic Net Algorithm in Compressed Sensing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas M.Vidyasagar@utdallas.edu www.utdallas.edu/ m.vidyasagar

More information

Sparse Linear Models (10/7/13)

Sparse Linear Models (10/7/13) STA56: Probabilistic machine learning Sparse Linear Models (0/7/) Lecturer: Barbara Engelhardt Scribes: Jiaji Huang, Xin Jiang, Albert Oh Sparsity Sparsity has been a hot topic in statistics and machine

More information

Reconstruction from Anisotropic Random Measurements

Reconstruction from Anisotropic Random Measurements Reconstruction from Anisotropic Random Measurements Mark Rudelson and Shuheng Zhou The University of Michigan, Ann Arbor Coding, Complexity, and Sparsity Workshop, 013 Ann Arbor, Michigan August 7, 013

More information

DNNs for Sparse Coding and Dictionary Learning

DNNs for Sparse Coding and Dictionary Learning DNNs for Sparse Coding and Dictionary Learning Subhadip Mukherjee, Debabrata Mahapatra, and Chandra Sekhar Seelamantula Department of Electrical Engineering, Indian Institute of Science, Bangalore 5612,

More information

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method

Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Slice Sampling with Adaptive Multivariate Steps: The Shrinking-Rank Method Madeleine B. Thompson Radford M. Neal Abstract The shrinking rank method is a variation of slice sampling that is efficient at

More information

Some Results on the Ergodicity of Adaptive MCMC Algorithms

Some Results on the Ergodicity of Adaptive MCMC Algorithms Some Results on the Ergodicity of Adaptive MCMC Algorithms Omar Khalil Supervisor: Jeffrey Rosenthal September 2, 2011 1 Contents 1 Andrieu-Moulines 4 2 Roberts-Rosenthal 7 3 Atchadé and Fort 8 4 Relationship

More information

Adaptive Monte Carlo methods

Adaptive Monte Carlo methods Adaptive Monte Carlo methods Jean-Michel Marin Projet Select, INRIA Futurs, Université Paris-Sud joint with Randal Douc (École Polytechnique), Arnaud Guillin (Université de Marseille) and Christian Robert

More information

17 : Markov Chain Monte Carlo

17 : Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo

More information

Likelihood-free MCMC

Likelihood-free MCMC Bayesian inference for stable distributions with applications in finance Department of Mathematics University of Leicester September 2, 2011 MSc project final presentation Outline 1 2 3 4 Classical Monte

More information

Nonlinear Programming Models

Nonlinear Programming Models Nonlinear Programming Models Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Nonlinear Programming Models p. Introduction Nonlinear Programming Models p. NLP problems minf(x) x S R n Standard form:

More information

Sampling Methods (11/30/04)

Sampling Methods (11/30/04) CS281A/Stat241A: Statistical Learning Theory Sampling Methods (11/30/04) Lecturer: Michael I. Jordan Scribe: Jaspal S. Sandhu 1 Gibbs Sampling Figure 1: Undirected and directed graphs, respectively, with

More information

An Homotopy Algorithm for the Lasso with Online Observations

An Homotopy Algorithm for the Lasso with Online Observations An Homotopy Algorithm for the Lasso with Online Observations Pierre J. Garrigues Department of EECS Redwood Center for Theoretical Neuroscience University of California Berkeley, CA 94720 garrigue@eecs.berkeley.edu

More information

INDUSTRIAL MATHEMATICS INSTITUTE. B.S. Kashin and V.N. Temlyakov. IMI Preprint Series. Department of Mathematics University of South Carolina

INDUSTRIAL MATHEMATICS INSTITUTE. B.S. Kashin and V.N. Temlyakov. IMI Preprint Series. Department of Mathematics University of South Carolina INDUSTRIAL MATHEMATICS INSTITUTE 2007:08 A remark on compressed sensing B.S. Kashin and V.N. Temlyakov IMI Preprint Series Department of Mathematics University of South Carolina A remark on compressed

More information

Dimensional behaviour of entropy and information

Dimensional behaviour of entropy and information Dimensional behaviour of entropy and information Sergey Bobkov and Mokshay Madiman Note: A slightly condensed version of this paper is in press and will appear in the Comptes Rendus de l Académies des

More information

Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox Richard A. Norton, J.

Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox Richard A. Norton, J. Deblurring Jupiter (sampling in GLIP faster than regularized inversion) Colin Fox fox@physics.otago.ac.nz Richard A. Norton, J. Andrés Christen Topics... Backstory (?) Sampling in linear-gaussian hierarchical

More information

Convex Optimization CMU-10725

Convex Optimization CMU-10725 Convex Optimization CMU-10725 Simulated Annealing Barnabás Póczos & Ryan Tibshirani Andrey Markov Markov Chains 2 Markov Chains Markov chain: Homogen Markov chain: 3 Markov Chains Assume that the state

More information

Markov Chain Monte Carlo Methods for Stochastic Optimization

Markov Chain Monte Carlo Methods for Stochastic Optimization Markov Chain Monte Carlo Methods for Stochastic Optimization John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge U of Toronto, MIE,

More information

COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 10

COS513: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 10 COS53: FOUNDATIONS OF PROBABILISTIC MODELS LECTURE 0 MELISSA CARROLL, LINJIE LUO. BIAS-VARIANCE TRADE-OFF (CONTINUED FROM LAST LECTURE) If V = (X n, Y n )} are observed data, the linear regression problem

More information

THE INVERSE FUNCTION THEOREM

THE INVERSE FUNCTION THEOREM THE INVERSE FUNCTION THEOREM W. PATRICK HOOPER The implicit function theorem is the following result: Theorem 1. Let f be a C 1 function from a neighborhood of a point a R n into R n. Suppose A = Df(a)

More information

Bayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems

Bayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems Bayesian Methods and Uncertainty Quantification for Nonlinear Inverse Problems John Bardsley, University of Montana Collaborators: H. Haario, J. Kaipio, M. Laine, Y. Marzouk, A. Seppänen, A. Solonen, Z.

More information

Particle Filtering Approaches for Dynamic Stochastic Optimization

Particle Filtering Approaches for Dynamic Stochastic Optimization Particle Filtering Approaches for Dynamic Stochastic Optimization John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge I-Sim Workshop,

More information

arxiv: v1 [stat.co] 18 Feb 2012

arxiv: v1 [stat.co] 18 Feb 2012 A LEVEL-SET HIT-AND-RUN SAMPLER FOR QUASI-CONCAVE DISTRIBUTIONS Dean Foster and Shane T. Jensen arxiv:1202.4094v1 [stat.co] 18 Feb 2012 Department of Statistics The Wharton School University of Pennsylvania

More information

Markov Chain Monte Carlo methods

Markov Chain Monte Carlo methods Markov Chain Monte Carlo methods By Oleg Makhnin 1 Introduction a b c M = d e f g h i 0 f(x)dx 1.1 Motivation 1.1.1 Just here Supresses numbering 1.1.2 After this 1.2 Literature 2 Method 2.1 New math As

More information

Statistical Inference for Stochastic Epidemic Models

Statistical Inference for Stochastic Epidemic Models Statistical Inference for Stochastic Epidemic Models George Streftaris 1 and Gavin J. Gibson 1 1 Department of Actuarial Mathematics & Statistics, Heriot-Watt University, Riccarton, Edinburgh EH14 4AS,

More information

On John type ellipsoids

On John type ellipsoids On John type ellipsoids B. Klartag Tel Aviv University Abstract Given an arbitrary convex symmetric body K R n, we construct a natural and non-trivial continuous map u K which associates ellipsoids to

More information

Scale Mixture Modeling of Priors for Sparse Signal Recovery

Scale Mixture Modeling of Priors for Sparse Signal Recovery Scale Mixture Modeling of Priors for Sparse Signal Recovery Bhaskar D Rao 1 University of California, San Diego 1 Thanks to David Wipf, Jason Palmer, Zhilin Zhang and Ritwik Giri Outline Outline Sparse

More information

Integrated Non-Factorized Variational Inference

Integrated Non-Factorized Variational Inference Integrated Non-Factorized Variational Inference Shaobo Han, Xuejun Liao and Lawrence Carin Duke University February 27, 2014 S. Han et al. Integrated Non-Factorized Variational Inference February 27, 2014

More information

About Split Proximal Algorithms for the Q-Lasso

About Split Proximal Algorithms for the Q-Lasso Thai Journal of Mathematics Volume 5 (207) Number : 7 http://thaijmath.in.cmu.ac.th ISSN 686-0209 About Split Proximal Algorithms for the Q-Lasso Abdellatif Moudafi Aix Marseille Université, CNRS-L.S.I.S

More information

Langevin and hessian with fisher approximation stochastic sampling for parameter estimation of structured covariance

Langevin and hessian with fisher approximation stochastic sampling for parameter estimation of structured covariance Langevin and hessian with fisher approximation stochastic sampling for parameter estimation of structured covariance Cornelia Vacar, Jean-François Giovannelli, Yannick Berthoumieu To cite this version:

More information

Basis Pursuit Denoising and the Dantzig Selector

Basis Pursuit Denoising and the Dantzig Selector BPDN and DS p. 1/16 Basis Pursuit Denoising and the Dantzig Selector West Coast Optimization Meeting University of Washington Seattle, WA, April 28 29, 2007 Michael Friedlander and Michael Saunders Dept

More information

NEWTON POLYGONS AND FAMILIES OF POLYNOMIALS

NEWTON POLYGONS AND FAMILIES OF POLYNOMIALS NEWTON POLYGONS AND FAMILIES OF POLYNOMIALS ARNAUD BODIN Abstract. We consider a continuous family (f s ), s [0,1] of complex polynomials in two variables with isolated singularities, that are Newton non-degenerate.

More information

arxiv: v1 [stat.co] 2 Nov 2017

arxiv: v1 [stat.co] 2 Nov 2017 Binary Bouncy Particle Sampler arxiv:1711.922v1 [stat.co] 2 Nov 217 Ari Pakman Department of Statistics Center for Theoretical Neuroscience Grossman Center for the Statistics of Mind Columbia University

More information

Sequential Monte Carlo Methods in High Dimensions

Sequential Monte Carlo Methods in High Dimensions Sequential Monte Carlo Methods in High Dimensions Alexandros Beskos Statistical Science, UCL Oxford, 24th September 2012 Joint work with: Dan Crisan, Ajay Jasra, Nik Kantas, Andrew Stuart Imperial College,

More information

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA Presented by Dongjun Chung March 12, 2010 Introduction Definition Oracle Properties Computations Relationship: Nonnegative Garrote Extensions:

More information

A Dirichlet Form approach to MCMC Optimal Scaling

A Dirichlet Form approach to MCMC Optimal Scaling A Dirichlet Form approach to MCMC Optimal Scaling Giacomo Zanella, Wilfrid S. Kendall, and Mylène Bédard. g.zanella@warwick.ac.uk, w.s.kendall@warwick.ac.uk, mylene.bedard@umontreal.ca Supported by EPSRC

More information

Stochastic Proximal Gradient Algorithm

Stochastic Proximal Gradient Algorithm Stochastic Institut Mines-Télécom / Telecom ParisTech / Laboratoire Traitement et Communication de l Information Joint work with: Y. Atchade, Ann Arbor, USA, G. Fort LTCI/Télécom Paristech and the kind

More information

Markov Chain Monte Carlo Methods for Stochastic

Markov Chain Monte Carlo Methods for Stochastic Markov Chain Monte Carlo Methods for Stochastic Optimization i John R. Birge The University of Chicago Booth School of Business Joint work with Nicholas Polson, Chicago Booth. JRBirge U Florida, Nov 2013

More information

Stat 516, Homework 1

Stat 516, Homework 1 Stat 516, Homework 1 Due date: October 7 1. Consider an urn with n distinct balls numbered 1,..., n. We sample balls from the urn with replacement. Let N be the number of draws until we encounter a ball

More information

Spectral Gap and Concentration for Some Spherically Symmetric Probability Measures

Spectral Gap and Concentration for Some Spherically Symmetric Probability Measures Spectral Gap and Concentration for Some Spherically Symmetric Probability Measures S.G. Bobkov School of Mathematics, University of Minnesota, 127 Vincent Hall, 26 Church St. S.E., Minneapolis, MN 55455,

More information

Generating Random Variables

Generating Random Variables Generating Random Variables Christian Robert Université Paris Dauphine and CREST, INSEE George Casella University of Florida Keywords and Phrases: Random Number Generator, Probability Integral Transform,

More information

STAT 200C: High-dimensional Statistics

STAT 200C: High-dimensional Statistics STAT 200C: High-dimensional Statistics Arash A. Amini May 30, 2018 1 / 57 Table of Contents 1 Sparse linear models Basis Pursuit and restricted null space property Sufficient conditions for RNS 2 / 57

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Weak convergence of Markov chain Monte Carlo II

Weak convergence of Markov chain Monte Carlo II Weak convergence of Markov chain Monte Carlo II KAMATANI, Kengo Mar 2011 at Le Mans Background Markov chain Monte Carlo (MCMC) method is widely used in Statistical Science. It is easy to use, but difficult

More information

Point spread function reconstruction from the image of a sharp edge

Point spread function reconstruction from the image of a sharp edge DOE/NV/5946--49 Point spread function reconstruction from the image of a sharp edge John Bardsley, Kevin Joyce, Aaron Luttman The University of Montana National Security Technologies LLC Montana Uncertainty

More information

Riemann Manifold Methods in Bayesian Statistics

Riemann Manifold Methods in Bayesian Statistics Ricardo Ehlers ehlers@icmc.usp.br Applied Maths and Stats University of São Paulo, Brazil Working Group in Statistical Learning University College Dublin September 2015 Bayesian inference is based on Bayes

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

In English, this means that if we travel on a straight line between any two points in C, then we never leave C.

In English, this means that if we travel on a straight line between any two points in C, then we never leave C. Convex sets In this section, we will be introduced to some of the mathematical fundamentals of convex sets. In order to motivate some of the definitions, we will look at the closest point problem from

More information

Computer Practical: Metropolis-Hastings-based MCMC

Computer Practical: Metropolis-Hastings-based MCMC Computer Practical: Metropolis-Hastings-based MCMC Andrea Arnold and Franz Hamilton North Carolina State University July 30, 2016 A. Arnold / F. Hamilton (NCSU) MH-based MCMC July 30, 2016 1 / 19 Markov

More information

The Polya-Gamma Gibbs Sampler for Bayesian. Logistic Regression is Uniformly Ergodic

The Polya-Gamma Gibbs Sampler for Bayesian. Logistic Regression is Uniformly Ergodic he Polya-Gamma Gibbs Sampler for Bayesian Logistic Regression is Uniformly Ergodic Hee Min Choi and James P. Hobert Department of Statistics University of Florida August 013 Abstract One of the most widely

More information

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods Pattern Recognition and Machine Learning Chapter 11: Sampling Methods Elise Arnaud Jakob Verbeek May 22, 2008 Outline of the chapter 11.1 Basic Sampling Algorithms 11.2 Markov Chain Monte Carlo 11.3 Gibbs

More information

ON THE ZERO-ONE LAW AND THE LAW OF LARGE NUMBERS FOR RANDOM WALK IN MIXING RAN- DOM ENVIRONMENT

ON THE ZERO-ONE LAW AND THE LAW OF LARGE NUMBERS FOR RANDOM WALK IN MIXING RAN- DOM ENVIRONMENT Elect. Comm. in Probab. 10 (2005), 36 44 ELECTRONIC COMMUNICATIONS in PROBABILITY ON THE ZERO-ONE LAW AND THE LAW OF LARGE NUMBERS FOR RANDOM WALK IN MIXING RAN- DOM ENVIRONMENT FIRAS RASSOUL AGHA Department

More information

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS023) p.3938 An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Vitara Pungpapong

More information

Markov Chains and MCMC

Markov Chains and MCMC Markov Chains and MCMC CompSci 590.02 Instructor: AshwinMachanavajjhala Lecture 4 : 590.02 Spring 13 1 Recap: Monte Carlo Method If U is a universe of items, and G is a subset satisfying some property,

More information

Least squares under convex constraint

Least squares under convex constraint Stanford University Questions Let Z be an n-dimensional standard Gaussian random vector. Let µ be a point in R n and let Y = Z + µ. We are interested in estimating µ from the data vector Y, under the assumption

More information

On matrix variate Dirichlet vectors

On matrix variate Dirichlet vectors On matrix variate Dirichlet vectors Konstencia Bobecka Polytechnica, Warsaw, Poland Richard Emilion MAPMO, Université d Orléans, 45100 Orléans, France Jacek Wesolowski Polytechnica, Warsaw, Poland Abstract

More information

Examples of Adaptive MCMC

Examples of Adaptive MCMC Examples of Adaptive MCMC by Gareth O. Roberts * and Jeffrey S. Rosenthal ** (September, 2006.) Abstract. We investigate the use of adaptive MCMC algorithms to automatically tune the Markov chain parameters

More information

MCMC Sampling for Bayesian Inference using L1-type Priors

MCMC Sampling for Bayesian Inference using L1-type Priors MÜNSTER MCMC Sampling for Bayesian Inference using L1-type Priors (what I do whenever the ill-posedness of EEG/MEG is just not frustrating enough!) AG Imaging Seminar Felix Lucka 26.06.2012 , MÜNSTER Sampling

More information

Anomalous transport of particles in Plasma physics

Anomalous transport of particles in Plasma physics Anomalous transport of particles in Plasma physics L. Cesbron a, A. Mellet b,1, K. Trivisa b, a École Normale Supérieure de Cachan Campus de Ker Lann 35170 Bruz rance. b Department of Mathematics, University

More information

Analysis of the Gibbs sampler for a model. related to James-Stein estimators. Jeffrey S. Rosenthal*

Analysis of the Gibbs sampler for a model. related to James-Stein estimators. Jeffrey S. Rosenthal* Analysis of the Gibbs sampler for a model related to James-Stein estimators by Jeffrey S. Rosenthal* Department of Statistics University of Toronto Toronto, Ontario Canada M5S 1A1 Phone: 416 978-4594.

More information

The two subset recurrent property of Markov chains

The two subset recurrent property of Markov chains The two subset recurrent property of Markov chains Lars Holden, Norsk Regnesentral Abstract This paper proposes a new type of recurrence where we divide the Markov chains into intervals that start when

More information

Or How to select variables Using Bayesian LASSO

Or How to select variables Using Bayesian LASSO Or How to select variables Using Bayesian LASSO x 1 x 2 x 3 x 4 Or How to select variables Using Bayesian LASSO x 1 x 2 x 3 x 4 Or How to select variables Using Bayesian LASSO On Bayesian Variable Selection

More information

Bayesian Analysis of Massive Datasets Via Particle Filters

Bayesian Analysis of Massive Datasets Via Particle Filters Bayesian Analysis of Massive Datasets Via Particle Filters Bayesian Analysis Use Bayes theorem to learn about model parameters from data Examples: Clustered data: hospitals, schools Spatial models: public

More information

On Bayesian Computation

On Bayesian Computation On Bayesian Computation Michael I. Jordan with Elaine Angelino, Maxim Rabinovich, Martin Wainwright and Yun Yang Previous Work: Information Constraints on Inference Minimize the minimax risk under constraints

More information

Monte Carlo Methods. Leon Gu CSD, CMU

Monte Carlo Methods. Leon Gu CSD, CMU Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte

More information

Pure Mathematics Paper II

Pure Mathematics Paper II MATHEMATICS TUTORIALS H AL TARXIEN A Level 3 hours Pure Mathematics Question Paper This paper consists of five pages and ten questions. Check to see if any pages are missing. Answer any SEVEN questions.

More information

Lecture 1: September 25

Lecture 1: September 25 0-725: Optimization Fall 202 Lecture : September 25 Lecturer: Geoff Gordon/Ryan Tibshirani Scribes: Subhodeep Moitra Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have

More information