Bayesian Sparse Linear Regression with Unknown Symmetric Error


Minwoo Chae (1), joint work with Lizhen Lin (2) and David B. Dunson (3)

(1) Department of Mathematics, The University of Texas at Austin
(2) Department of Statistics and Data Sciences, The University of Texas at Austin
(3) Department of Statistical Science, Duke University

June 17, 2016, BU-KEIO 2016 Workshop

Outline
1. Introduction
2. Sparse linear model
3. Linear model with unknown error distribution
4. Asymptotic results


Symmetric location problem

Y_i = μ + ε_i, with ε_i iid ~ η(·) (unknown).

If η is symmetric, efficient and adaptive estimation of μ is possible. [Beran, 1974; Stone, 1975; ...] Linear regression [Bickel, 1982]: μ = x_i^T θ, θ ∈ R^p, i = 1, ..., n. On the Bayesian side, the semi-parametric Bernstein-von Mises (BvM) theorem holds. [Chae, Kim and Kleijn, 2016] We study a Bayesian approach when p is large.

Bayesian paradigm

A parameter θ is generated according to a prior distribution Π. Conditional on θ, the data X is generated according to a density p_θ. For given observed data X, statistical inferences are based on the posterior distribution:

dΠ(θ | X) ∝ p_θ(X) dΠ(θ).

Typically, the posterior distribution can be approximated via MCMC.
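The displayed relation, prior times likelihood, then normalize, can be checked numerically. A minimal sketch (not from the talk; a toy Bernoulli model with a Beta(2, 2) prior, evaluated on a grid rather than by MCMC):

```python
import numpy as np

# Toy model (illustrative, not from the talk): X_i | theta ~ Bernoulli(theta)
# with a Beta(2, 2) prior; the posterior is evaluated on a grid.
rng = np.random.default_rng(0)
x = rng.binomial(1, 0.7, size=100)            # observed data

theta = np.linspace(1e-3, 1 - 1e-3, 1000)     # grid over the parameter
dt = theta[1] - theta[0]
log_lik = x.sum() * np.log(theta) + (len(x) - x.sum()) * np.log1p(-theta)
log_prior = np.log(theta) + np.log1p(-theta)  # Beta(2, 2), up to a constant

# dPi(theta | X) proportional to p_theta(X) dPi(theta): normalize on the grid
log_post = log_lik + log_prior
post = np.exp(log_post - log_post.max())
post /= post.sum() * dt

post_mean = (theta * post).sum() * dt         # should sit near the truth 0.7
```

Grid evaluation only works in one dimension; MCMC is what makes the same computation feasible for the high-dimensional posteriors later in the talk.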

Bayesian asymptotics

A frequentist wants to know how Bayesian procedures perform from a frequentist viewpoint. Assume that the data X_1, ..., X_n are generated according to a given parameter θ_0 and consider the posterior Π(θ | X_1, ..., X_n). For large enough n, we want Π(θ | X_1, ..., X_n) to put most of its mass near θ_0, for most realizations of X_1, ..., X_n. For infinite-dimensional θ, the choice of the prior is important.

Parametric Bernstein-von Mises theorem

Assume that a parametric model P = {P_θ : θ ∈ Θ} is regular and X_1, ..., X_n iid ~ P_{θ_0}, where θ_0 ∈ Θ.

THEOREM (Bernstein-von Mises) [Le Cam and Yang, 1990]. For any prior with positive density around θ_0,

‖Π(· | X_1, ..., X_n) − N(θ̂_n, I_{θ_0}^{-1}/n)‖_TV → 0 in probability,

where θ̂_n is an efficient estimator of θ and I_{θ_0} is the Fisher information matrix.

Consequently, a Bayesian credible interval is asymptotically a standard confidence interval.

Parametric BvM: illustration

θ ~ Beta(5, 1), X_1, ..., X_n | θ iid ~ Bernoulli(θ), θ_0 = 1/2.

[Figure: posterior densities for increasing sample sizes (n = 10, n = 50, ...), concentrating around θ_0.]
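The illustration can be reproduced numerically: the Beta prior is conjugate, so the exact posterior is available and can be compared with the BvM approximation N(θ̂_n, I_{θ_0}^{-1}/n). A sketch (sample size and seed are arbitrary choices):

```python
import numpy as np

# theta ~ Beta(5, 1), X_i | theta ~ Bernoulli(theta), theta_0 = 1/2.
# The posterior is Beta(5 + s, 1 + n - s); BvM says it is close to
# N(theta_hat, I(theta)^{-1}/n) with I(theta)^{-1} = theta(1 - theta).
rng = np.random.default_rng(1)
n = 2000
x = rng.binomial(1, 0.5, size=n)
s = x.sum()

a, b = 5 + s, 1 + n - s                       # exact Beta posterior parameters
post_mean = a / (a + b)
post_var = a * b / ((a + b) ** 2 * (a + b + 1))

theta_hat = s / n                             # MLE (an efficient estimator)
bvm_var = theta_hat * (1 - theta_hat) / n     # BvM normal variance
```

For this n the posterior mean and variance already match the normal approximation to within a fraction of a percent, mirroring the increasingly Gaussian curves in the figure.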


Semi-parametric BvM (fixed p)

Y_i = x_i^T θ + ε_i, with ε_i iid ~ η(·) (unknown).

Put a symmetrized Dirichlet process (DP) mixture prior on η.

THEOREM [Chae, Kim and Kleijn, 2016]. For any prior on θ with positive density around θ_0,

‖Π(θ ∈ · | X_1, ..., X_n) − N(θ̂_n, I_{θ_0,η_0}^{-1}/n)‖_TV → 0 in probability,

where θ̂_n is an efficient estimator of θ and I_{θ_0,η_0} is the efficient information matrix.

What if p is large?

Outline
1. Introduction
2. Sparse linear model
3. Linear model with unknown error distribution
4. Asymptotic results

Sparse linear model

Consider the linear regression model

Y_i = x_i^T θ + ε_i, i = 1, ..., n,

where θ = (θ_1, ..., θ_p)^T and possibly p ≫ n. In matrix form, Y = Xθ + ε. A sparse model assumes that most of the θ_i's are (nearly) zero. We apply fully Bayesian procedures and express the sparsity through the prior.

Sparse prior

A prior Π_Θ for θ ∈ R^p can be constructed as follows:
1. (Dimension) Choose s from a prior π_p on {0, 1, ..., p}.
2. (Model) Choose S ⊂ {1, ..., p} of size |S| = s at random.
3. (Nonzero coeff.) Choose θ_S = (θ_i)_{i∈S} from a density g_S on R^S and set θ_{S^c} = 0.

Formally,

(S, θ) ↦ π_p(s) (p choose s)^{-1} g_S(θ_S) δ_0(θ_{S^c}).

The prior π_p on the dimension controls the level of sparsity.
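The three steps can be sketched directly; the particular choices of π_p (geometric decay) and g (Laplace) below are illustrative stand-ins:

```python
import numpy as np

# Sampling from the hierarchical sparse prior: dimension, then support,
# then nonzero coefficients.  pi_p and g are illustrative choices.
rng = np.random.default_rng(2)
p = 50

def sample_sparse_prior():
    # 1. (Dimension) draw s from a prior pi_p on {0, 1, ..., p}
    weights = 2.0 ** -np.arange(p + 1)        # exponentially decaying pi_p
    s = rng.choice(p + 1, p=weights / weights.sum())
    # 2. (Model) draw a support S of size s uniformly at random
    S = rng.choice(p, size=s, replace=False)
    # 3. (Nonzero coeff.) draw theta_S from g_S, set theta_{S^c} = 0
    theta = np.zeros(p)
    theta[S] = rng.laplace(size=s)
    return theta

draws = np.array([sample_sparse_prior() for _ in range(2000)])
mean_support = (draws != 0).sum(axis=1).mean()  # near E[s], here about 1
```

With this geometric π_p the prior puts almost all its mass on very small models, which is exactly the role of π_p described on the slide.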

Sparse prior: example

Spike and slab [Ishwaran and Rao, 2005; and many authors]:

θ_i ~ (1 − r) δ_0 + r G, i ≤ p,

for some r ∈ (0, 1) and continuous distribution G; equivalently, s ~ Binomial(p, r). Good asymptotic properties hold if r ~ Beta(1, p^u) for some u > 1 and the tail of G is at least as thick as the Laplace distribution's. [Castillo and van der Vaart, 2015]

Sparse prior: example

Complexity prior [Castillo and van der Vaart, 2012]:

π_p(s) ∝ c^{-s} p^{-as}, s = 0, 1, ..., p,

for some constants a, c > 0. Roughly, π_p(s) is comparable to (p choose s)^{-1} for s ≤ p.
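A quick numerical check (with illustrative values a = 1, c = 2) that the complexity prior downweights each additional coefficient by a constant factor of order p^{-a}:

```python
import numpy as np

# Complexity prior pi_p(s) proportional to c^{-s} p^{-a s}: the mass on
# model size s decays geometrically, each step paying a factor 1/(c p^a).
p, a, c = 100, 1.0, 2.0
s = np.arange(p + 1)
log_w = -s * np.log(c) - a * s * np.log(p)
pi = np.exp(log_w - log_w.max())
pi /= pi.sum()

# successive ratios pi_p(s) / pi_p(s-1) are constant, equal to 1/(c p^a)
ratios = pi[1:6] / pi[:5]
```

This geometric penalty on the dimension is what rules out large models a priori.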

Other priors

Continuous shrinkage priors that peak near zero. Typically scale mixtures of normals: for i = 1, ..., p,

θ_i | τ², λ_i² ~ N(0, τ² λ_i²), λ_i² ~ π_λ(λ_i²), τ² ~ π_τ(τ²).

1. Bayesian lasso [Park and Casella, 2008]
2. Horseshoe [Carvalho, Polson and Scott, 2010]
3. Normal-gamma [Griffin and Brown, 2010]
4. Generalized double Pareto [Armagan, Dunson and Lee, 2013]
5. Dirichlet-Laplace [Bhattacharya et al., 2016]
6. ...
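For instance, a draw from the horseshoe version of this hierarchy (half-Cauchy priors on both the local scales λ_i and the global scale τ [Carvalho, Polson and Scott, 2010]) shows the characteristic pattern of many near-zero coordinates and a few large ones. This is only a prior-sampling sketch:

```python
import numpy as np

# One draw from the horseshoe prior: theta_i | tau, lambda_i ~ N(0, tau^2 lambda_i^2)
# with tau and each lambda_i half-Cauchy.  Heavy-tailed local scales produce
# a spike of near-zero coordinates together with occasional large ones.
rng = np.random.default_rng(3)
p = 10_000

tau = np.abs(rng.standard_cauchy())           # global scale ~ C+(0, 1)
lam = np.abs(rng.standard_cauchy(size=p))     # local scales ~ C+(0, 1)
theta = rng.normal(0.0, tau * lam)

# fraction of coordinates that are tiny relative to the global scale
frac_small = np.mean(np.abs(theta) < 0.1 * tau)
```

Unlike the spike-and-slab priors above, no coordinate is exactly zero; sparsity is expressed through shrinkage rather than point mass at 0.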

Outline
1. Introduction
2. Sparse linear model
3. Linear model with unknown error distribution
4. Asymptotic results

Gaussian model

Y_i = x_i^T θ + ε_i, i = 1, ..., n. Assume that ε_i iid ~ η for some density η ∈ H. Usually it is assumed that η(y) = φ_σ(y), a Gaussian density, because of (1) computational simplicity and (2) good theoretical properties. Some properties (e.g. consistency and rates) tend to be robust to this misspecification.

Key problems

Y_i = x_i^T θ + ε_i, i = 1, ..., n. Assume the ε_i's are not really normally distributed. Key problems caused by model misspecification:
1. (Efficiency) The asymptotic variance of √n(θ̂_i − θ_i) can be large.
2. (Uncertainty quantification) Credible sets do not give valid confidence statements. [Kleijn and van der Vaart, 2012]
3. (Selection) Misspecification might result in serious overfitting. [Grünwald and van Ommen, 2014]

A good remedy: semi-parametric modelling.

Key problems: example [Grünwald and van Ommen, 2014]

Y_i = θ_int + θ_1 x_i + θ_2 x_i² + ... + θ_p x_i^p + ε_i, θ_0 = 0 ∈ R^{p+1}


Frequentist method for fixed p

Y_i = x_i^T θ + ε_i, ε_i ~ η. There is an efficient estimator of θ. [Bickel, 1982] One way to obtain an efficient estimator:
1. Find an initial √n-consistent estimator θ̃_n.
2. Estimate the score function from the residuals ε̂_i = Y_i − θ̃_n^T x_i.
3. Solve the score equation using one Newton-Raphson step.

Does this work if p ≫ n?
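A sketch of this one-step construction in the simplest case, the symmetric location model Y_i = μ + ε_i (the fixed-p regression case is analogous); the median as initial estimator and the Gaussian-kernel score estimate are illustrative choices, not Bickel's exact construction:

```python
import numpy as np

# One-step efficient estimation for Y_i = mu + eps_i with symmetric eta:
# 1) initial sqrt(n)-consistent estimator, 2) kernel estimate of the score
# -eta'/eta from (symmetrized) residuals, 3) one Newton-Raphson step.
rng = np.random.default_rng(4)
n = 1000
y = 1.5 + rng.laplace(size=n)                # true mu = 1.5, non-Gaussian errors

mu0 = np.median(y)                           # step 1: initial estimator
res = y - mu0                                # residuals eps_hat_i
res = np.concatenate([res, -res])            # symmetrize: eta is symmetric

h = 1.06 * res.std() * len(res) ** -0.2      # kernel bandwidth (rule of thumb)
def score(e):
    # step 2: estimate -eta'(e)/eta(e) from a Gaussian-kernel density estimate
    d = e - res[:, None]
    k = np.exp(-0.5 * (d / h) ** 2)
    return (d * k).sum(axis=0) / (h ** 2 * k.sum(axis=0))

s = score(y - mu0)
info = np.mean(s ** 2)                       # estimated Fisher information
mu1 = mu0 + s.mean() / info                  # step 3: one Newton-Raphson step
```

For Laplace errors the score is sign(e), so the one-step estimator moves the median toward the efficient estimate; the question on the slide is whether this plug-in machinery survives p ≫ n.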

Bayesian method for fixed p

Y_i = x_i^T θ + ε_i, ε_i ~ η. Put a symmetrized DP mixture prior Π_H on η:

η(y) = ∫ φ_σ(y − z) dF̄(z, σ), F ~ DP(α),

where dF̄(z, σ) = (dF(z, σ) + dF(−z, σ))/2.

Then the BvM theorem holds. [Chae, Kim and Kleijn, 2016] Inference: Gibbs sampler.
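A sketch of what the symmetrization does, with a finite (truncated stick-breaking) draw standing in for F ~ DP(α): averaging the mixing measure with its reflection makes η an exactly even density.

```python
import numpy as np

# eta(y) = int phi_sigma(y - z) dF_bar(z, sigma), with
# dF_bar(z, s) = (dF(z, s) + dF(-z, s)) / 2.  A truncated stick-breaking
# draw plays the role of F; truncation level and base measure are illustrative.
rng = np.random.default_rng(5)
K = 20                                        # truncation level

v = rng.beta(1, 1, size=K)                    # stick-breaking weights
w = v * np.concatenate([[1.0], np.cumprod(1 - v)[:-1]])
w /= w.sum()                                  # renormalize after truncation
z = rng.uniform(-2, 2, size=K)                # atom locations
sig = rng.uniform(0.5, 2.0, size=K)           # atom scales

def eta(y):
    # average the mixture and its reflection: one mixture per half of dF_bar
    y = np.atleast_1d(y)[:, None]
    comp = np.exp(-0.5 * ((y - z) / sig) ** 2) / (np.sqrt(2 * np.pi) * sig)
    comp_ref = np.exp(-0.5 * ((y + z) / sig) ** 2) / (np.sqrt(2 * np.pi) * sig)
    return 0.5 * ((w * comp).sum(axis=1) + (w * comp_ref).sum(axis=1))
```

By construction eta(y) == eta(-y) for every y, which is the symmetry assumption the BvM theorem needs.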

Bayesian inference

The model Y_i = x_i^T θ + ε_i with ε_i ~ η is equivalent to

Y_i = x_i^T θ + z_i + σ_i ε̃_i, (z_i, σ_i) ~ F̄, ε̃_i ~ N(0, 1).

Inference can be done through a Gibbs sampler:
1. Given (z_i, σ_i)_{i ≤ n}, θ can be sampled as in the Gaussian model.
2. Given θ, (z_i, σ_i)_{i ≤ n} can be sampled as in the DPM model.

The additional computational burden from semi-parametric modelling depends only on n. Feasible even when p ≫ n!
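A minimal sketch of the two-block scheme, with a fixed finite mixture of normals standing in for the DP mixture so that step 2 reduces to a categorical draw; all model choices below (design, component grid, prior on θ) are illustrative:

```python
import numpy as np

# Two-block Gibbs: (1) theta | (z_i, sigma_i) is a conjugate Gaussian update;
# (2) (z_i, sigma_i) | theta assigns each residual to a mixture component.
rng = np.random.default_rng(6)
n, p = 200, 3
X = rng.normal(size=(n, p))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + rng.laplace(scale=0.5, size=n)

# symmetric finite grid of (z, sigma) components (stand-in for the DPM)
zs = np.array([-1.0, -0.5, 0.0, 0.5, 1.0]) * 0.3
sigs = np.full_like(zs, 0.5)

theta = np.zeros(p)
for it in range(500):
    # block 2: given theta, draw a component label for each residual
    r = y - X @ theta
    logp = -0.5 * ((r[:, None] - zs) / sigs) ** 2 - np.log(sigs)
    prob = np.exp(logp - logp.max(axis=1, keepdims=True))
    prob /= prob.sum(axis=1, keepdims=True)
    c = (prob.cumsum(axis=1) > rng.uniform(size=(n, 1))).argmax(axis=1)
    z_i, s_i = zs[c], sigs[c]

    # block 1: given (z_i, sigma_i), theta is a Gaussian regression update
    # of (y - z_i) on X with heteroscedastic noise s_i and a N(0, I) prior
    W = X / s_i[:, None] ** 2
    prec = X.T @ W + np.eye(p)
    mean = np.linalg.solve(prec, W.T @ (y - z_i))
    theta = mean + np.linalg.solve(np.linalg.cholesky(prec).T,
                                   rng.normal(size=p))
```

Note that block 2 costs O(n) per sweep regardless of p, which is the point made on the slide.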

Outline
1. Introduction
2. Sparse linear model
3. Linear model with unknown error distribution
4. Asymptotic results

Goal: frequentist properties (p ≫ n)

Assume a fixed design X and that the response vector Y is really generated from a given (θ_0, η_0), possibly with p ≫ n. We want the (marginal) posterior Π(θ | Y):
1. (Recovery) to put most of its mass around θ_0;
2. (Uncertainty quantification) to express the remaining uncertainty;
3. (Selection) to find the true nonzero set S_0 of θ_0;
4. (Adaptation) to adapt to the unknown sparsity level and error density;
with high P_{θ_0,η_0}-probability.

Prior for θ

The probability π_p(s) decreases exponentially [Castillo and van der Vaart, 2012; 2015]:

(i) for some constants A_1, A_2, A_3, A_4 > 0,

A_1 p^{-A_3} π_p(s − 1) ≤ π_p(s) ≤ A_2 p^{-A_4} π_p(s − 1), s = 1, ..., p.

Tails of the nonzero coefficients are as thick as the Laplace distribution [Castillo and van der Vaart, 2012; van der Pas et al., 2016]:

(ii) g_S(θ) = ∏_{i∈S} g(θ_i), g(θ_i) ∝ e^{-λ|θ_i|},

where λ satisfies √n/p ≤ λ ≤ √(n log p).

Prior for η

Put a symmetrized DP mixture prior Π_H on η [Chae, Kim and Kleijn, 2016]:

η(y) = ∫ φ_σ(y − z) dF̄(z, σ), F ~ DP(α),

where dF̄(z, σ) = (dF(z, σ) + dF(−z, σ))/2. Assume that supp(α) ⊂ [−M, M] × [σ_1, σ_2] for some positive constants M and σ_1 < σ_2.

Design matrix

Assume uniformly bounded covariates: |x_ij| ≤ 1. Define uniform compatibility numbers and restricted eigenvalues

φ²(s) = inf { s_θ ‖Xθ‖₂² / (n ‖θ‖₁²) : 0 < s_θ ≤ s },
ψ²(s) = inf { ‖Xθ‖₂² / (n ‖θ‖₂²) : 0 < s_θ ≤ s }.

φ(Ks_0) ≳ 1 (resp. ψ(Ks_0) ≳ 1) for some constant K > 1 is sufficient for recovery of θ in the ℓ_1- (resp. ℓ_2-) norm.
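For small p the restricted eigenvalue can be computed by brute force: for θ supported on a set S, ‖Xθ‖₂²/(n‖θ‖₂²) is minimized at the smallest eigenvalue of the corresponding Gram submatrix, so ψ²(s) is a minimum over supports. A sketch (dimensions chosen so the enumeration is cheap):

```python
import numpy as np
from itertools import combinations

# psi^2(s) = min over supports |S| = s of lambda_min(X_S^T X_S / n);
# supports of size < s are covered automatically since psi is non-increasing.
rng = np.random.default_rng(7)
n, p = 100, 8
X = rng.normal(size=(n, p))
G = X.T @ X / n                               # Gram matrix

def psi2(s):
    return min(np.linalg.eigvalsh(G[np.ix_(S, S)]).min()
               for S in combinations(range(p), s))

psi_vals = [psi2(s) for s in range(1, p + 1)]  # non-increasing in s
```

The combinatorial minimum is exactly why these quantities are assumed, not computed, when p is in the thousands.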

Design matrix: examples

By the Cauchy-Schwarz inequality, φ(s) ≥ ψ(s). In many examples ψ(s) ≳ 1:
1. Typically, ψ(s) ≳ 1 when s · max_{i≠j} |corr(x_i, x_j)| is small. [Lounici, 2008]
2. If the x_ij's are i.i.d. random variables, then ψ(s) ≳ 1 with high probability for s ≲ n/log p. [Cai and Jiang, 2011]
3. If p = n and corr(x_i, x_j) = ρ^{|i−j|} for some ρ ∈ (0, 1), then ψ(p) ≳ 1. [Zhao and Yu, 2006]

There are examples in which φ(s) ≳ 1 but ψ(s) is not bounded away from zero. [van de Geer and Bühlmann, 2009]

Asymptotics: dimension

THEOREM [Chae, Lin and Dunson, 2016]. If λ‖θ_0‖₁ ≲ s_0 log p and s_0 log p ≲ n, then

EΠ(s_θ > K s_0 | Y) → 0

for some constant K > 1. A small value of λ is preferred when ‖θ_0‖₁ is large.

Asymptotics: consistency

d_n²((θ, η), (θ_0, η_0)) = (1/n) Σ_{i=1}^n d_H²(p_{θ,η,i}, p_{θ_0,η_0,i}).

The mean Hellinger distance d_n allows the construction of certain exponentially consistent tests for independent observations. [Birgé, 1983; Ghosal and van der Vaart, 2007]

THEOREM [Chae, Lin and Dunson, 2016]. If, furthermore, φ(Ks_0) ≳ 1, then

EΠ( d_n((θ, η), (θ_0, η_0)) > M √(s_0 log p / n) | Y ) → 0

for a sufficiently large constant M.
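The mean Hellinger distance can be checked numerically in a Gaussian special case, where d_H²(N(a,1), N(b,1)) = 1 − exp(−(a−b)²/8) is available in closed form; a sketch with illustrative dimensions and perturbation:

```python
import numpy as np

# d_n^2 = (1/n) sum_i d_H^2(p_i, q_i), where p_i, q_i are the densities of
# Y_i under theta and theta_0; here eps_i ~ N(0, 1), so the i-th term
# compares N(x_i^T theta, 1) with N(x_i^T theta_0, 1).
rng = np.random.default_rng(8)
n, p = 50, 3
X = rng.normal(size=(n, p))
theta0 = np.array([1.0, 0.0, -1.0])
theta = theta0 + 0.1

grid = np.linspace(-15.0, 15.0, 4001)
dx = grid[1] - grid[0]

def normal_pdf(y, m):
    return np.exp(-0.5 * (y - m) ** 2) / np.sqrt(2 * np.pi)

def d_n2(t1, t0):
    # d_H^2(f, g) = 1 - int sqrt(f g), integrated on the grid
    total = 0.0
    for a, b in zip(X @ t1, X @ t0):
        affinity = np.sqrt(normal_pdf(grid, a) * normal_pdf(grid, b)).sum() * dx
        total += 1.0 - affinity
    return total / n

# closed form for N(a,1) vs N(b,1): d_H^2 = 1 - exp(-(a - b)^2 / 8)
exact = np.mean(1.0 - np.exp(-((X @ (theta - theta0)) ** 2) / 8.0))
```

The averaging over i is what makes d_n usable with independent but non-identically distributed observations, as in the fixed-design regression here.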

Asymptotics: consistency (cont.)

THEOREM [Chae, Lin and Dunson, 2016]. Under the previous conditions,

EΠ( d_H(η, η_0) > M √(s_0 log p / n) | Y ) → 0.

If, furthermore, s_0² log p / φ²(Ks_0) ≲ n, then

EΠ( ‖θ − θ_0‖₁ > M (s_0/φ(Ks_0)) √(log p / n) | Y ) → 0,
EΠ( ‖θ − θ_0‖₂ > M (1/ψ(Ks_0)) √(s_0 log p / n) | Y ) → 0,
EΠ( ‖X(θ − θ_0)‖₂ > M √(s_0 log p) | Y ) → 0.

Asymptotics: LAN

r_n(θ, η) = L_n(θ, η) − L_n(θ_0, η) − { √n (θ − θ_0)^T G_n ℓ̇_{θ_0,η_0} − (n/2) (θ − θ_0)^T V_{n,η_0} (θ − θ_0) }.

THEOREM [Chae, Lin and Dunson, 2016]. If s_0 log p ≲ n^{1/6}, then

sup_{η ∈ H_n} sup_{θ ∈ Θ_n} |r_n(θ, η)| = o_P(1),

where Π(Θ_n × H_n | Y) → 1 in probability.

Asymptotics: BvM theorem

Let N_{n,S} be the |S|-dimensional normal distribution to which an efficient estimator √n(θ̂_S − θ_S^0) converges in distribution.

THEOREM [Chae, Lin and Dunson, 2016]. If, furthermore, λ s_0 log p ≲ √n and ψ(Ks_0) ≳ 1, then

sup_{S ∈ S_n} sup_B | Π( √n(θ_S − θ_{0,S}) ∈ B | Y, S_θ = S ) − N_{n,S}(B) | = o_P(1),

where Π(S_θ ∈ S_n | Y) → 1 in probability. The posterior distribution of the nonzero coefficients is asymptotically a mixture of normal distributions.

Asymptotics: selection

THEOREM [Chae, Lin and Dunson, 2016]. Under the previous conditions,

Π(S_θ ≠ S_0 | Y) → 0 in probability.

The true nonzero coefficients can be selected provided no nonzero coefficient is too small (a beta-min condition).

Discussion

The condition s_0 log p ≲ n^{1/6} is required due to semi-parametric bias. If η is known (not necessarily Gaussian) and p = s_0, the condition may be relaxed to s_0 ≲ n^{1/3}, and this cannot be improved. [Panov and Spokoiny, 2015] In some parametric models, s_0 ≲ n^{1/6} is required for the BvM theorem. [Ghosal, 2000] Results can be extended to more general priors (relaxing the constraints on M, σ_1 and σ_2), but a sub-Gaussian tail of ℓ̇_{η_0} is (maybe) essential for selection. [Kim and Jeon, 2016]

Selected references

[1] Castillo, I., Schmidt-Hieber, J., and van der Vaart, A. W. (2015). Bayesian linear regression with sparse priors. Ann. Statist.
[2] Chae, M. (2015). The semiparametric Bernstein-von Mises theorem for models with symmetric error. PhD thesis, Seoul National University. arXiv.
[3] Chae, M., Kim, Y., and Kleijn, B. J. K. (2016). The semi-parametric Bernstein-von Mises theorem for regression models with symmetric errors. arXiv.
[4] Chae, M., Lin, L., and Dunson, D. B. (2016). Bayesian sparse linear regression with unknown symmetric error. arXiv.
[5] Grünwald, P. and van Ommen, T. (2014). Inconsistency of Bayesian inference for misspecified linear models, and a proposal for repairing it. arXiv.
[6] Hanson, D. L. and Wright, F. T. (1971). A bound on tail probabilities for quadratic forms in independent random variables. Ann. Math. Statist.
[7] Pollard, D. (2001). Bracketing methods. Unpublished manuscript. Available at pollard/books/asymptopia/bracketing.pdf.
[8] van de Geer, S. A. and Bühlmann, P. (2009). On the conditions used to prove oracle results for the Lasso. Electron. J. Statist.


Robust estimation, efficiency, and Lasso debiasing Robust estimation, efficiency, and Lasso debiasing Po-Ling Loh University of Wisconsin - Madison Departments of ECE & Statistics WHOA-PSI workshop Washington University in St. Louis Aug 12, 2017 Po-Ling

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures

Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures 17th Europ. Conf. on Machine Learning, Berlin, Germany, 2006. Variational Bayesian Dirichlet-Multinomial Allocation for Exponential Family Mixtures Shipeng Yu 1,2, Kai Yu 2, Volker Tresp 2, and Hans-Peter

More information

On Bayesian Computation

On Bayesian Computation On Bayesian Computation Michael I. Jordan with Elaine Angelino, Maxim Rabinovich, Martin Wainwright and Yun Yang Previous Work: Information Constraints on Inference Minimize the minimax risk under constraints

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

The Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition.

The Bayesian Choice. Christian P. Robert. From Decision-Theoretic Foundations to Computational Implementation. Second Edition. Christian P. Robert The Bayesian Choice From Decision-Theoretic Foundations to Computational Implementation Second Edition With 23 Illustrations ^Springer" Contents Preface to the Second Edition Preface

More information

Bayesian Aggregation for Extraordinarily Large Dataset

Bayesian Aggregation for Extraordinarily Large Dataset Bayesian Aggregation for Extraordinarily Large Dataset Guang Cheng 1 Department of Statistics Purdue University www.science.purdue.edu/bigdata Department Seminar Statistics@LSE May 19, 2017 1 A Joint Work

More information

A Bayesian Nonparametric Hierarchical Framework for Uncertainty Quantification in Simulation

A Bayesian Nonparametric Hierarchical Framework for Uncertainty Quantification in Simulation Submitted to Operations Research manuscript Please, provide the manuscript number! Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the

More information

(Part 1) High-dimensional statistics May / 41

(Part 1) High-dimensional statistics May / 41 Theory for the Lasso Recall the linear model Y i = p j=1 β j X (j) i + ɛ i, i = 1,..., n, or, in matrix notation, Y = Xβ + ɛ, To simplify, we assume that the design X is fixed, and that ɛ is N (0, σ 2

More information

Probabilistic machine learning group, Aalto University Bayesian theory and methods, approximative integration, model

Probabilistic machine learning group, Aalto University  Bayesian theory and methods, approximative integration, model Aki Vehtari, Aalto University, Finland Probabilistic machine learning group, Aalto University http://research.cs.aalto.fi/pml/ Bayesian theory and methods, approximative integration, model assessment and

More information

Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed

Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed 18.466 Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed 1. MLEs in exponential families Let f(x,θ) for x X and θ Θ be a likelihood function, that is, for present purposes,

More information

Chapter 4 HOMEWORK ASSIGNMENTS. 4.1 Homework #1

Chapter 4 HOMEWORK ASSIGNMENTS. 4.1 Homework #1 Chapter 4 HOMEWORK ASSIGNMENTS These homeworks may be modified as the semester progresses. It is your responsibility to keep up to date with the correctly assigned homeworks. There may be some errors in

More information

arxiv: v2 [stat.me] 16 Aug 2018

arxiv: v2 [stat.me] 16 Aug 2018 arxiv:1807.06539v [stat.me] 16 Aug 018 On the Beta Prime Prior for Scale Parameters in High-Dimensional Bayesian Regression Models Ray Bai Malay Ghosh August 0, 018 Abstract We study high-dimensional Bayesian

More information

Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)

Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: Alexandre Belloni (Duke) + Kengo Kato (Tokyo) Uniform Post Selection Inference for LAD Regression and Other Z-estimation problems. ArXiv: 1304.0282 Victor MIT, Economics + Center for Statistics Co-authors: Alexandre Belloni (Duke) + Kengo Kato (Tokyo)

More information

Dirichlet Processes: Tutorial and Practical Course

Dirichlet Processes: Tutorial and Practical Course Dirichlet Processes: Tutorial and Practical Course (updated) Yee Whye Teh Gatsby Computational Neuroscience Unit University College London August 2007 / MLSS Yee Whye Teh (Gatsby) DP August 2007 / MLSS

More information

DISCUSSION OF A SIGNIFICANCE TEST FOR THE LASSO. By Peter Bühlmann, Lukas Meier and Sara van de Geer ETH Zürich

DISCUSSION OF A SIGNIFICANCE TEST FOR THE LASSO. By Peter Bühlmann, Lukas Meier and Sara van de Geer ETH Zürich Submitted to the Annals of Statistics DISCUSSION OF A SIGNIFICANCE TEST FOR THE LASSO By Peter Bühlmann, Lukas Meier and Sara van de Geer ETH Zürich We congratulate Richard Lockhart, Jonathan Taylor, Ryan

More information

Hierarchical models. Dr. Jarad Niemi. August 31, Iowa State University. Jarad Niemi (Iowa State) Hierarchical models August 31, / 31

Hierarchical models. Dr. Jarad Niemi. August 31, Iowa State University. Jarad Niemi (Iowa State) Hierarchical models August 31, / 31 Hierarchical models Dr. Jarad Niemi Iowa State University August 31, 2017 Jarad Niemi (Iowa State) Hierarchical models August 31, 2017 1 / 31 Normal hierarchical model Let Y ig N(θ g, σ 2 ) for i = 1,...,

More information

Time Series and Dynamic Models

Time Series and Dynamic Models Time Series and Dynamic Models Section 1 Intro to Bayesian Inference Carlos M. Carvalho The University of Texas at Austin 1 Outline 1 1. Foundations of Bayesian Statistics 2. Bayesian Estimation 3. The

More information

Thanks. Presentation help: B. Narasimhan, N. El Karoui, J-M Corcuera Grant Support: NIH, NSF, ARC (via P. Hall) p.2

Thanks. Presentation help: B. Narasimhan, N. El Karoui, J-M Corcuera Grant Support: NIH, NSF, ARC (via P. Hall) p.2 p.1 Collaborators: Felix Abramowich Yoav Benjamini David Donoho Noureddine El Karoui Peter Forrester Gérard Kerkyacharian Debashis Paul Dominique Picard Bernard Silverman Thanks Presentation help: B. Narasimhan,

More information

BERNSTEIN POLYNOMIAL DENSITY ESTIMATION 1265 bounded away from zero and infinity. Gibbs sampling techniques to compute the posterior mean and other po

BERNSTEIN POLYNOMIAL DENSITY ESTIMATION 1265 bounded away from zero and infinity. Gibbs sampling techniques to compute the posterior mean and other po The Annals of Statistics 2001, Vol. 29, No. 5, 1264 1280 CONVERGENCE RATES FOR DENSITY ESTIMATION WITH BERNSTEIN POLYNOMIALS By Subhashis Ghosal University of Minnesota Mixture models for density estimation

More information

MCMC algorithms for fitting Bayesian models

MCMC algorithms for fitting Bayesian models MCMC algorithms for fitting Bayesian models p. 1/1 MCMC algorithms for fitting Bayesian models Sudipto Banerjee sudiptob@biostat.umn.edu University of Minnesota MCMC algorithms for fitting Bayesian models

More information

Bayesian linear regression

Bayesian linear regression Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding

More information

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling Jae-Kwang Kim 1 Iowa State University June 26, 2013 1 Joint work with Shu Yang Introduction 1 Introduction

More information

A Tight Excess Risk Bound via a Unified PAC-Bayesian- Rademacher-Shtarkov-MDL Complexity

A Tight Excess Risk Bound via a Unified PAC-Bayesian- Rademacher-Shtarkov-MDL Complexity A Tight Excess Risk Bound via a Unified PAC-Bayesian- Rademacher-Shtarkov-MDL Complexity Peter Grünwald Centrum Wiskunde & Informatica Amsterdam Mathematical Institute Leiden University Joint work with

More information

Consistent high-dimensional Bayesian variable selection via penalized credible regions

Consistent high-dimensional Bayesian variable selection via penalized credible regions Consistent high-dimensional Bayesian variable selection via penalized credible regions Howard Bondell bondell@stat.ncsu.edu Joint work with Brian Reich Howard Bondell p. 1 Outline High-Dimensional Variable

More information

Nonparametric Bayes tensor factorizations for big data

Nonparametric Bayes tensor factorizations for big data Nonparametric Bayes tensor factorizations for big data David Dunson Department of Statistical Science, Duke University Funded from NIH R01-ES017240, R01-ES017436 & DARPA N66001-09-C-2082 Motivation Conditional

More information

Asymptotic properties of posterior distributions in nonparametric regression with non-gaussian errors

Asymptotic properties of posterior distributions in nonparametric regression with non-gaussian errors Ann Inst Stat Math (29) 61:835 859 DOI 1.17/s1463-8-168-2 Asymptotic properties of posterior distributions in nonparametric regression with non-gaussian errors Taeryon Choi Received: 1 January 26 / Revised:

More information

arxiv: v2 [math.st] 12 Feb 2008

arxiv: v2 [math.st] 12 Feb 2008 arxiv:080.460v2 [math.st] 2 Feb 2008 Electronic Journal of Statistics Vol. 2 2008 90 02 ISSN: 935-7524 DOI: 0.24/08-EJS77 Sup-norm convergence rate and sign concentration property of Lasso and Dantzig

More information

Introduction to Bayesian learning Lecture 2: Bayesian methods for (un)supervised problems

Introduction to Bayesian learning Lecture 2: Bayesian methods for (un)supervised problems Introduction to Bayesian learning Lecture 2: Bayesian methods for (un)supervised problems Anne Sabourin, Ass. Prof., Telecom ParisTech September 2017 1/78 1. Lecture 1 Cont d : Conjugate priors and exponential

More information

Bayesian Variable Selection for Skewed Heteroscedastic Response

Bayesian Variable Selection for Skewed Heteroscedastic Response Bayesian Variable Selection for Skewed Heteroscedastic Response arxiv:1602.09100v2 [stat.me] 3 Jul 2017 Libo Wang 1, Yuanyuan Tang 1, Debajyoti Sinha 1, Debdeep Pati 1, and Stuart Lipsitz 2 1 Department

More information

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

More information

Can we do statistical inference in a non-asymptotic way? 1

Can we do statistical inference in a non-asymptotic way? 1 Can we do statistical inference in a non-asymptotic way? 1 Guang Cheng 2 Statistics@Purdue www.science.purdue.edu/bigdata/ ONR Review Meeting@Duke Oct 11, 2017 1 Acknowledge NSF, ONR and Simons Foundation.

More information

Foundations of Nonparametric Bayesian Methods

Foundations of Nonparametric Bayesian Methods 1 / 27 Foundations of Nonparametric Bayesian Methods Part II: Models on the Simplex Peter Orbanz http://mlg.eng.cam.ac.uk/porbanz/npb-tutorial.html 2 / 27 Tutorial Overview Part I: Basics Part II: Models

More information

Introduction to Bayesian Statistics with WinBUGS Part 4 Priors and Hierarchical Models

Introduction to Bayesian Statistics with WinBUGS Part 4 Priors and Hierarchical Models Introduction to Bayesian Statistics with WinBUGS Part 4 Priors and Hierarchical Models Matthew S. Johnson New York ASA Chapter Workshop CUNY Graduate Center New York, NY hspace1in December 17, 2009 December

More information

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution

Outline. Binomial, Multinomial, Normal, Beta, Dirichlet. Posterior mean, MAP, credible interval, posterior distribution Outline A short review on Bayesian analysis. Binomial, Multinomial, Normal, Beta, Dirichlet Posterior mean, MAP, credible interval, posterior distribution Gibbs sampling Revisit the Gaussian mixture model

More information

Sparsity and the truncated l 2 -norm

Sparsity and the truncated l 2 -norm Lee H. Dicker Department of Statistics and Biostatistics utgers University ldicker@stat.rutgers.edu Abstract Sparsity is a fundamental topic in highdimensional data analysis. Perhaps the most common measures

More information

Estimating Sparse High Dimensional Linear Models using Global-Local Shrinkage

Estimating Sparse High Dimensional Linear Models using Global-Local Shrinkage Estimating Sparse High Dimensional Linear Models using Global-Local Shrinkage Daniel F. Schmidt Centre for Biostatistics and Epidemiology The University of Melbourne Monash University May 11, 2017 Outline

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

A New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables

A New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables A New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables Qi Tang (Joint work with Kam-Wah Tsui and Sijian Wang) Department of Statistics University of Wisconsin-Madison Feb. 8,

More information

MIT Spring 2016

MIT Spring 2016 MIT 18.655 Dr. Kempthorne Spring 2016 1 MIT 18.655 Outline 1 2 MIT 18.655 Decision Problem: Basic Components P = {P θ : θ Θ} : parametric model. Θ = {θ}: Parameter space. A{a} : Action space. L(θ, a) :

More information