Horseshoe, Lasso and Related Shrinkage Methods
Readings: Chapter 15, Christensen
Merlise Clyde
October 15, 2015
Bayesian Lasso

Park & Casella (JASA 2008) and Hans (Biometrika 2010) propose Bayesian versions of the Lasso:

$Y \mid \alpha, \beta^s, \phi \sim N(1_n \alpha + X^s \beta^s, I_n/\phi)$
$\beta^s \mid \alpha, \phi, \tau, \lambda \sim N(0, \mathrm{diag}(\tau^2)/\phi)$
$\tau_1^2, \ldots, \tau_p^2 \mid \alpha, \phi, \lambda \overset{iid}{\sim} \mathrm{Exp}(\lambda^2/2)$
$p(\alpha, \phi) \propto 1/\phi$

Can show that $\beta_j \mid \phi, \lambda \overset{iid}{\sim} \mathrm{DE}(\lambda\sqrt{\phi})$:

$\int_0^\infty \frac{\sqrt{\phi}}{\sqrt{2\pi s}}\, e^{-\frac{1}{2}\frac{\phi\beta^2}{s}} \,\frac{\lambda^2}{2}\, e^{-\frac{\lambda^2 s}{2}} \, ds = \frac{\lambda\phi^{1/2}}{2}\, e^{-\lambda\phi^{1/2}|\beta|}$

a scale mixture of normals (Andrews and Mallows 1974).
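The scale-mixture identity above is easy to check numerically. A quick sketch in Python (illustrative only; the course code is in R, and the values of $\phi$, $\lambda$, and $\beta$ below are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad

def mixture_density(beta, phi, lam):
    """Integrate the N(0, s/phi) density for beta against the
    Exp(lam^2/2) mixing density on the local scale s."""
    def integrand(s):
        norm = np.sqrt(phi / (2 * np.pi * s)) * np.exp(-phi * beta**2 / (2 * s))
        mix = (lam**2 / 2) * np.exp(-lam**2 * s / 2)
        return norm * mix
    val, _ = quad(integrand, 0, np.inf)
    return val

def de_density(beta, phi, lam):
    """Closed-form double-exponential DE(lam * sqrt(phi)) density."""
    rate = lam * np.sqrt(phi)
    return (rate / 2) * np.exp(-rate * abs(beta))

phi, lam, beta = 2.0, 1.5, 0.7
print(mixture_density(beta, phi, lam), de_density(beta, phi, lam))
```

The two numbers agree to numerical precision, confirming that the marginal prior on each coefficient is double exponential.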
Gibbs Sampling

Prior: $\lambda^2 \sim \mathrm{Gamma}(r, \delta)$

Integrate out $\alpha$: $\alpha \mid Y, \phi \sim N(\bar{y}, 1/(n\phi))$

Full conditionals:
- $\beta^s \mid \tau, \phi, \lambda, Y \sim N(\cdot, \cdot)$
- $\phi \mid \tau, \beta^s, \lambda, Y \sim \mathrm{G}(\cdot, \cdot)$
- $\lambda^2 \mid \beta^s, \phi, \tau^2, Y \sim \mathrm{G}(\cdot, \cdot)$
- $1/\tau_j^2 \mid \beta^s, \phi, \lambda, Y \sim \mathrm{InvGaussian}(\cdot, \cdot)$

If $X \sim \mathrm{InvGaussian}(\mu, \lambda)$, then

$f(x) = \sqrt{\frac{\lambda}{2\pi}}\, x^{-3/2}\, e^{-\frac{1}{2}\frac{\lambda(x-\mu)^2}{\mu^2 x}}, \quad x > 0$

Homework next week: derive the full conditionals for $\beta^s$, $\phi$, $1/\tau_j^2$.
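Sampling the $1/\tau_j^2$ full conditional requires inverse-Gaussian draws; a standard approach is the transformation method of Michael, Schucany and Haas (1976). A sketch in Python (illustrative; inside the Gibbs sampler the parameters $\mu$ and $\lambda$ would come from the full conditional derived in the homework):

```python
import numpy as np

def rinvgauss(mu, lam, size, rng):
    """Draw from InvGaussian(mu, lam) via the transformation method
    of Michael, Schucany and Haas (1976)."""
    nu = rng.standard_normal(size)
    y = nu**2
    x = mu + mu**2 * y / (2 * lam) - (mu / (2 * lam)) * np.sqrt(
        4 * mu * lam * y + mu**2 * y**2)
    u = rng.uniform(size=size)
    # Accept x with probability mu/(mu + x); otherwise take mu^2/x.
    return np.where(u <= mu / (mu + x), x, mu**2 / x)

rng = np.random.default_rng(0)
draws = rinvgauss(2.0, 3.0, 200_000, rng)
print(draws.mean())  # should be close to mu = 2
```

The sample mean and variance should be close to $\mu$ and $\mu^3/\lambda$, the moments of the inverse-Gaussian distribution.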
Horseshoe

Carvalho, Polson & Scott propose the prior distribution

$\beta^s \mid \phi, \tau \sim N(0_p, \mathrm{diag}(\tau^2)/\phi)$
$\tau_j \mid \lambda \overset{iid}{\sim} \mathrm{C}^+(0, \lambda^2)$ (difference is CPS notation)
$\lambda \sim \mathrm{C}^+(0, 1)$
$p(\alpha, \phi) \propto 1/\phi$

In the case $\lambda = \phi = 1$ and with the canonical representation $Y = I\beta + \epsilon$,

$E[\beta_i \mid Y] = \int_0^1 (1 - \kappa_i)\, y_i \, p(\kappa_i \mid Y) \, d\kappa_i = (1 - E[\kappa_i \mid y_i])\, y_i$

where $\kappa_i = 1/(1 + \tau_i^2)$ is the shrinkage factor.

The half-Cauchy prior induces a Beta(1/2, 1/2) distribution on $\kappa_i$ a priori.
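The induced Beta(1/2, 1/2) prior on $\kappa_i$ follows from a change of variables, with density $p(\kappa) = \pi^{-1}\kappa^{-1/2}(1-\kappa)^{-1/2}$, and can be verified by simulation. A sketch (with $\lambda = 1$; Python rather than the course's R):

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(1)
# tau ~ C+(0, 1): absolute value of a standard Cauchy draw
tau = np.abs(rng.standard_cauchy(200_000))
kappa = 1.0 / (1.0 + tau**2)

# Compare the empirical CDF of kappa with the Beta(1/2, 1/2) CDF
grid = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
ecdf = np.array([(kappa <= g).mean() for g in grid])
print(np.abs(ecdf - beta.cdf(grid, 0.5, 0.5)).max())  # small
```

The U-shaped Beta(1/2, 1/2) density puts mass near both $\kappa = 0$ (no shrinkage) and $\kappa = 1$ (total shrinkage), which is the "horseshoe" shape.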
Horseshoe

[Figure: Beta(1/2, 1/2) density of the shrinkage factor $\kappa$]
Prior Comparison (from PSC)

[Figure: comparison of prior densities]
Bounded Influence

Normal means case: $Y_i \overset{iid}{\sim} N(\beta_i, 1)$ (equivalent to the canonical case).

Posterior mean: $E[\beta \mid y] = y + \frac{d}{dy} \log m(y)$, where $m(y)$ is the predictive density under the prior (known $\lambda$).

HS has bounded influence: $\lim_{y \to \infty} \frac{d}{dy} \log m(y) = 0$, so $\lim_{y \to \infty} E[\beta \mid y] \to y$ (the MLE).

DE is also bounded influence, but the bound does not decay to zero in the tails.
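This tail behaviour can be seen numerically. In the canonical case with $\lambda = \phi = 1$, marginally $y \mid \kappa \sim N(0, 1/\kappa)$, so the posterior for $\kappa$ under its Beta(1/2, 1/2) prior has kernel $(1-\kappa)^{-1/2} e^{-\kappa y^2/2}$, and $E[\beta \mid y] = (1 - E[\kappa \mid y])\,y$. A sketch (my own derivation of the kernel, under the assumptions just stated):

```python
import numpy as np
from scipy.integrate import quad

def post_mean_kappa(y):
    """E[kappa | y] under kappa ~ Beta(1/2, 1/2) and y | kappa ~ N(0, 1/kappa).
    Posterior kernel: (1 - kappa)^(-1/2) * exp(-kappa * y^2 / 2)."""
    w = lambda k: (1 - k) ** -0.5 * np.exp(-k * y**2 / 2)
    num, _ = quad(lambda k: k * w(k), 0, 1)
    den, _ = quad(w, 0, 1)
    return num / den

for y in [0.5, 2.0, 5.0, 10.0]:
    # Posterior mean of beta is (1 - E[kappa | y]) * y
    print(y, 1 - post_mean_kappa(y))
```

The shrinkage weight $E[\kappa \mid y]$ is large for small $|y|$ and vanishes as $|y|$ grows, so the posterior mean approaches the observation itself: bounded influence.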
R Packages

The monomvn package in R includes blasso and bhs; see the Diabetes.R code.
Simulation Study with Diabetes Data

[Figure: RMSE comparison of OLS, LASSO, and HORSESHOE]
Other Options

A range of other scale mixtures has been used:

- Generalized Double Pareto (Armagan, Dunson & Lee): if $\lambda \sim \mathrm{Gamma}(\alpha, \eta)$ then $\beta_j^s \sim \mathrm{GDP}(\xi = \eta/\alpha, \alpha)$ with density
  $f(\beta_j^s) = \frac{1}{2\xi}\left(1 + \frac{|\beta_j^s|}{\xi\alpha}\right)^{-(1+\alpha)}$
- Normal-Exponential-Gamma (Griffin & Brown 2005): $\lambda^2 \sim \mathrm{Gamma}(\alpha, \eta)$
- Bridge - Power Exponential priors (stable mixing density)
- See the monomvn package on CRAN

Choice of prior? Properties? Fan & Li (JASA 2001) discuss variable selection via nonconcave penalties and oracle properties.
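The GDP density arises by mixing a double-exponential prior $\beta \mid \lambda \sim \mathrm{DE}(\lambda)$ over $\lambda \sim \mathrm{Gamma}(\alpha, \eta)$, which can be checked numerically. A sketch ($\alpha$, $\eta$, and the evaluation point are arbitrary illustrative choices):

```python
import math
import numpy as np
from scipy.integrate import quad

def gdp_mixture(b, alpha, eta):
    """Numerically mix DE(lam) over lam ~ Gamma(alpha, rate=eta)."""
    def integrand(lam):
        laplace = (lam / 2) * np.exp(-lam * abs(b))
        gamma_pdf = (eta**alpha / math.gamma(alpha)) * lam**(alpha - 1) * np.exp(-eta * lam)
        return laplace * gamma_pdf
    val, _ = quad(integrand, 0, np.inf)
    return val

def gdp_density(b, alpha, eta):
    """Closed-form GDP(xi = eta/alpha, alpha) density."""
    xi = eta / alpha
    return (1 / (2 * xi)) * (1 + abs(b) / (xi * alpha)) ** -(1 + alpha)

b, alpha, eta = 0.8, 2.0, 1.5
print(gdp_mixture(b, alpha, eta), gdp_density(b, alpha, eta))
```

The numerical mixture and the closed-form GDP density agree, mirroring how the exponential mixture earlier yields the double exponential.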
Choice of Estimator & Selection?

- Posterior mode (may set some coefficients to zero)
- Posterior mean (no selection, just shrinkage)
- The Bayesian posterior does not assign any probability to $\beta_j^s = 0$
- Selection based on the posterior mode is an ad hoc rule: select if $\kappa_i < 0.5$
- See the article by Datta & Ghosh: http://ba.stat.cmu.edu/journal/forthcoming/datta.pdf
- Selection can be solved as a post-analysis decision problem
- Selection as part of model uncertainty: add prior probability that $\beta_j^s = 0$ and combine with the decision problem

Remember: all models are wrong, but some may be useful!
More informationDirichlet-Laplace priors for optimal shrinkage
Dirichlet-Laplace priors for optimal shrinkage Anirban Bhattacharya, Debdeep Pati, Natesh S. Pillai, David B. Dunson January 22, 2014 arxiv:1401.5398v1 [math.st] 21 Jan 2014 Abstract Penalized regression
More informationRegression. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh.
Regression Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh September 24 (All of the slides in this course have been adapted from previous versions
More informationSparse Bayesian Logistic Regression with Hierarchical Prior and Variational Inference
Sparse Bayesian Logistic Regression with Hierarchical Prior and Variational Inference Shunsuke Horii Waseda University s.horii@aoni.waseda.jp Abstract In this paper, we present a hierarchical model which
More informationConsistent change point estimation. Chi Tim Ng, Chonnam National University Woojoo Lee, Inha University Youngjo Lee, Seoul National University
Consistent change point estimation Chi Tim Ng, Chonnam National University Woojoo Lee, Inha University Youngjo Lee, Seoul National University Outline of presentation Change point problem = variable selection
More informationMixtures of Prior Distributions
Mixtures of Prior Distributions Hoff Chapter 9, Liang et al 2007, Hoeting et al (1999), Clyde & George (2004) November 10, 2016 Bartlett s Paradox The Bayes factor for comparing M γ to the null model:
More informationIEOR165 Discussion Week 5
IEOR165 Discussion Week 5 Sheng Liu University of California, Berkeley Feb 19, 2016 Outline 1 1st Homework 2 Revisit Maximum A Posterior 3 Regularization IEOR165 Discussion Sheng Liu 2 About 1st Homework
More informationChris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010
Model-Averaged l 1 Regularization using Markov Chain Monte Carlo Model Composition Technical Report No. 541 Department of Statistics, University of Washington Chris Fraley and Daniel Percival August 22,
More informationBayesian Lasso Binary Quantile Regression
Noname manuscript No. (will be inserted by the editor) Bayesian Lasso Binary Quantile Regression Dries F Benoit Rahim Alhamzawi Keming Yu the date of receipt and acceptance should be inserted later Abstract
More informationMixtures of Prior Distributions
Mixtures of Prior Distributions Hoff Chapter 9, Liang et al 2007, Hoeting et al (1999), Clyde & George (2004) November 9, 2017 Bartlett s Paradox The Bayes factor for comparing M γ to the null model: BF
More informationarxiv: v1 [math.st] 13 Feb 2017
Adaptive posterior contraction rates for the horseshoe arxiv:72.3698v [math.st] 3 Feb 27 Stéphanie van der Pas,, Botond Szabó,,, and Aad van der Vaart, Leiden University and Budapest University of Technology
More informationNew Statistical Methods That Improve on MLE and GLM Including for Reserve Modeling GARY G VENTER
New Statistical Methods That Improve on MLE and GLM Including for Reserve Modeling GARY G VENTER MLE Going the Way of the Buggy Whip Used to be gold standard of statistical estimation Minimum variance
More informationThe Horseshoe Estimator for Sparse Signals
The Horseshoe Estimator for Sparse Signals Carlos M. Carvalho Nicholas G. Polson University of Chicago Booth School of Business James G. Scott Duke University Department of Statistical Science October
More informationHomework 1: Solutions
Homework 1: Solutions Statistics 413 Fall 2017 Data Analysis: Note: All data analysis results are provided by Michael Rodgers 1. Baseball Data: (a) What are the most important features for predicting players
More information