Extended Bayesian Information Criteria for Model Selection with Large Model Spaces

1 Extended Bayesian Information Criteria for Model Selection with Large Model Spaces
Jiahua Chen, University of British Columbia
Zehua Chen, National University of Singapore
(Biometrika, 2008)

2 Variable Selection
Classical criteria:
- Akaike Information Criterion (Akaike, 1973)
- Bayesian Information Criterion (Schwarz, 1978)
- Cross-Validation (Stone, 1974)
Large P, small n (e.g., genome-wide association studies):
- The above criteria tend to select many spurious covariates.
- This is a great challenge.

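3 Bayesian Information Criterion (BIC)
- $\{(y_i, x_i) : i = 1, \dots, n\}$: independent observations
- Density $f(y_i \mid x_i, \theta)$, $\theta \in \mathbb{R}^P$
- Likelihood $L_n(\theta) = \prod_{i=1}^{n} f(y_i \mid x_i, \theta)$
- $s \subset \{1, \dots, P\}$; $\theta(s)$: $\theta$ with the components outside $s$ set to 0
- $\mathrm{BIC}(s) = -2 \log L_n\{\hat\theta(s)\} + \nu(s) \log n$, where $\hat\theta(s)$ is the MLE of $\theta(s)$ and $\nu(s)$ is the number of components in $s$.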
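The BIC definition on this slide can be illustrated with a minimal sketch. It assumes a Gaussian linear model whose error variance is profiled out (estimated as RSS/n) and not counted in $\nu(s)$; the function names and the RSS values in the usage note are hypothetical and are not taken from the slides.

```python
import math


def gaussian_loglik_from_rss(rss: float, n: int) -> float:
    """Maximized Gaussian log-likelihood with sigma^2 replaced by rss / n.

    Assumption for illustration: a linear model with normal errors.
    """
    return -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)


def bic(loglik: float, nu: int, n: int) -> float:
    """BIC(s) = -2 log L_n{theta_hat(s)} + nu(s) log n."""
    return -2.0 * loglik + nu * math.log(n)
```

For example, with n = 200, a 3-covariate model with RSS 120.0 gets a smaller (better) BIC than a 5-covariate model with RSS 118.0, because the small gain in fit does not pay for the extra 2 log n penalty.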

4 BIC: approximate Bayes approach
- $S$: model space under consideration
- $p(s)$: prior probability of model $s$
- The posterior probability of model $s$ is $\Pr(s \mid Z) \propto \Pr(Z \mid s)\, p(s)$, where
  $$\Pr(Z \mid s) = \int \Pr(Z \mid \theta(s), s)\, \Pr(\theta(s) \mid s)\, d\theta(s).$$

5 BIC: approximate Bayes approach
- Choose the $s$ that maximizes $\log\{\Pr(Z \mid s)\, p(s)\}$.
- An approximation to the integral, followed by some simplifications, gives
  $$\log \Pr(Z \mid s) = \log L_n\{\hat\theta(s)\} - \frac{\nu(s)}{2} \log n + O(1).$$
- So,
  $$\log\{\Pr(Z \mid s)\, p(s)\} = \log \Pr(Z \mid s) + \log p(s) = \log L_n\{\hat\theta(s)\} - \frac{\nu(s)}{2} \log n + O(1) + \log p(s) = -\frac{1}{2}\,\mathrm{BIC}(s) + O(1) + \log p(s).$$

6 Constant prior assumption behind BIC
- An implicit assumption underlying the use of BIC: $p(s)$ is constant over $s$ (uniform prior).
- This may not be reasonable when P is large.
- $S_j$: class of models containing $j$ covariates; e.g., $S_1$ is the collection of models with a single covariate.
- Under the constant prior, $\Pr(S_j) \propto \tau(S_j)$, where $\tau(S_j)$ is the size of $S_j$.
- Example: with P = 1000, $\tau(S_1) = 1000$ versus $\tau(S_2) = 1000 \times 999 / 2$.
- The constant prior therefore prefers larger models.
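The class sizes in this example are binomial coefficients, assuming $S_j$ consists of all subsets of $j$ covariates out of P. A short sketch (the helper name `class_size` is hypothetical) shows how quickly the prior mass shifts toward larger models under a uniform prior over individual models:

```python
from math import comb


def class_size(P: int, j: int) -> int:
    """tau(S_j): number of models with exactly j of the P covariates."""
    return comb(P, j)


P = 1000
tau_1 = class_size(P, 1)  # 1000
tau_2 = class_size(P, 2)  # 1000 * 999 / 2 = 499500
# Under a uniform prior on models, Pr(S_2) / Pr(S_1) = tau_2 / tau_1.
prior_ratio = tau_2 / tau_1  # 499.5
```

So, before seeing any data, the class of two-covariate models is about 500 times as probable as the class of one-covariate models.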

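7 Extended BIC
- $S = \bigcup_{j=1}^{P} S_j$
- $\Pr(S_j) \propto \tau^{\xi}(S_j)$, where $0 \le \xi \le 1$
- $\Pr(s \mid S_j) = 1/\tau(S_j)$ for any $s \in S_j$ (all models in $S_j$ are equally plausible)
- Hence $p(s) \propto \tau^{-\gamma}(S_j)$ for $s \in S_j$, where $\gamma = 1 - \xi$
- Extended BIC:
  $$\mathrm{BIC}_\gamma(s) = -2 \log L_n\{\hat\theta(s)\} + \nu(s) \log n + 2\gamma \log \tau(S_j), \quad 0 \le \gamma \le 1,$$
  for $s \in S_j$.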
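As a rough sketch of how the extended criterion could be computed, the function below assumes that $S_j$ is the set of all size-$j$ subsets of the P covariates, so that $\tau(S_j) = \binom{P}{j}$ with $j = \nu(s)$. The names `log_binom` and `ebic` are hypothetical; the log binomial coefficient is evaluated with `math.lgamma` to avoid huge integers.

```python
import math


def log_binom(P: int, j: int) -> float:
    """log of the binomial coefficient C(P, j), via log-gamma."""
    return math.lgamma(P + 1) - math.lgamma(j + 1) - math.lgamma(P - j + 1)


def ebic(loglik: float, nu: int, n: int, P: int, gamma: float) -> float:
    """BIC_gamma(s) = -2 log L + nu log n + 2 gamma log tau(S_nu)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return -2.0 * loglik + nu * math.log(n) + 2.0 * gamma * log_binom(P, nu)
```

With gamma = 0 the extra term vanishes and the function returns the ordinary BIC; with gamma = 1 the full penalty 2 log tau(S_j) is added.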

8 Consistency of the Extended BIC
- $P = p_n = O(n^{\kappa})$ as $n \to \infty$, for some $\kappa > 0$
- Model:
  $$y_n = X_n \beta + \epsilon_n, \quad (1)$$
  where $\epsilon_n \sim N(0, \sigma^2 I_n)$
- $s_0$: the smallest subset of $\{1, \dots, p_n\}$ such that $\mu_n = E(y_n) = X_n(s_0)\beta(s_0)$, where $X_n(s_0)$ and $\beta(s_0)$ are the design matrix and coefficients corresponding to $s_0$
- Call $s_0$ the true submodel.

9 Asymptotic identifiability
- $K_0 = \nu(s_0)$
- $H_n(s)$: projection matrix of $X_n(s)$
- $\Delta_n(s) = \|\mu_n - H_n(s)\mu_n\|^2$
Condition 1 (asymptotic identifiability). Model (1) with true submodel $s_0$ is asymptotically identifiable if
$$\lim_{n \to \infty} \min\{(\log n)^{-1} \Delta_n(s) : s \ne s_0,\ \nu(s) \le K_0\} = \infty.$$
Interpretation: no other model of comparable size can predict the response well.

10 Consistency of the Extended BIC
Theorem. Assume that $p_n = O(n^{\kappa})$ for some constant $\kappa$. If $\gamma > 1 - 1/(2\kappa)$, then, under the asymptotic identifiability condition,
$$\Pr\left[\min\{\mathrm{BIC}_\gamma(s) : \nu(s) = j,\ s \ne s_0\} > \mathrm{BIC}_\gamma(s_0)\right] \to 1$$
for $j = 1, \dots, K$, as $n \to \infty$ ($K$ is an upper bound for $K_0$).
- If $\gamma = 1$: consistent for any $\kappa > 0$
- If $\gamma = 0$ (original BIC): consistent for $\kappa < 0.5$

11 Simulation Studies
- $\gamma$ = 0, 0.5 and 1
- Computing $\mathrm{BIC}_\gamma(s)$ for all possible $s$ is not affordable.
- LASSO (Tibshirani, 1996), SCAD (Fan and Li, 2001):
  - Increase $\lambda$ gradually.
  - $\mathrm{BIC}_\gamma$ is computed for the resulting sequence of nested models.
- If $P \gg n$, use a tournament (Chen and Chen, 2008):
  1. randomly partition the covariates (each group: n/2 covariates)
  2. apply LASSO or SCAD
  3. pool the nonzero components
  4. repeat if necessary
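The four tournament steps above can be sketched as follows. This is only an outline under stated assumptions: `select` is a hypothetical callback standing in for a LASSO or SCAD fit that returns the covariates with nonzero coefficients within one group, and `max_final` and `max_rounds` are illustrative stopping rules not given on the slide.

```python
import random
from typing import Callable, Iterable, List


def tournament(
    covariates: Iterable[int],
    group_size: int,
    select: Callable[[List[int]], List[int]],
    max_final: int,
    max_rounds: int = 10,
    seed: int = 0,
) -> List[int]:
    """Repeatedly partition, select within groups, and pool the survivors."""
    rng = random.Random(seed)
    pool = list(covariates)
    for _ in range(max_rounds):
        if len(pool) <= max_final:
            break
        # Step 1: random partition into groups of at most group_size.
        rng.shuffle(pool)
        groups = [pool[i:i + group_size] for i in range(0, len(pool), group_size)]
        # Steps 2-3: apply the selector to each group and pool its picks.
        survivors = [j for g in groups for j in select(g)]
        if len(survivors) >= len(pool):
            break  # no reduction; stop rather than loop forever
        pool = survivors  # Step 4: repeat on the pooled candidates
    return sorted(pool)
```

The surviving candidates would then be passed to a final LASSO or SCAD path, along which BIC_gamma is evaluated for the nested models.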

12 Simulation Studies
Linear model in all cases: $y_i = x_i^T(s)\beta(s) + \epsilon_i$ for some $s$, where $\epsilon_i \sim N(0, 1)$.
Case 1: P = 50, n = 200, with three correlation structures:
(i) $\mathrm{cor}(x_j, x_k) = \rho$ for all $(j, k)$
(ii) $\mathrm{cor}(x_j, x_k) = \rho$ for $k = j \pm 1$
(iii) $\mathrm{cor}(x_j, x_k) = \rho^{|k - j|}$
$\beta(s) = (7, 9, 4, 3, 10, 2, 2, 1)^T / 10$
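The three correlation structures of Case 1 can be written down as matrices. The sketch below makes two assumptions not stated on the slide: in structure (ii) all non-adjacent pairs are taken to be uncorrelated, and the structure labels ("exchangeable", "neighbor", "ar1") are hypothetical names chosen here for readability.

```python
from typing import List


def corr_matrix(P: int, rho: float, structure: str) -> List[List[float]]:
    """Build a P x P correlation matrix for one of the Case 1 structures."""

    def entry(j: int, k: int) -> float:
        if j == k:
            return 1.0
        if structure == "exchangeable":  # (i): rho for all pairs
            return rho
        if structure == "neighbor":  # (ii): rho only for k = j +/- 1
            return rho if abs(j - k) == 1 else 0.0
        if structure == "ar1":  # (iii): rho ** |k - j|
            return rho ** abs(j - k)
        raise ValueError(f"unknown structure: {structure}")

    return [[entry(j, k) for k in range(P)] for j in range(P)]
```

A matrix built this way could be handed to a multivariate normal sampler to generate the covariates; note that for the neighbor structure the matrix is only a valid correlation matrix when rho is small enough in absolute value (roughly below 0.5).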

13 Results of Case 1
- $s^*$: the model selected by the extended BIC
- Positive Selection Rate (PSR): $\nu(s^* \cap s)/\nu(s)$
- False Discovery Rate (FDR): $\nu(s^* \setminus s)/\nu(s^*)$
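The two rates defined on this slide are simple set computations. A minimal sketch follows; the function name `psr_fdr` and the convention of reporting an FDR of 0 when nothing is selected are assumptions made here for illustration.

```python
from typing import Set, Tuple


def psr_fdr(selected: Set[int], true_model: Set[int]) -> Tuple[float, float]:
    """Return (PSR, FDR) for a selected model against the true model."""
    if not true_model:
        raise ValueError("true model must contain at least one covariate")
    # PSR: share of true covariates that were selected.
    psr = len(selected & true_model) / len(true_model)
    # FDR: share of selected covariates that are not in the true model.
    fdr = len(selected - true_model) / len(selected) if selected else 0.0
    return psr, fdr
```

For instance, selecting {1, 2, 5} when the true model is {1, 2, 3, 4} recovers two of the four true covariates (PSR = 0.5) and includes one false covariate among three selections (FDR = 1/3).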

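14 Simulation Studies: Case 2
- P = 1000 covariates in 20 groups of size 50 (the value of n is missing from this transcription)
- First 10 groups: generated as in Case 1
- The other 10 groups: generated from a discrete distribution
- $\beta(s) = (10, 7, 5, 3, 2)^T / 10$
- The tournament approach: the covariates are partitioned into subsets (subset size missing from this transcription); from each subset, 12 covariates are selected to form the final candidate covariates (total count missing from this transcription).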

15 Results of Case 2

16 Simulation Studies: Case 3
A dataset from a genome-wide association study:
- y: mRNA level of a particular gene
- X: 1414 single-nucleotide polymorphisms (SNPs)
- n = 233
Settings:
- Setting 1. y: randomly permuted
- Setting 2. y: randomly generated under the assumption of no association
- Setting 3. y: generated from a linear model with $\beta(s) = (1.56, 1.09, 1.22, 0.06, 0.08, 0.012, 0.067, 0.047, 0.07, 0.05)^T$
- Setting 4. y: generated from a linear model with $\beta(s) = (0.31, 0.23, 0.42, 0.32, 0.33, 0.26, 0.41, 0.29, 0.35, 0.69)^T$

17 Results of Case 3

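18 Summary: $\mathrm{BIC}_\gamma$
- $\gamma = 1$:
  - effectively controls FDR
  - consistent for $\kappa > 0$ (with $p_n = O(n^{\kappa})$)
- $\gamma = 0$ (original BIC):
  - slightly better PSR, much worse FDR
  - consistent for $\kappa < 0.5$ (with $p_n = O(n^{\kappa})$)
- $\mathrm{BIC}_\gamma$ incurs a small loss in PSR but tightly controls FDR.
- Another way of choosing $\gamma$:
  1. Set $P = n^{\kappa}$, i.e. $\kappa = \log P / \log n$.
  2. Take $\gamma = 1 - 1/(2\kappa)$.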
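The two-step rule for choosing gamma can be sketched as below. The function name `choose_gamma` is hypothetical, and clipping the result to [0, 1] is an assumption added here so that the value stays in the range allowed for the extended BIC when P is small relative to n.

```python
import math


def choose_gamma(P: int, n: int) -> float:
    """Solve P = n**kappa for kappa, then return 1 - 1/(2 kappa), clipped to [0, 1]."""
    if P < 2 or n < 2:
        raise ValueError("P and n must both be at least 2")
    kappa = math.log(P) / math.log(n)
    gamma = 1.0 - 1.0 / (2.0 * kappa)
    return min(1.0, max(0.0, gamma))
```

For example, P = 10000 and n = 100 give kappa = 2 and gamma = 0.75. Since the consistency theorem requires gamma strictly greater than 1 - 1/(2 kappa), this value sits on the boundary, and a slightly larger gamma may be the safer choice.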


More information

F & B Approaches to a simple model

F & B Approaches to a simple model A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys

More information

Regularization Parameter Selection for a Bayesian Multi-Level Group Lasso Regression Model with Application to Imaging Genomics

Regularization Parameter Selection for a Bayesian Multi-Level Group Lasso Regression Model with Application to Imaging Genomics Regularization Parameter Selection for a Bayesian Multi-Level Group Lasso Regression Model with Application to Imaging Genomics arxiv:1603.08163v1 [stat.ml] 7 Mar 016 Farouk S. Nathoo, Keelin Greenlaw,

More information

Seminar über Statistik FS2008: Model Selection

Seminar über Statistik FS2008: Model Selection Seminar über Statistik FS2008: Model Selection Alessia Fenaroli, Ghazale Jazayeri Monday, April 2, 2008 Introduction Model Choice deals with the comparison of models and the selection of a model. It can

More information

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models

An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS023) p.3938 An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Vitara Pungpapong

More information

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA

The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA Presented by Dongjun Chung March 12, 2010 Introduction Definition Oracle Properties Computations Relationship: Nonnegative Garrote Extensions:

More information

Horseshoe, Lasso and Related Shrinkage Methods

Horseshoe, Lasso and Related Shrinkage Methods Readings Chapter 15 Christensen Merlise Clyde October 15, 2015 Bayesian Lasso Park & Casella (JASA 2008) and Hans (Biometrika 2010) propose Bayesian versions of the Lasso Bayesian Lasso Park & Casella

More information

Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space

Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space Asymptotic Equivalence of Regularization Methods in Thresholded Parameter Space Jinchi Lv Data Sciences and Operations Department Marshall School of Business University of Southern California http://bcf.usc.edu/

More information

Large-Scale Multiple Testing of Correlations

Large-Scale Multiple Testing of Correlations Large-Scale Multiple Testing of Correlations T. Tony Cai and Weidong Liu Abstract Multiple testing of correlations arises in many applications including gene coexpression network analysis and brain connectivity

More information

LTI Systems, Additive Noise, and Order Estimation

LTI Systems, Additive Noise, and Order Estimation LTI Systems, Additive oise, and Order Estimation Soosan Beheshti, Munther A. Dahleh Laboratory for Information and Decision Systems Department of Electrical Engineering and Computer Science Massachusetts

More information

Variable Selection in Multivariate Linear Regression Models with Fewer Observations than the Dimension

Variable Selection in Multivariate Linear Regression Models with Fewer Observations than the Dimension Variable Selection in Multivariate Linear Regression Models with Fewer Observations than the Dimension (Last Modified: November 3 2008) Mariko Yamamura 1, Hirokazu Yanagihara 2 and Muni S. Srivastava 3

More information

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III)

Lattice Data. Tonglin Zhang. Spatial Statistics for Point and Lattice Data (Part III) Title: Spatial Statistics for Point Processes and Lattice Data (Part III) Lattice Data Tonglin Zhang Outline Description Research Problems Global Clustering and Local Clusters Permutation Test Spatial

More information

Marginal Screening and Post-Selection Inference

Marginal Screening and Post-Selection Inference Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2

More information

arxiv: v1 [stat.me] 30 Dec 2017

arxiv: v1 [stat.me] 30 Dec 2017 arxiv:1801.00105v1 [stat.me] 30 Dec 2017 An ISIS screening approach involving threshold/partition for variable selection in linear regression 1. Introduction Yu-Hsiang Cheng e-mail: 96354501@nccu.edu.tw

More information

MASM22/FMSN30: Linear and Logistic Regression, 7.5 hp FMSN40:... with Data Gathering, 9 hp

MASM22/FMSN30: Linear and Logistic Regression, 7.5 hp FMSN40:... with Data Gathering, 9 hp Selection criteria Example Methods MASM22/FMSN30: Linear and Logistic Regression, 7.5 hp FMSN40:... with Data Gathering, 9 hp Lecture 5, spring 2018 Model selection tools Mathematical Statistics / Centre

More information

Bayesian Asymptotics

Bayesian Asymptotics BS2 Statistical Inference, Lecture 8, Hilary Term 2008 May 7, 2008 The univariate case The multivariate case For large λ we have the approximation I = b a e λg(y) h(y) dy = e λg(y ) h(y ) 2π λg (y ) {

More information

High-dimensional variable selection via tilting

High-dimensional variable selection via tilting High-dimensional variable selection via tilting Haeran Cho and Piotr Fryzlewicz September 2, 2010 Abstract This paper considers variable selection in linear regression models where the number of covariates

More information

ST440/540: Applied Bayesian Statistics. (9) Model selection and goodness-of-fit checks

ST440/540: Applied Bayesian Statistics. (9) Model selection and goodness-of-fit checks (9) Model selection and goodness-of-fit checks Objectives In this module we will study methods for model comparisons and checking for model adequacy For model comparisons there are a finite number of candidate

More information

Decision theory. 1 We may also consider randomized decision rules, where δ maps observed data D to a probability distribution over

Decision theory. 1 We may also consider randomized decision rules, where δ maps observed data D to a probability distribution over Point estimation Suppose we are interested in the value of a parameter θ, for example the unknown bias of a coin. We have already seen how one may use the Bayesian method to reason about θ; namely, we

More information