Bayesian variable selection via penalized credible regions. Brian Reich, NCSU. Joint work with Howard Bondell and Ander Wilson.


1 Bayesian variable selection via penalized credible regions. Brian Reich, NC State. Joint work with Howard Bondell and Ander Wilson.

2 Motivation. Big p, small n problems are ubiquitous in modern applications. We propose a new approach that: is computationally efficient; has good theoretical properties; makes new connections between Bayesian and penalized regression methods; and performs well in high dimensions. In addition, we extend this approach to the challenging problem of selecting an appropriate set of confounders in a health-effects study.

3 Variable selection for linear regression. Linear regression model: $y_i \sim N(x_i^T\beta, \sigma^2)$, with $n$ observations and $p$ predictors. Only a subset of the predictors are relevant. There are many existing approaches for the $p \ll n$ case, including penalization and Bayesian methods. Fewer methods have been proposed for the ultra-high-dimensional case with $p \gg n$.

4 Penalization methods. Minimize $\sum_{i=1}^n (y_i - x_i^T\beta)^2 + \lambda J(\beta)$. LASSO: $J(\beta) = \sum_{j=1}^p |\beta_j|$. Adaptive LASSO: $J(\beta) = \sum_{j=1}^p |\beta_j| / |\hat\beta_j|^q$. Elastic Net, SCAD, OSCAR, ... $\lambda$ is chosen by AIC, BIC, CV, or GCV. Selection is achieved by setting some $\beta_j$ exactly to zero.
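As a concrete illustration of the penalized path and tuning step described above, here is a minimal R sketch using the lars package (the same package the talk later uses for the credible-region method). X and y are assumed to be a centered design matrix and response; all object names are illustrative.

```r
library(lars)

# Full LASSO solution path: coefficients enter and leave as lambda varies
fit <- lars(X, y, type = "lasso")

# Choose the penalty by 10-fold cross-validation over the path
cv <- cv.lars(X, y, K = 10, type = "lasso", plot.it = FALSE)
s  <- cv$index[which.min(cv$cv)]

# Coefficients at the CV-chosen point; some are exactly zero
beta <- coef(fit, s = s, mode = "fraction")
selected <- which(beta != 0)
```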

5 Bayesian variable selection. Candidate models are indexed by $\delta = (\delta_1, \dots, \delta_p)^T$, where $\delta_j = 1$ if $x_j$ is included in the model and $\delta_j = 0$ if $x_j$ is excluded. The most common prior over model space is $\delta_j \overset{iid}{\sim} \mathrm{Bern}(\pi)$. The regression coefficients included in the model typically have Gaussian priors.

6 Bayesian variable selection. Bayes' rule then gives the posterior of $\delta$. We can select the model with the highest posterior probability, or select variables with large marginal inclusion probability $P(\delta_j = 1 \mid y)$. When $p$ is large, computing the probability of every possible $\delta$ is impossible. Stochastic search variable selection (SSVS) can be used to estimate these probabilities by including the model indicator $\delta$ as an unknown parameter updated in MCMC.

7 Lindley's Paradox. Posterior model probabilities can be sensitive to priors. Consider $y \sim N(\beta_1, 1)$ to test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$. [Figure: (a) posterior probability in favor of the null and (b) 95% posterior credible set, each plotted against the prior standard deviation for $y = 0, 2, 3$.]

8 Other drawbacks. Bayesian variable selection using SSVS requires: a choice of prior on model space (inclusion probabilities); proper prior distributions for the model parameters (regression coefficients and variances); and a posterior threshold for the inclusion probabilities. MCMC convergence can be slow. Marginal inclusion probabilities may be poor under high correlation: highly correlated predictors can substitute for one another, so each appears in the sampled models only a small fraction of the time.

9 Posterior credible regions are less sensitive to the prior. [Same figure as slide 7: (a) posterior probability in favor of the null and (b) 95% posterior credible set versus the prior standard deviation for $y = 0, 2, 3$.]

10 Using credible regions for variable selection. Denote by $C_\alpha \subset \mathbb{R}^p$ the $(1-\alpha) \times 100\%$ credible region for $\beta$. We could consider any $\beta \in C_\alpha$ as a feasible estimate. With $p = 1$, we remove $x_1$ if $0 \in C_\alpha$. Generalizing to higher dimensions, we could select the element of $C_\alpha$ with the most zeros: the sparsest feasible value. The following examples plot $C_\alpha$ with $p = 2$ predictors.

11 Drop both $X_1$ and $X_2$. [Figure: elliptical credible region in the $(\beta_1, \beta_2)$ plane containing the origin.]

12 Keep $X_2$, drop $X_1$. [Figure: credible region intersecting the $\beta_1 = 0$ axis but not containing the origin.]

13 Keep both $X_1$ and $X_2$. [Figure: credible region touching neither axis, so no coordinate can be set to zero.]

14 Formalizing the approach. Fit the full model using, say, priors $\beta \sim N\big(0, \tfrac{\sigma^2}{\tau} I_p\big)$ and $\sigma^2 \sim \mathrm{IG}(0.01, 0.01)$. From this model fit, compute the elliptical credible region $C_\alpha = \{\beta : (\beta - \hat\beta)^T \Sigma^{-1} (\beta - \hat\beta) \le c_\alpha\}$, where $\hat\beta$ and $\Sigma$ are the posterior mean and covariance. If $\tau$ is fixed, $C_\alpha$ is the highest-density region and $\hat\beta$ and $\Sigma$ have closed forms (no MCMC). If $\tau$ has a prior, $C_\alpha$ is still a valid credible region, and a simple, short MCMC run gives $\hat\beta$ and $\Sigma$.
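For the fixed-$\tau$ case just described, the posterior is available in closed form. A minimal sketch, assuming a known $\sigma^2$ is plugged in for simplicity (the IG prior on $\sigma^2$ would instead be handled by a short MCMC run or the analytic marginal); names are illustrative:

```r
# Closed-form posterior for the full model with fixed tau:
# y ~ N(X beta, sigma2 I), beta ~ N(0, (sigma2 / tau) I_p)
posterior_beta <- function(X, y, tau, sigma2) {
  A     <- crossprod(X) + tau * diag(ncol(X))  # X'X + tau I
  A_inv <- solve(A)
  bhat  <- drop(A_inv %*% crossprod(X, y))     # posterior mean (ridge form)
  Sigma <- sigma2 * A_inv                      # posterior covariance
  list(bhat = bhat, Sigma = Sigma)
}
```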

15 Joint credible regions. All points within the region may be considered feasible parameter values. Among these, we seek a sparsest solution, $\beta^* = \arg\min_{\beta \in C_\alpha} \|\beta\|_0$, where $\|\beta\|_0$ is the number of non-zero elements of $\beta$. The chosen model for a given $\alpha$ is defined by the set of non-zero indices, $\mathcal{A}_{\alpha_n} = \{j : \beta^*_j \neq 0\}$.

16 Problems with searching for the sparsest solution. The solution may not be unique, and finding it requires a high-dimensional combinatorial search. Following Lv and Fan (2009), replace the $L_0$ norm by a smooth bridge between $L_0$ and $L_1$, $\sum_{j=1}^p \rho_a(|\beta_j|)$, where for $t \in [0, \infty)$, $$\rho_a(t) = \frac{(a+1)t}{a+t} = \left(\frac{t}{a+t}\right) I(t \neq 0) + \left(\frac{a}{a+t}\right) t.$$ This has limits $\rho_0(t) = I(t \neq 0)$ and $\rho_\infty(t) = t$; $a \approx 0$ is the most relevant for our setting.

17 Computation. This penalty function is non-convex, so we consider a local linear approximation about the posterior mean $\hat\beta$: $\rho_a(|\beta_j|) \approx \rho_a(|\hat\beta_j|) + \rho_a'(|\hat\beta_j|)\big(|\beta_j| - |\hat\beta_j|\big)$, with $\rho_a'(|\hat\beta_j|) = \frac{a(a+1)}{(a + |\hat\beta_j|)^2}$. The Lagrangian form then gives $$\beta^* = \arg\min_\beta\; (\beta - \hat\beta)^T \Sigma^{-1} (\beta - \hat\beta) + \lambda_\alpha \sum_{j=1}^p \frac{|\beta_j|}{(a + |\hat\beta_j|)^2}.$$ There is a one-to-one correspondence between $\lambda_\alpha$ and $\alpha$.

18 Computation. For $a \to 0$, the optimization becomes $$\beta^* = \arg\min_\beta\; (\beta - \hat\beta)^T \Sigma^{-1} (\beta - \hat\beta) + \lambda_\alpha \sum_{j=1}^p \frac{|\beta_j|}{\hat\beta_j^2}.$$ This has the adaptive LASSO form. In fact, with a flat prior for $\beta$, this is equivalent to the adaptive LASSO. The LARS algorithm gives the full path for varying $\alpha$.
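Because the criterion is an adaptive-LASSO problem in a transformed design, the whole path can be traced with lars. A sketch under the assumptions above, with bhat and Sigma taken from the full-model fit and all names illustrative:

```r
library(lars)

# Path for: minimize (beta - bhat)' Sigma^{-1} (beta - bhat)
#                    + lambda * sum_j |beta_j| / bhat_j^2
pcr_path <- function(bhat, Sigma) {
  R <- chol(solve(Sigma))       # R'R = Sigma^{-1}, so the quadratic term
  ystar <- drop(R %*% bhat)     # becomes ||ystar - Xstar gamma||^2
  w <- bhat^2                   # adaptive weights (the a -> 0 limit)
  Xstar <- R %*% diag(w)        # absorb the weights: beta_j = w_j * gamma_j
  fit <- lars(Xstar, ystar, type = "lasso",
              normalize = FALSE, intercept = FALSE)
  # back-transform every step of the path to the beta scale
  scale(coef(fit), center = FALSE, scale = 1 / w)
}
```

Each row of the returned matrix is one point on the path; its non-zero entries define the selected model at the corresponding $\alpha$.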

19 Selection consistency. There is a one-to-one correspondence between $\alpha_n$ and $C_n$, where the credible set is defined by $(\beta - \hat\beta)^T \Sigma^{-1} (\beta - \hat\beta) \le C_n$. Varying $C_n$ gives a sequence of models $\mathcal{A}^n_{\alpha_n}$. The true model is denoted $\mathcal{A}$. Theorem: Under general conditions, if $C_n \to \infty$ and $n^{-1} C_n \to 0$, then the credible-set method is consistent in variable selection, i.e., $P(\mathcal{A}^n_{\alpha_n} = \mathcal{A}) \to 1$. The result also holds for growing $p$, provided $p/n \to 0$.

20 Selection consistency. What about $p \gg n$? The posterior mean has the ridge regression form $\hat\beta = (X^T X + \tau I)^{-1} X^T y$. If $p/n$ does not tend to 0, it can be shown that $\hat\beta$ is not mean-square consistent, and thus the method is not selection consistent.

21 Selection consistency. Consider marginal posterior thresholding: rectangular credible regions rather than elliptical ones. Theorem: Let $\tau \to \infty$ with $\tau = O\big((n^2 \log p)^{1/3}\big)$; then the posterior thresholding approach is consistent in selection when the dimension satisfies $p = O(\exp\{n^c\})$ for some $0 \le c < 1$. This gives selection consistency for exponentially growing dimension, $p = o(\exp\{n\})$. The result also applies to ridge regression with ridge parameter $\tau$.
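One simple version of the marginal thresholding just described keeps variable $j$ when its marginal $(1-\alpha)$ credible interval excludes zero. A sketch assuming a Gaussian posterior summary (bhat, Sigma) from the full model; names are illustrative:

```r
# Rectangular (coordinatewise) credible regions: keep j when the marginal
# interval bhat_j +/- z * sd_j excludes zero
threshold_select <- function(bhat, Sigma, alpha = 0.05) {
  sds <- sqrt(diag(Sigma))
  z <- qnorm(1 - alpha / 2)
  which(abs(bhat) > z * sds)
}
```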

22 Simulation study. Linear regression model with $N(0, 1)$ errors; $n = 60$ observations and $p \in \{50, 500, 2000\}$ predictors. The predictors are also standard normal, with AR(1) correlation and lag-1 correlation $\rho \in \{0.5, 0.8\}$. $\beta = (0_{10}, B_1, 0_{20}, B_2, 0_{p-40})^T$, where $B_1$ and $B_2$ are vectors of length 5 with elements drawn uniformly on $(-5, 5)$. Results are based on 200 datasets for each of the 6 setups.
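A sketch of one draw from this design (illustrative names; the AR(1) predictors are generated by a stationary recursion so each column is marginally standard normal):

```r
gen_data <- function(n = 60, p = 500, rho = 0.5) {
  Z <- matrix(rnorm(n * p), n, p)
  X <- Z
  for (j in 2:p)                      # AR(1) across columns, lag-1 corr rho
    X[, j] <- rho * X[, j - 1] + sqrt(1 - rho^2) * Z[, j]
  B1 <- runif(5, -5, 5)
  B2 <- runif(5, -5, 5)
  beta <- c(rep(0, 10), B1, rep(0, 20), B2, rep(0, p - 40))
  y <- drop(X %*% beta) + rnorm(n)    # N(0, 1) errors
  list(X = X, y = y, beta = beta)
}
```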

23 Simulation study. We consider the ordering of predictors induced by: joint credible regions; marginal posterior thresholding; stochastic search (with various choices of prior); and the (adaptive) LASSO. Two measures of the reliability of the ordering: the ROC curve (sensitivity vs. specificity, related to type I error) and the PRC (precision-recall) curve (related to the false discovery rate).

24 Simulation study with p = 50 and n = 60. [Figure: ROC (sensitivity vs. 1 - specificity) and precision-recall curves for the credible-set method and SSVS with (fixed, fixed) and (random, random) hyperpriors; ρ = 0.5 (top) and ρ = 0.8 (bottom).]

25 Simulation study with p = 500 and n = 60. [Figure: ROC and precision-recall curves for the credible-set method and SSVS with (fixed, fixed) and (random, random) hyperpriors; ρ = 0.5 (top) and ρ = 0.8 (bottom).]

26 Results with p = 500, n = 60, and ρ = 0.8.

Model                   PRC area (SE)   CPU time (sec)
Joint regions           (0.007)         21
Marginal regions        (0.007)         21
SSVS (fixed, fixed)     (0.010)         1223
SSVS (random, fixed)    (0.009)         1223
SSVS (fixed, random)    (0.010)         1223
SSVS (random, random)   (0.009)         1223

27 Simulation study with p = 2000 and n = 60. [Figure: ROC and precision-recall curves comparing the credible-set method to the LASSO; ρ = 0.5 (top) and ρ = 0.8 (bottom).]

28 Real data analysis. Mouse gene expression (Lan et al., 2006): n = 60 arrays (31 female, 29 male mice) and p = 22,690 genes. Screening via marginal correlations reduces to the top 2000 genes. Fit with n = 55, leaving out 5 for testing; repeated over 100 random splits.

Response   Model           MSPE          Model size
PEPCK      Joint regions   1.63 (0.10)   24.7 (1.19)
PEPCK      Lasso           1.91 (0.13)   48.7 (1.15)
GPAT       Joint regions   3.44 (0.35)   8.56 (0.44)
GPAT       Lasso           3.52 (0.27)   53.8 (0.44)
SCD1       Joint regions   1.84 (0.19)   37.6 (0.55)
SCD1       Lasso           1.64 (0.17)   51.1 (0.94)

29 Extension to confounder adjustment. Consider the linear model $Y = \beta_0 + X\beta_x + Z\beta_z + \epsilon_y$. We are interested in the effect of the exposure $X$ on the outcome $Y$. $Z$ is a set of potential confounding variables that are not of direct interest; for example, $X$ is an environmental exposure and $Z$ contains other factors such as SES.

30 Confounder adjustment. Often there are many potential confounders. If confounders are omitted, $\hat\beta_x$ may be biased; including too many covariates increases the variance of $\hat\beta_x$. The ideal model includes all confounders (variables correlated with both $X$ and $Y$) and all additional explanatory variables (correlated with $Y$).

31 Previous approaches to confounder adjustment. Crainiceanu et al. (2008) proposed a two-step approach: fit an exposure model to identify variables correlated with $X$, $$X = \gamma_0 + Z\gamma_z + \epsilon_x, \quad \epsilon_x \sim N(0, \sigma_x^2 I), \qquad (1)$$ and an outcome model to identify additional covariates, $$Y = \beta_0 + X\beta_x + Z\beta_z + \epsilon_y, \quad \epsilon_y \sim N(0, \sigma_y^2 I). \qquad (2)$$ Bayesian adjustment for confounding (BAC; Wang et al., 2012) combines these two steps and simultaneously fits the two models using SSVS; covariates included in (1) are automatically included in (2).

32 Our approach to confounder adjustment. We separately fit the outcome and exposure models, $Y = \beta_0 + X\beta_x + Z\beta_z + \epsilon_y$ and $X = \gamma_0 + Z\gamma_z + \epsilon_x$. Let $C_\alpha^\beta$ and $C_\alpha^\gamma$ be $(1-\alpha) \times 100\%$ posterior credible regions for $\beta = (\beta_0, \beta_x, \beta_z)$ and $\gamma = (\gamma_0, \gamma_z)$. We wish to identify the set of variables $\mathcal{A} = \{j : \beta_j \neq 0\} \cup \{j : \gamma_j \neq 0\}$. The proposed estimate is $$\beta^* = \arg\min_{\beta \in C_\alpha^\beta,\; \gamma \in C_\alpha^\gamma}\; \epsilon\,(\beta_x - \hat\beta_x)^2 + |\mathcal{A}|,$$ where $\epsilon$ is fixed and small.

33 Penalized regression reformulation. Using an approximation similar to the regression case gives $$\beta^* = \arg\min_\beta\; (\beta - \hat\beta)^T \Sigma_Y^{-1} (\beta - \hat\beta) + \lambda_\beta \sum_{j=1}^p \frac{|\beta_j|}{\big(|\hat\beta_j| + |\hat\gamma_j|\big)^2}.$$ This resembles the adaptive LASSO with weights based on $|\hat\beta_j| + |\hat\gamma_j|$. These weights reduce the penalty on coefficients of variables that appear to be important in either the outcome or the exposure regression.
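Under the same lars-based setup as the earlier sketch, only the weights change. A sketch assuming bhat and Sigma_Y come from the outcome model and ghat from the exposure model, aligned so entry j refers to the same covariate in both fits:

```r
library(lars)

pcr_confounder_path <- function(bhat, ghat, Sigma_Y) {
  R <- chol(solve(Sigma_Y))         # R'R = Sigma_Y^{-1}
  w <- (abs(bhat) + abs(ghat))^2    # light penalty if important in either model
  fit <- lars(R %*% diag(w), drop(R %*% bhat), type = "lasso",
              normalize = FALSE, intercept = FALSE)
  scale(coef(fit), center = FALSE, scale = 1 / w)  # back to the beta scale
}
```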

34 Oracle properties (fixed p). Theorem: Under general conditions for the linear model, if $\lambda_n / \sqrt{n} \to 0$ and $\lambda_n \sqrt{n} \to \infty$, the penalized credible region confounder method is consistent in variable selection and asymptotically normal. This implies that the true model will be contained in the solution path obtained by varying $\lambda$ with probability tending to one.

35 Simulation. We use the design of Wang et al. (2012): $Y \sim N(W\beta, I)$, with $W = (X, Z_1, \dots, Z_{57})$ and rows $W_i$ independent $N(0, \Sigma)$. Here $\Sigma = \{\sigma_{jk}\}$ with diagonal $\sigma_{jj} = 1$ and $\sigma_{jk} = 0.7^{|j-k|}$ for $|j - k| = 1, \dots, 7$ (and 0 otherwise). $\beta_x = \beta_1 = \cdots = \beta_{14} = 0.1$ and $\beta_{15} = \cdots = \beta_{57} = 0$.

36 Simulation. We compare three priors of the form $\beta_j \sim N(0, \sigma_y^2 / \tau_y)$ (and analogously for $\gamma_j$). Flat: $\tau = 0$. Empirical Bayes: $\hat\tau_y = \hat\sigma_y^2 \big/ \big[(p+1)^{-1} \sum_{j=0}^p \hat\beta_j^2\big]$. Full Bayes: $\tau_y \sim \mathrm{Ga}(0.001, 0.001)$. $\sigma_y^2$ and $\sigma_x^2$ have Gamma(0.001, 0.001) priors. For the flat and empirical Bayes priors, no MCMC is required. All models with elliptical credible regions can be fit with the LARS package in R.
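A one-line sketch of the empirical Bayes choice above, where bhat is the fitted coefficient vector including the intercept and sigma2_hat an estimate of the error variance (names illustrative):

```r
# Empirical Bayes precision: tau_y = sigma_y^2 / [ (p+1)^{-1} sum_j bhat_j^2 ]
tau_eb <- function(bhat, sigma2_hat) sigma2_hat / mean(bhat^2)
```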

37 Simulation. We compare with: the true model, $Y = X\beta_x + \sum_{j=1}^{14} Z_j\beta_j + \epsilon$; the full model, $Y = X\beta_x + \sum_{j=1}^{57} Z_j\beta_j + \epsilon$; Bayesian model averaging (BMA) with noninformative priors on $\beta$; the Bayesian adjustment for confounding (BAC) method of Wang et al. (2012); and the adaptive LASSO.

38 Simulation Results - Bias. [Figure: mean bias of $\hat\beta_x$ at sample sizes from n = 60 to n = 500, comparing the true model, the full model, the credible-region method with flat, empirical Bayes, and Gamma priors, BAC, and the adaptive LASSO.]

39 Simulation Results - MSE. [Figure: mean MSE of $\hat\beta_x$ at sample sizes from n = 60 to n = 500, for the same methods as the bias figure.]

40 Simulation Results - Computing time. [Figure: mean CPU time in seconds at sample sizes from n = 60 to n = 500, for the same methods.]

41 Analysis of air pollution data. We estimate the effect of PM$_{2.5}$ on birth weight in Mecklenburg County, NC. Data from the State Center for Health Statistics include birth records and covariates. Potential confounders include: season of birth, mean temperature in the first trimester, mean dew point in the first trimester, and several maternal health and SES variables. We used the credible regions method with empirical Bayes priors.

42 Estimated PM effect. [Figure: estimated PM$_{2.5}$ coefficient by method (Cred Reg, ALASSO, OLS, OLS with PM only) for Hispanic, Black, and White mothers.] The five methods give different results. The credible region estimates are similar to OLS.

43 Confounder selection. [Figure: inclusion probabilities for four potential confounders (mean dew point in first trimester, mean temperature in first trimester, birth in fall, birth in spring), by method (Cred Reg, ALASSO, OLS, OLS with PM only) and by race/ethnicity (Hispanic, Black, White).]

44 Summary. We have proposed new approaches for variable and confounder selection. The methods are computationally efficient, theoretically justified, and have good empirical performance. References: Bondell and Reich. Consistent high-dimensional Bayesian variable selection via penalized credible regions. JASA. Wilson and Reich. Confounder selection via penalized credible regions. Submitted. Support: EPA R835228, NIH 5R01ES, NSF. Thanks!
