SUPPLEMENTARY APPENDICES FOR WAVELET-DOMAIN REGRESSION AND PREDICTIVE INFERENCE IN PSYCHIATRIC NEUROIMAGING


Submitted to the Annals of Applied Statistics

By Philip T. Reiss, Lan Huo, Yihong Zhao, Clare Kelly and R. Todd Ogden

APPENDIX A: SIMULATION DETAILS

A.1. Comparative simulation study. Given the true coefficient image β = β^(k) for k = 1 or 2 as defined in §4, we generated continuous outcomes

(A1)    y_i = x_i^T β + ε_i,    with ε_i ~ N(0, σ²),

with σ² chosen as the solution to

(A2)    1 − σ² / (s²_{xβ} + σ²) = R²,

where s²_{xβ} is the sample variance of x_1^T β, ..., x_n^T β, representing the variance explained by the model, and R² is the specified value (0.1 or 0.5). The left side of (A2) is similar to what Tibshirani and Knight (1999) called the theoretical R², and has the interpretation that, for responses generated according to (A1) and (A2), the coefficient of determination for the true model is approximately equal to the specified value R².
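Solving (A2) for the noise variance gives the closed form σ² = s²_{xβ}(1 − R²)/R². The following is a minimal sketch of this step (in Python with NumPy; the function name and the toy predictor matrix are our illustrative choices, not part of the paper's R-based code):

    import numpy as np

    def simulate_linear(X, beta, r2, rng):
        """Generate y = X beta + eps, with sigma^2 solving (A2) for the target R^2."""
        xb = X @ beta                       # linear predictors x_i^T beta
        s2_xb = np.var(xb, ddof=1)          # sample variance s^2_{x beta}
        sigma2 = s2_xb * (1.0 - r2) / r2    # closed-form solution of (A2)
        return xb + rng.normal(scale=np.sqrt(sigma2), size=len(xb))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(333, 4096))          # stand-in for n vectorized 64 x 64 images
    beta = np.zeros(4096); beta[:50] = 1.0    # stand-in true coefficient image
    y = simulate_linear(X, beta, r2=0.5, rng=rng)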

An R² analogue for logistic regression (Menard, 2000) is given by

(A3)    R²_L = 1 − log(L_M) / log(L_0),

where L_M is the likelihood of the given model and L_0 is the likelihood of the model containing only an intercept. For the logistic regression simulation settings we defined a theoretical version of R²_L analogous to the left side of (A2). Suppose we are simulating responses y_i ~ Bernoulli(p_i), i = 1, ..., n, where

(A4)    log[p_i / (1 − p_i)] = δ₀ + x_i^T β

for given δ₀, β, and predictors x_i (i = 1, ..., n). Let E(·) denote expectation under the assumed model (A4), and let L_{···} denote the likelihood based on the true values of the parameters given in the subscript(s), with parameters not given in the subscripts set to zero. Then the proposed variant of (A3) for use in simulations is

(A5)    R²_L = 1 − E log(L_{δ₀,β}) / E log(L_{δ₀})

(A6)         = 1 − [ Σ_{i=1}^n ( log{1 + exp(−δ₀ − x_i^T β)} / {1 + exp(−δ₀ − x_i^T β)} + log{1 + exp(δ₀ + x_i^T β)} / {1 + exp(δ₀ + x_i^T β)} ) ]
                 / [ Σ_{i=1}^n ( log{1 + exp(−δ₀)} / {1 + exp(−δ₀ − x_i^T β)} + log{1 + exp(δ₀)} / {1 + exp(δ₀ + x_i^T β)} ) ].

To perform simulations with a desired value of R²_L, we take β = sβ₀ in (A6) for a given β₀, and numerically solve for s > 0 such that (A6) equals the specified value.
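The expected log-likelihoods in (A5) can be computed directly from the true success probabilities, so the scaling s can be found by one-dimensional root finding. A sketch of this calibration step (Python/SciPy; the bracketing interval and all names are our assumptions):

    import numpy as np
    from scipy.optimize import brentq
    from scipy.special import xlogy

    def theoretical_r2l(s, delta0, xb0):
        """Evaluate (A6) with beta = s * beta0, where xb0 holds the values x_i^T beta0."""
        p = 1.0 / (1.0 + np.exp(-(delta0 + s * xb0)))        # true P(y_i = 1)
        q = 1.0 / (1.0 + np.exp(-delta0))                    # intercept-only probability
        e_full = np.sum(xlogy(p, p) + xlogy(1 - p, 1 - p))   # E log L_{delta0, beta}
        e_null = np.sum(xlogy(p, q) + xlogy(1 - p, 1 - q))   # E log L_{delta0}
        return 1.0 - e_full / e_null

    def solve_scale(delta0, xb0, target, upper=100.0):
        """Find s > 0 such that the theoretical R^2_L equals the target value."""
        return brentq(lambda s: theoretical_r2l(s, delta0, xb0) - target,
                      1e-8, upper)

Here xlogy handles the 0·log 0 limit cleanly; the upper bracket would need to be widened if the target R²_L were not attained below it.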

We used 8-fold CV with 8 repetitions (see §3.6), i.e., minimization of (3) or (4) with K = R = 8, to choose among the following candidate tuning parameter values. For FPCR, the number of basis functions along each dimension was chosen between 20 and 30, and the number of PCs was chosen from among the values 1–20; for given numbers of basis functions and of PCs, the roughness penalty parameter was chosen by restricted maximum likelihood (Reiss and Ogden, 2009; Wood, 2011). For the three wavelet methods, the decomposition level parameter j₀ was set to 4. For WPCR and WPLS, we retained c = 200 wavelet coefficients, and again chose from among 1–20 components. (In our experience, a small number of retained wavelet coefficients can sometimes attain the minimal CV, but at the expense of leading to highly unstable estimates; we therefore chose the moderate value c = 200 rather than varying c for what would likely be a minuscule reduction in CV.) For WNet, the mixing parameter α in the elastic net criterion was chosen from among 0.1, 0.4, 0.7, 1, and a dense grid of candidate λ values was automatically chosen by the glmnet algorithm (Friedman, Hastie and Tibshirani, 2010) for each value of α.

A.2. Permutation test simulation study. To test the power of the permutation procedure for model (A1), we again generated continuous responses with specified R² values, in the sense that σ² was chosen to satisfy (A2). For the linear model with a scalar covariate,

(A7)    y_i = t_i δ₁ + x_i^T β + ε_i,

we fixed the coefficient of determination R²_t = 0.2 for the scalar predictor, and set the partial coefficient of determination R²_{x|t} = 0.02, 0.04, ..., 0.16 for the image predictors (see Anderson-Sprecher, 1994), by choosing δ₁, σ² in (A7) to satisfy the pair of equations

(A8)    1 − (s²_{xβ} + σ²) / (s²_{tδ+xβ} + σ²) = R²_t = 0.2,    1 − σ² / (s²_{xβ} + σ²) = R²_{x|t},

where s²_{tδ+xβ} is the sample variance of the t_i δ₁ + x_i^T β (i = 1, ..., n). The second equation in (A8) can be solved directly for σ². Substituting this value into the first equation, and noting that s²_{tδ+xβ} = δ₁² s²_t + 2δ₁ s_{t,xβ} + s²_{xβ} (where s²_t is the sample variance of the t_i's and s_{t,xβ} is the sample covariance of the t_i's and the x_i^T β's), yields a quadratic equation that can be solved for δ₁.
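The algebra just described can be carried out in a few lines (a Python/NumPy sketch; the function name is ours). The second equation of (A8) yields σ², and the first then reduces to a quadratic in δ₁, of which we take the positive root:

    import numpy as np

    def solve_delta1_sigma2(t, xb, r2_t, r2_xt):
        """Solve (A8) for (delta_1, sigma^2), given t_i and xb_i = x_i^T beta."""
        s2_xb = np.var(xb, ddof=1)
        s2_t = np.var(t, ddof=1)
        s_txb = np.cov(t, xb, ddof=1)[0, 1]          # sample covariance of t and x^T beta
        sigma2 = s2_xb * (1 - r2_xt) / r2_xt         # from the second equation of (A8)
        c = (s2_xb + sigma2) / (1 - r2_t) - sigma2   # required value of s^2_{t delta + x beta}
        # quadratic: s2_t * d^2 + 2 s_txb * d + (s2_xb - c) = 0; take the positive root
        delta1 = (-s_txb + np.sqrt(s_txb**2 + s2_t * (c - s2_xb))) / s2_t
        return delta1, sigma2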

To evaluate the permutation test for logistic regression, we used the same set of R² values for the case without scalar covariates, and the same R²_t and R²_{x|t} values for the case with scalar covariates. As in §A.1, this required defining R²_t and R²_{x|t} (a partial R²) for simulating from a logistic regression model, in this case

(A9)    log[p_i / (1 − p_i)] = δ₀ + t_i δ₁ + x_i^T β.

With notation analogous to (A5), we define

R²_t = 1 − E log(L_{δ₀,δ₁}) / E log(L_{δ₀})
     = 1 − [ Σ_{i=1}^n ( log{1 + exp(−δ₀ − t_i δ₁)} / {1 + exp(−δ₀ − t_i δ₁ − x_i^T β)} + log{1 + exp(δ₀ + t_i δ₁)} / {1 + exp(δ₀ + t_i δ₁ + x_i^T β)} ) ]
          / [ Σ_{i=1}^n ( log{1 + exp(−δ₀)} / {1 + exp(−δ₀ − t_i δ₁ − x_i^T β)} + log{1 + exp(δ₀)} / {1 + exp(δ₀ + t_i δ₁ + x_i^T β)} ) ]

and

(A10)    R²_{x|t} = 1 − E log(L_{δ₀,δ₁,β}) / E log(L_{δ₀,δ₁})
                  = 1 − [ Σ_{i=1}^n ( log{1 + exp(−δ₀ − t_i δ₁ − x_i^T β)} / {1 + exp(−δ₀ − t_i δ₁ − x_i^T β)} + log{1 + exp(δ₀ + t_i δ₁ + x_i^T β)} / {1 + exp(δ₀ + t_i δ₁ + x_i^T β)} ) ]
                       / [ Σ_{i=1}^n ( log{1 + exp(−δ₀ − t_i δ₁)} / {1 + exp(−δ₀ − t_i δ₁ − x_i^T β)} + log{1 + exp(δ₀ + t_i δ₁)} / {1 + exp(δ₀ + t_i δ₁ + x_i^T β)} ) ].

Given (t_i, x_i) (i = 1, ..., n), δ₀, and β₀ such that β = sβ₀ for some s, attaining specified values of R²_t and R²_{x|t} reduces to solving the above two equations for δ₁ and s. Assuming the x_i's have mean zero, we can simplify the problem via the approximation

R²_t ≈ 1 − [ Σ_{i=1}^n ( log{1 + exp(−δ₀ − t_i δ₁)} / {1 + exp(−δ₀ − t_i δ₁)} + log{1 + exp(δ₀ + t_i δ₁)} / {1 + exp(δ₀ + t_i δ₁)} ) ]
           / [ Σ_{i=1}^n ( log{1 + exp(−δ₀)} / {1 + exp(−δ₀ − t_i δ₁)} + log{1 + exp(δ₀)} / {1 + exp(δ₀ + t_i δ₁)} ) ].

We treat this as an equality and solve it for δ₁, then insert the result into (A10) and solve for s.

APPENDIX B: PERMUTATION OF RESIDUALS

The original permutation of regressor residuals (PRR) procedure of Potter (2005) differs somewhat from what we propose in §5.1 of the main text. The PRR procedure (adapted slightly to the image-predictor context) uses the design matrix

(A11)    [T, Π(I − P_T)X]

rather than [T, P_T X + Π(I − P_T)X] as in (5); in other words, it would simply replace the X portion of the design matrix with the permuted residuals, instead of adding the permuted residuals back to P_T X. For the unpenalized model considered by Potter (2005) (see also Ridgway, 2009), the simpler design matrix (A11) is equivalent to (5). But for penalized models such as the wavelet-domain elastic net, the two design matrices tend to produce slightly different results. We therefore prefer the permuted-data design matrix (5), which preserves the original data's dependence between the scalar and image predictors. In a different neuroimaging setting, Winkler et al. (2014) show that PRR (which they refer to as the "Smith procedure") compares favorably with other permutation test procedures for linear models with nuisance predictors.
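Both design matrices are easy to form once the projection P_T = T(TᵀT)⁻¹Tᵀ is available. A sketch (Python/NumPy; the function name, toy dimensions, and the least-squares route to P_T X are our choices):

    import numpy as np

    def prr_design(T, X, perm, add_back=True):
        """Design matrices from Appendix B for one permutation (row indices perm).

        add_back=True : [T, P_T X + Pi (I - P_T) X], the permuted-data matrix;
        add_back=False: [T, Pi (I - P_T) X], Potter's simpler matrix (A11).
        """
        fitted = T @ np.linalg.lstsq(T, X, rcond=None)[0]   # P_T X via least squares
        resid = X - fitted                                  # (I - P_T) X
        Z = fitted + resid[perm] if add_back else resid[perm]
        return np.hstack([T, Z])

    rng = np.random.default_rng(1)
    n, p = 333, 4096
    T = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + scalar covariate
    X = rng.normal(size=(n, p))                             # vectorized images
    D = prr_design(T, X, rng.permutation(n))

With Π equal to the identity permutation, both variants reduce to the observed-data design matrix.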

APPENDIX C: LINEAR REGRESSION POWER SIMULATION RESULTS

Here we report linear regression simulation results for the permutation test procedure (see §5.1 for logistic regression results). We first considered the case without scalar covariates, and generated responses

(A12)    y_i = x_i^T β + ε_i,    with ε_i ~ N(0, σ²), i = 1, ..., n = 333,

where x_i ∈ R^{64²} is the ith image (expressed as a vector), β is the true coefficient image shown in Figure A1(a) (similarly vectorized), and σ² is chosen to attain approximate R² values as in Supplementary Appendix A. We simulated 200 response vectors to assess power to reject H₀: β = 0 at the p = .05 level for each of the R² values .04, .07, .1, .15, .2, .25, .3, as well as 1000 response vectors with β = 0 (R² = 0) to assess the type-I error rate. Next we considered testing the same null hypothesis for the model

(A13)    y_i = t_i δ₁ + x_i^T β + ε_i,    with ε_i ~ N(0, σ²),

with a scalar covariate t_i such that the R² for the submodel E(y_i | t_i) = t_i δ₁ is approximately 0.2. We generated the same number of response vectors as above for each of the above R² values, but here R² refers to the partial R² adjusting for t_i (see Supplementary Appendix A.2).

The results, displayed in Figure A1(b) and (c), indicate that the nominal type-I error rate is approximately attained for both models, and that the power exceeds 90% when R² is at least .15, for either model (A12) or model (A13).

Fig A1. (a) True coefficient image β used in the power study: gray denotes 0, black denotes 1. (b) Estimated probability of rejecting the null hypothesis β = 0, as a function of R², with 95% confidence intervals, for model (A12). (c) Same, for model (A13).

APPENDIX D: SELECTING A SUBSAMPLE OF THE ADHD-200 DATA SET

Of the 776 individuals in the ADHD-200 training sample, we considered only the 450 individuals who were right-handed and were either typically developing controls (340) or diagnosed with combined-type ADHD (110), the subtype expected to be most readily distinguishable from controls.

Head motion artifacts have recently emerged as a major concern in the resting-state fMRI literature (e.g., Van Dijk, Sabuncu and Buckner, 2012). Since there is as yet no consensus on how to address this issue, we chose to sacrifice a considerable amount of data in order to minimize the risk of spurious findings due to motion artifacts. We excluded those subjects whose mean framewise displacement (FD) (Power et al., 2012), a motion score, exceeded a fixed cutoff. We then matched the control and ADHD groups on mean FD by dividing the sample into mean FD deciles, and then randomly subsampling either controls or ADHD individuals within each decile to attain roughly equal control-to-ADHD ratios for each decile. This reduced the number of subjects to 333 (257 controls and 76 with combined-type ADHD; 198 males, 135 females).
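The decile matching step can be made concrete as follows (a Python/NumPy sketch; the rule of randomly thinning whichever group is over-represented within a decile is our reading of the verbal description above, and all names are illustrative):

    import numpy as np

    def match_on_fd_deciles(fd, is_adhd, rng):
        """Subsample within mean-FD deciles toward a common control:ADHD ratio."""
        ratio = is_adhd.mean() / (1 - is_adhd.mean())      # overall ADHD:control ratio
        edges = np.quantile(fd, np.linspace(0.1, 0.9, 9))  # inner decile boundaries
        decile = np.digitize(fd, edges)                    # decile label 0..9
        keep = []
        for d in range(10):                                # assumes each decile has both groups
            idx = np.where(decile == d)[0]
            adhd, ctrl = idx[is_adhd[idx]], idx[~is_adhd[idx]]
            if len(adhd) > ratio * len(ctrl):              # too many ADHD in this decile
                adhd = rng.choice(adhd, int(round(ratio * len(ctrl))), replace=False)
            else:                                          # too many controls in this decile
                ctrl = rng.choice(ctrl, int(round(len(adhd) / ratio)), replace=False)
            keep.extend(adhd); keep.extend(ctrl)
        return np.sort(np.array(keep, dtype=int))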

The fALFF data were processed and made available by the Neuro Bureau via the NITRC repository; the data and full details of the image processing steps are available at http://www.nitrc.org/plugins/mwiki/index.php/neurobureau:AthenaPipeline. Nonzero fALFF values were recorded only for voxels within the brain, but owing to inter-subject differences in scan volume coverage, the set of brain voxels varied somewhat among subjects. Our analysis included the 92 voxels located within the brain for all 333 subjects.

APPENDIX E: FROM 2D TO 3D PREDICTORS

We fitted model (8) by the wavelet-domain elastic net, using the same set of 333 individuals as in §6, but with 3D maps of voxels from the fALFF images rather than with 2D slices. Here, as in §4, we retained sufficiently many wavelet coefficients to capture 99.5% of the excess variance. We were particularly interested in whether the relative performance of lower vs. higher values of α (less sparse vs. more sparse fits) differed when 3D rather than 2D images were used.

Figure A2 shows the observed CV deviance when we used 1, 16, or all 32 of the axial slices. For 1 slice, the lowest CV score is attained with α = 1, i.e., the lasso. But for 16 or 32 slices, less sparse models, in particular α = 0.1, are favored. This suggests that as the number of voxels grows, choosing a sparse coefficient image incurs a higher cost in terms of predictive accuracy.

Fig A2. CV deviance for the wavelet-domain elastic net fitted to our subsample of the ADHD-200 data set using a set of voxels from the fALFF images. The three panels show results for 1 slice, 16 slices, and 32 slices, with CV deviance plotted against λ for each candidate mixing parameter α.
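The fits reported here use the R package glmnet; a rough Python analogue of the α comparison can be sketched with scikit-learn's elastic-net logistic regression, in which l1_ratio plays the role of α (the wavelet-coefficient matrix W and labels y below are placeholders, not the study data):

    import numpy as np
    from sklearn.linear_model import LogisticRegressionCV

    rng = np.random.default_rng(2)
    W = rng.normal(size=(333, 200))       # placeholder retained wavelet coefficients
    y = rng.integers(0, 2, size=333)      # placeholder diagnosis labels (0/1)

    model = LogisticRegressionCV(
        Cs=20,                            # grid over the overall penalty strength
        cv=8,                             # 8-fold CV, as in Appendix A.1
        penalty="elasticnet",
        solver="saga",                    # solver supporting the elastic-net penalty
        l1_ratios=[0.1, 0.4, 0.7, 1.0],   # candidate mixing parameters (alpha)
        scoring="neg_log_loss",           # CV deviance, up to sign and scale
        max_iter=5000,
    ).fit(W, y)
    print(model.l1_ratio_)                # mixing parameter attaining the best CV score

Unlike glmnet, scikit-learn parameterizes the overall penalty through C = 1/λ, so the Cs grid runs in the opposite direction to a glmnet λ path.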

REFERENCES

Anderson-Sprecher, R. (1994). Model comparisons and R². The American Statistician.

Friedman, J., Hastie, T. and Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software.

Menard, S. (2000). Coefficients of determination for multiple logistic regression analysis. The American Statistician.

Potter, D. M. (2005). A permutation test for inference in logistic regression with small- and moderate-sized data sets. Statistics in Medicine.

Power, J. D., Barnes, K. A., Snyder, A. Z., Schlaggar, B. L. and Petersen, S. E. (2012). Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. NeuroImage.

Reiss, P. T. and Ogden, R. T. (2009). Smoothing parameter selection for a class of semiparametric linear models. Journal of the Royal Statistical Society: Series B.

Ridgway, G. R. (2009). Statistical analysis for longitudinal MR imaging of dementia. PhD thesis, University College London.

Tibshirani, R. and Knight, K. (1999). The covariance inflation criterion for adaptive model selection. Journal of the Royal Statistical Society: Series B.

Van Dijk, K. R. A., Sabuncu, M. R. and Buckner, R. L. (2012). The influence of head motion on intrinsic functional connectivity MRI. NeuroImage.

Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M. and Nichols, T. E. (2014). Permutation inference for the general linear model. NeuroImage.

Wood, S. N. (2011). Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society: Series B.

Department of Child and Adolescent Psychiatry
New York University School of Medicine
One Park Avenue, 7th Floor
New York, NY 10016
phil.reiss@nyumc.org
lan.huo@nyumc.org
yihong.zhao@nyumc.org
amclarekelly@gmail.com

Department of Biostatistics
Columbia University
722 West 168th Street, 6th Floor
New York, NY 10032
to166@columbia.edu
