Graph Wavelets to Analyze Genomic Data with Biological Networks
|
|
- Phyllis Moore
- 5 years ago
- Views:
Transcription
1 Graph Wavelets to Analyze Genomic Data with Biological Networks Yunlong Jiao and Jean-Philippe Vert "Emerging Topics in Biological Networks and Systems Biology" symposium, Swedish Collegium for Advanced Study, Uppsala, October 11, 2017
2
3 Motivation
4 Typical problem X R n p gene expression profile of each patient Y Y n survival information of each patient n = p = Goal: learn to predict Y from X Difficult (n < p)
5 Regularized linear models Fit a linear model β R p by solving where min R(Y, Xβ) + λj(β), β Rp R(Y, Xβ) is an empirical risk to measures the fit to the training data J(β) is a penalty to control the complexity of the model λ > 0 is a regularization parameter
6 Standard regularizations min R(Y, Xβ) + λj(β) β Rp where Lasso: J(β) = β 1 for gene selection. Ridge: J(β) = β 2 2 to address n m. Elastic net: J(β) = α β (1 α) β 1
7 Network-based regularizations G = (V, E) a graph of genes J G (β) =? β should be "smooth" on the graph? Selected genes should be connected?
8 Examples J G (β) = i j (β i β j ) 2 (Rapaport et al., 2007) J G (β) = a β 1 + (1 a) i j J G (β) = (β i β j ) 2 (Li and Li, 2008) sup α R p : i j α 2 i +α2 j 1 α β (Jacob et al., 2009) J G (β) = a β 1 + (1 a) i j β i β j (Hoefling, 2010)
9 Examples J G (β) = i j (β i β j ) 2 (Rapaport et al., 2007) J G (β) = a β 1 + (1 a) i j J G (β) = (β i β j ) 2 (Li and Li, 2008) sup α R p : i j α 2 i +α2 j 1 α β (Jacob et al., 2009) J G (β) = a β 1 + (1 a) i j β i β j (Hoefling, 2010)
10 From smoothness penalty to Laplacian (β i β j ) 2 = β Lβ i j where L = D A is the graph Laplacian L =
11 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 0 Lambda =
12 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 0.12 Lambda =
13 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 0.47 Lambda =
14 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 1 Lambda =
15 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 1.7 Lambda =
16 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 2.3 Lambda =
17 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 3 Lambda =
18 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 3.5 Lambda =
19 From Laplacian to Fourier analysis: L = UΛU Eigenvectors U of L form the Fourier basis: ˆβ = U β Eigenvalues Λ = (0 = λ 1... λ p ) represent the "frequencies" of the Fourier basis lambda = 3.9 Lambda =
20 Smoothness in the Fourier domain Therefore, the smoothness penalty penalizes Fourier coefficients corresponding to high frequencies: J(β) = i j (β i β j ) 2 = β Lβ = β UΛU β = ˆβ Λ ˆβ = p λ i ˆβ i 2 i=1 "the linear model mapped on the graph should have little energy at high frequency" Rapaport et al. (2007) extends this to more general penalties: J φ (β) = p φ(λ i ) ˆβ i 2 s.t. β = U ˆβ i=1 for φ : R + R + non-decreasing.
21 Fourier vs wavelets Fourier Wavelets Localized in frequency Localized in frequency AND space
22 Wavelets on graphs A family of vectors {Ψ v,s } R p where v [1, p] is a vertex (space) s R + is a scale (frequency) In practice we choose a small number of scales s 1 <... < s S This results in p S vectors (overcomplete basis) continuous-wavelet-transform-(cwt).html
23 Example on graphs 146 D.K. Hammond et al. / Appl. Comput. Harmon. Anal. 30 (2011) Fig. 4. Spectral graph wavelets on Minnesota road graph, with K = 100, J = 4 scales. (a) Vertex at which wavelets are centered, (b) scaling function, (c) (f) wavelets, scales 1 4.
24 Example on graphs D.K. Hammond et al. / Appl. Comput. Harmon. Anal. 30 (2011) Fig. 5. Spectral graph wavelets on cerebral cortex, with K = 50, J = 4scales.(a)ROIatwhichwaveletsarecentered,(b)scalingfunction,(c) (f)wavelets, scales 1 4. (Hammond et al., 2011)
25 How to make the wavelet basis {Ψ v,t }? Hammond et al. (2011) propose spectral graph wavelets Formally, at scale s > 0, Ψ s = (Ψ 1,s... Ψ p,s ) = Ug(sΛ)U where g : R + R + is a function that satisfies g is a band-pass filter (localization in frequency) g is smooth D.K. Hammond nearet 0 al. (this / Appl. ensures Comput. Harmon. localization Anal. 30 (2011) in space)
26 Graph wavelet-based regularization Given a graph, compute a redundant set of S p wavelet basis at different scales: B = (Ψ s1... Ψ ss ) Take for penalty the atomic norm: S p S p J(β) = min c i : β = c i B i i=1 i=1 J(β) is small when β is a sum of a few atoms The atom Ψ s,v has weights in a neighborhood of v of "size" s
27 Summary: Fourier vs. wavelet penalty Fourier: β will be smooth on the graph min R(Y, Xβ) + λj(β) β Rp J φ (β) = φ(λ) ˆβ 2 2 s.t. β = U ˆβ Wavelet J(β) = min { c 1 : β = Bc} β will decompose as a sum of localized functions on the graph (pathways?)
28 Experiment Protein-protein interaction (PPI) network obtained from Human Protein Reference Database (HPRD). METABRIC breast cancer dataset - n = 1, 981 breast cancer samples paired with survival information of patients. - Expression data of a total of 24, 771 genes available, among which p = 9, 117 genes are found with known interaction in HPRD. Benchmark study comparing 6 penalty functions: Label Penalty function J(β) Network-based Gene selection ridge β 2 2 lasso β 1 e-net a β 1 + (1 a) β 2 2 i j (β i β j ) 2 lap laplasso a β 1 + (1 a) i j (β i β j ) 2 wavelet min θ θ 1 s.t. β = Ψθ
29 Prediction performance concordance index label ridge lasso e net lap laplasso wavelet label Boxplots on survival risk prediction performance evaluated by concordance index scores over 5-fold cross-validation repeated 10 times of the METABRIC data.
30 Prediction performance Label Mean CI scores (± SD) Network-based Gene selection ridge (±0.018) lap (±0.0193) laplasso (±0.0185) e-net (±0.0183) wavelet (±0.0198) lasso (±0.0177) Mean concordance index (CI) scores (± standard deviation) of survival risk prediction over 5-fold cross-validation repeated 10 times of the METABRIC data. Methods are ordered by decreasing mean CI scores.
31 Gene selection performance: Stability number of commonly selected genes label lasso e net laplasso wavelet number of selected genes Stability performance of gene selection related to breast cancer survival, estimated over 100 random experiments. The black dotted curve denotes random selection.
32 Gene selection performance: Connectivity number of connecting edges label lasso e net laplasso wavelet number of selected genes Connectivity performance of gene selection related to breast cancer survival, where special marks correspond to the number tuned by cross-validation. The black dotted curve denotes random selection.
33 Gene selection performance: Interpretability Figure: Gene subnetworks related to breast cancer survival identified by regularization methods identified by the elastic net (10 genes connected out of 112 selected) or the Laplacian lasso (10 genes connected out of 100 selected).
34 Gene selection performance: Interpretability Figure: Gene subnetworks related to breast cancer survival identified by regularization methods identified by network-based wavelet smoothing (82 genes connected out of 109 selected).
35 Conclusion Can biological networks help define a structure on high-dimensional omics data? Fourier-based penalties (smoothness, diffusion...) already exist Wavelets-based penalties decompose a signal over a basis localized in space localized in frequency Preliminary results on gene expression classification
36 internet&de&l ANRT!(rubrique!"zoom!sur")! Promotion!aux!Rencontres!Universités!Entreprises!(RUE)! Echange!avec!Stéphane!Mallat!et!réflexion!sur!la!pertinence!du! programme!tronc!commun! Rencontre!avec!le!responsable!des!relations!internationales!de!la! National!Taiwan!University!à!l ESPCI:!présentation!d ITI! Contact!en!cours!pour!visite!d entreprise! Article&les&Echos& Création&du&logo&PSL/ITI&et&du&Diplôme&Supérieur&de&Recherche&et& d Innovation&de&PSL/ITI&(DSRI),&dépôt&INPI&en&cours.& Thanks!! 2/!MISE!EN!ŒUVRE!OPERATIONNELLE!DU!PROGRAMME! Point&d étape&iti&/&20&fevrier& &1er&JUILLET&2014&! C.SURIAM!!F.LEQUEUX!!!!
37 References D. K. Hammond, P. Vandergheynst, and R. Gribonval. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2): , ISSN doi: /j.acha H. Hoefling. A path algorithm for the Fused Lasso Signal Approximator. J. Comput. Graph. Stat., 19(4): , doi: /jcgs URL L. Jacob, G. Obozinski, and J.-P. Vert. Group lasso with overlap and graph lasso. In ICML 09: Proceedings of the 26th Annual International Conference on Machine Learning, pages , New York, NY, USA, ACM. ISBN doi: / URL C. Li and H. Li. Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics, 24: , May ISSN doi: /bioinformatics/btn081. F. Rapaport, A. Zinovyev, M. Dutreix, E. Barillot, and J.-P. Vert. Classification of microarray data using gene networks. BMC Bioinformatics, 8:35, doi: / URL R. Tibshirani. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B, 58(1): , URL
6. Regularized linear regression
Foundations of Machine Learning École Centrale Paris Fall 2015 6. Regularized linear regression Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe agathe.azencott@mines paristech.fr
More informationStructured Sparse Estimation with Network Flow Optimization
Structured Sparse Estimation with Network Flow Optimization Julien Mairal University of California, Berkeley Neyman seminar, Berkeley Julien Mairal Neyman seminar, UC Berkeley /48 Purpose of the talk introduce
More informationRegularization and Variable Selection via the Elastic Net
p. 1/1 Regularization and Variable Selection via the Elastic Net Hui Zou and Trevor Hastie Journal of Royal Statistical Society, B, 2005 Presenter: Minhua Chen, Nov. 07, 2008 p. 2/1 Agenda Introduction
More informationStatistical learning on graphs
Statistical learning on graphs Jean-Philippe Vert Jean-Philippe.Vert@ensmp.fr ParisTech, Ecole des Mines de Paris Institut Curie INSERM U900 Seminar of probabilities, Institut Joseph Fourier, Grenoble,
More informationPackage Grace. R topics documented: April 9, Type Package
Type Package Package Grace April 9, 2017 Title Graph-Constrained Estimation and Hypothesis Tests Version 0.5.3 Date 2017-4-8 Author Sen Zhao Maintainer Sen Zhao Description Use
More informationLecture 14: Shrinkage
Lecture 14: Shrinkage Reading: Section 6.2 STATS 202: Data mining and analysis October 27, 2017 1 / 19 Shrinkage methods The idea is to perform a linear regression, while regularizing or shrinking the
More informationThe lasso: some novel algorithms and applications
1 The lasso: some novel algorithms and applications Newton Institute, June 25, 2008 Robert Tibshirani Stanford University Collaborations with Trevor Hastie, Jerome Friedman, Holger Hoefling, Gen Nowak,
More informationGroup lasso for genomic data
Group lasso for genomic data Jean-Philippe Vert Mines ParisTech / Curie Institute / Inserm Machine learning: Theory and Computation workshop, IMA, Minneapolis, March 26-3, 22 J.P Vert (ParisTech) Group
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2013 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Meghana Kshirsagar (mkshirsa), Yiwen Chen (yiwenche) 1 Graph
More information2.3. Clustering or vector quantization 57
Multivariate Statistics non-negative matrix factorisation and sparse dictionary learning The PCA decomposition is by construction optimal solution to argmin A R n q,h R q p X AH 2 2 under constraint :
More informationIntroduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin
1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)
More informationConvex relaxation for Combinatorial Penalties
Convex relaxation for Combinatorial Penalties Guillaume Obozinski Equipe Imagine Laboratoire d Informatique Gaspard Monge Ecole des Ponts - ParisTech Joint work with Francis Bach Fête Parisienne in Computation,
More informationDATA MINING AND MACHINE LEARNING
DATA MINING AND MACHINE LEARNING Lecture 5: Regularization and loss functions Lecturer: Simone Scardapane Academic Year 2016/2017 Table of contents Loss functions Loss functions for regression problems
More informationStatistical aspects of prediction models with high-dimensional data
Statistical aspects of prediction models with high-dimensional data Anne Laure Boulesteix Institut für Medizinische Informationsverarbeitung, Biometrie und Epidemiologie February 15th, 2017 Typeset by
More informationBuilding a Prognostic Biomarker
Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,
More informationFeature Grouping and Selection Over an Undirected Graph
Feature Grouping and Selection Over an Undirected Graph Sen Yang, Lei Yuan, Ying-Cheng Lai, Xiaotong Shen, Peter Wonka, and Jieping Ye 1 Introduction High-dimensional regression/classification is challenging
More informationarxiv: v1 [stat.ml] 18 Jan 2010
Improving stability and interpretability of gene expression signatures arxiv:1001.3109v1 [stat.ml] 18 Jan 2010 Anne-Claire Haury Mines ParisTech, CBIO Institut Curie, Paris, F-75248 INSERM, U900, Paris,
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2016 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Raied Aljadaany, Shi Zong, Chenchen Zhu Disclaimer: A large
More informationLinear regression methods
Linear regression methods Most of our intuition about statistical methods stem from linear regression. For observations i = 1,..., n, the model is Y i = p X ij β j + ε i, j=1 where Y i is the response
More informationGenetic Networks. Korbinian Strimmer. Seminar: Statistical Analysis of RNA-Seq Data 19 June IMISE, Universität Leipzig
Genetic Networks Korbinian Strimmer IMISE, Universität Leipzig Seminar: Statistical Analysis of RNA-Seq Data 19 June 2012 Korbinian Strimmer, RNA-Seq Networks, 19/6/2012 1 Paper G. I. Allen and Z. Liu.
More informationMS-C1620 Statistical inference
MS-C1620 Statistical inference 10 Linear regression III Joni Virta Department of Mathematics and Systems Analysis School of Science Aalto University Academic year 2018 2019 Period III - IV 1 / 32 Contents
More informationPre-Selection in Cluster Lasso Methods for Correlated Variable Selection in High-Dimensional Linear Models
Pre-Selection in Cluster Lasso Methods for Correlated Variable Selection in High-Dimensional Linear Models Niharika Gauraha and Swapan Parui Indian Statistical Institute Abstract. We consider variable
More informationSaharon Rosset 1 and Ji Zhu 2
Aust. N. Z. J. Stat. 46(3), 2004, 505 510 CORRECTED PROOF OF THE RESULT OF A PREDICTION ERROR PROPERTY OF THE LASSO ESTIMATOR AND ITS GENERALIZATION BY HUANG (2003) Saharon Rosset 1 and Ji Zhu 2 IBM T.J.
More informationProceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence. Uncorrelated Lasso
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence Uncorrelated Lasso Si-Bao Chen School of Computer Science and Technology, Anhui University, Hefei, 23060, China Chris Ding Department
More informationUnivariate shrinkage in the Cox model for high dimensional data
Univariate shrinkage in the Cox model for high dimensional data Robert Tibshirani January 6, 2009 Abstract We propose a method for prediction in Cox s proportional model, when the number of features (regressors)
More informationApplied Machine Learning Annalisa Marsico
Applied Machine Learning Annalisa Marsico OWL RNA Bionformatics group Max Planck Institute for Molecular Genetics Free University of Berlin 22 April, SoSe 2015 Goals Feature Selection rather than Feature
More informationESL Chap3. Some extensions of lasso
ESL Chap3 Some extensions of lasso 1 Outline Consistency of lasso for model selection Adaptive lasso Elastic net Group lasso 2 Consistency of lasso for model selection A number of authors have studied
More informationLinear Regression. Aarti Singh. Machine Learning / Sept 27, 2010
Linear Regression Aarti Singh Machine Learning 10-701/15-781 Sept 27, 2010 Discrete to Continuous Labels Classification Sports Science News Anemic cell Healthy cell Regression X = Document Y = Topic X
More informationA Short Introduction to the Lasso Methodology
A Short Introduction to the Lasso Methodology Michael Gutmann sites.google.com/site/michaelgutmann University of Helsinki Aalto University Helsinki Institute for Information Technology March 9, 2016 Michael
More informationThe prediction of house price
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
More informationInstitute of Statistics Mimeo Series No Simultaneous regression shrinkage, variable selection and clustering of predictors with OSCAR
DEPARTMENT OF STATISTICS North Carolina State University 2501 Founders Drive, Campus Box 8203 Raleigh, NC 27695-8203 Institute of Statistics Mimeo Series No. 2583 Simultaneous regression shrinkage, variable
More informationSemi-Penalized Inference with Direct FDR Control
Jian Huang University of Iowa April 4, 2016 The problem Consider the linear regression model y = p x jβ j + ε, (1) j=1 where y IR n, x j IR n, ε IR n, and β j is the jth regression coefficient, Here p
More informationMachine Learning for OR & FE
Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com
More informationRegression Model In The Analysis Of Micro Array Data-Gene Expression Detection
Jamal Fathima.J.I 1 and P.Venkatesan 1. Research Scholar -Department of statistics National Institute For Research In Tuberculosis, Indian Council For Medical Research,Chennai,India,.Department of statistics
More informationA Study of Network-based Kernel Methods on Protein-Protein Interaction for Protein Functions Prediction
The Third International Symposium on Optimization and Systems Biology (OSB 09) Zhangjiajie, China, September 20 22, 2009 Copyright 2009 ORSC & APORC, pp. 25 32 A Study of Network-based Kernel Methods on
More informationECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference
ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Sparse Recovery using L1 minimization - algorithms Yuejie Chi Department of Electrical and Computer Engineering Spring
More informationGeneralized Elastic Net Regression
Abstract Generalized Elastic Net Regression Geoffroy MOURET Jean-Jules BRAULT Vahid PARTOVINIA This work presents a variation of the elastic net penalization method. We propose applying a combined l 1
More informationLearning gradients: prescriptive models
Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan
More informationHierarchical Penalization
Hierarchical Penalization Marie Szafransi 1, Yves Grandvalet 1, 2 and Pierre Morizet-Mahoudeaux 1 Heudiasyc 1, UMR CNRS 6599 Université de Technologie de Compiègne BP 20529, 60205 Compiègne Cedex, France
More informationNikhil Rao, Miroslav Dudík, Zaid Harchaoui
THE GROUP k-support NORM FOR LEARNING WITH STRUCTURED SPARSITY Nikhil Rao, Miroslav Dudík, Zaid Harchaoui Technicolor R&I, Microsoft Research, University of Washington ABSTRACT Several high-dimensional
More informationSTAT 535 Lecture 5 November, 2018 Brief overview of Model Selection and Regularization c Marina Meilă
STAT 535 Lecture 5 November, 2018 Brief overview of Model Selection and Regularization c Marina Meilă mmp@stat.washington.edu Reading: Murphy: BIC, AIC 8.4.2 (pp 255), SRM 6.5 (pp 204) Hastie, Tibshirani
More informationFrom Lasso regression to Feature vector machine
From Lasso regression to Feature vector machine Fan Li, Yiming Yang and Eric P. Xing,2 LTI and 2 CALD, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA USA 523 {hustlf,yiming,epxing}@cs.cmu.edu
More informationMachine Learning for Economists: Part 4 Shrinkage and Sparsity
Machine Learning for Economists: Part 4 Shrinkage and Sparsity Michal Andrle International Monetary Fund Washington, D.C., October, 2018 Disclaimer #1: The views expressed herein are those of the authors
More informationCorrelate. A method for the integrative analysis of two genomic data sets
Correlate A method for the integrative analysis of two genomic data sets Sam Gross, Balasubramanian Narasimhan, Robert Tibshirani, and Daniela Witten February 19, 2010 Introduction Sparse Canonical Correlation
More informationGene expression microarray technology measures the expression levels of thousands of genes. Research Article
JOURNAL OF COMPUTATIONAL BIOLOGY Volume 7, Number 2, 2 # Mary Ann Liebert, Inc. Pp. 8 DOI:.89/cmb.29.52 Research Article Reducing the Computational Complexity of Information Theoretic Approaches for Reconstructing
More information27: Case study with popular GM III. 1 Introduction: Gene association mapping for complex diseases 1
10-708: Probabilistic Graphical Models, Spring 2015 27: Case study with popular GM III Lecturer: Eric P. Xing Scribes: Hyun Ah Song & Elizabeth Silver 1 Introduction: Gene association mapping for complex
More informationA Modern Look at Classical Multivariate Techniques
A Modern Look at Classical Multivariate Techniques Yoonkyung Lee Department of Statistics The Ohio State University March 16-20, 2015 The 13th School of Probability and Statistics CIMAT, Guanajuato, Mexico
More informationEffective Linear Discriminant Analysis for High Dimensional, Low Sample Size Data
Effective Linear Discriant Analysis for High Dimensional, Low Sample Size Data Zhihua Qiao, Lan Zhou and Jianhua Z. Huang Abstract In the so-called high dimensional, low sample size (HDLSS) settings, LDA
More informationA Bias Correction for the Minimum Error Rate in Cross-validation
A Bias Correction for the Minimum Error Rate in Cross-validation Ryan J. Tibshirani Robert Tibshirani Abstract Tuning parameters in supervised learning problems are often estimated by cross-validation.
More informationFast Regularization Paths via Coordinate Descent
August 2008 Trevor Hastie, Stanford Statistics 1 Fast Regularization Paths via Coordinate Descent Trevor Hastie Stanford University joint work with Jerry Friedman and Rob Tibshirani. August 2008 Trevor
More informationHigh-Dimensional Statistical Learning: Introduction
Classical Statistics Biological Big Data Supervised and Unsupervised Learning High-Dimensional Statistical Learning: Introduction Ali Shojaie University of Washington http://faculty.washington.edu/ashojaie/
More informationSpectral Graph Wavelets on the Cortical Connectome and Regularization of the EEG Inverse Problem
Spectral Graph Wavelets on the Cortical Connectome and Regularization of the EEG Inverse Problem David K Hammond University of Oregon / NeuroInformatics Center International Conference on Industrial and
More informationMachine Learning. Regression basics. Marc Toussaint University of Stuttgart Summer 2015
Machine Learning Regression basics Linear regression, non-linear features (polynomial, RBFs, piece-wise), regularization, cross validation, Ridge/Lasso, kernel trick Marc Toussaint University of Stuttgart
More informationCompressed Sensing in Cancer Biology? (A Work in Progress)
Compressed Sensing in Cancer Biology? (A Work in Progress) M. Vidyasagar FRS Cecil & Ida Green Chair The University of Texas at Dallas M.Vidyasagar@utdallas.edu www.utdallas.edu/ m.vidyasagar University
More informationTGDR: An Introduction
TGDR: An Introduction Julian Wolfson Student Seminar March 28, 2007 1 Variable Selection 2 Penalization, Solution Paths and TGDR 3 Applying TGDR 4 Extensions 5 Final Thoughts Some motivating examples We
More informationA New Combined Approach for Inference in High-Dimensional Regression Models with Correlated Variables
A New Combined Approach for Inference in High-Dimensional Regression Models with Correlated Variables Niharika Gauraha and Swapan Parui Indian Statistical Institute Abstract. We consider the problem of
More informationHomework 1: Solutions
Homework 1: Solutions Statistics 413 Fall 2017 Data Analysis: Note: All data analysis results are provided by Michael Rodgers 1. Baseball Data: (a) What are the most important features for predicting players
More informationSparsity in Underdetermined Systems
Sparsity in Underdetermined Systems Department of Statistics Stanford University August 19, 2005 Classical Linear Regression Problem X n y p n 1 > Given predictors and response, y Xβ ε = + ε N( 0, σ 2
More informationhsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference
CS 229 Project Report (TR# MSB2010) Submitted 12/10/2010 hsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference Muhammad Shoaib Sehgal Computer Science
More informationA new approach for biomarker detection using fusion networks
A new approach for biomarker detection using fusion networks Master thesis Bioinformatics Mona Rams June 15, 2016 Department Mathematik und Informatik Freie Universität Berlin 2 Statutory Declaration I
More informationSpatial Lasso with Application to GIS Model Selection. F. Jay Breidt Colorado State University
Spatial Lasso with Application to GIS Model Selection F. Jay Breidt Colorado State University with Hsin-Cheng Huang, Nan-Jung Hsu, and Dave Theobald September 25 The work reported here was developed under
More informationCovariance-regularized regression and classification for high-dimensional problems
Covariance-regularized regression and classification for high-dimensional problems Daniela M. Witten Department of Statistics, Stanford University, 390 Serra Mall, Stanford CA 94305, USA. E-mail: dwitten@stanford.edu
More informationIterative Laplacian Score for Feature Selection
Iterative Laplacian Score for Feature Selection Linling Zhu, Linsong Miao, and Daoqiang Zhang College of Computer Science and echnology, Nanjing University of Aeronautics and Astronautics, Nanjing 2006,
More informationVariable Selection for High-Dimensional Data with Spatial-Temporal Effects and Extensions to Multitask Regression and Multicategory Classification
Variable Selection for High-Dimensional Data with Spatial-Temporal Effects and Extensions to Multitask Regression and Multicategory Classification Tong Tong Wu Department of Epidemiology and Biostatistics
More informationChris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010
Model-Averaged l 1 Regularization using Markov Chain Monte Carlo Model Composition Technical Report No. 541 Department of Statistics, University of Washington Chris Fraley and Daniel Percival August 22,
More informationDirect Learning: Linear Regression. Donglin Zeng, Department of Biostatistics, University of North Carolina
Direct Learning: Linear Regression Parametric learning We consider the core function in the prediction rule to be a parametric function. The most commonly used function is a linear function: squared loss:
More informationA New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables
A New Bayesian Variable Selection Method: The Bayesian Lasso with Pseudo Variables Qi Tang (Joint work with Kam-Wah Tsui and Sijian Wang) Department of Statistics University of Wisconsin-Madison Feb. 8,
More informationA short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie
A short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie Computational Biology Program Memorial Sloan-Kettering Cancer Center http://cbio.mskcc.org/leslielab
More informationSupervised classification for structured data: Applications in bio- and chemoinformatics
Supervised classification for structured data: Applications in bio- and chemoinformatics Jean-Philippe Vert Jean-Philippe.Vert@mines-paristech.fr Mines ParisTech / Curie Institute / Inserm The third school
More informationProteomics and Variable Selection
Proteomics and Variable Selection p. 1/55 Proteomics and Variable Selection Alex Lewin With thanks to Paul Kirk for some graphs Department of Epidemiology and Biostatistics, School of Public Health, Imperial
More informationConsistent high-dimensional Bayesian variable selection via penalized credible regions
Consistent high-dimensional Bayesian variable selection via penalized credible regions Howard Bondell bondell@stat.ncsu.edu Joint work with Brian Reich Howard Bondell p. 1 Outline High-Dimensional Variable
More informationLeast Angle Regression, Forward Stagewise and the Lasso
January 2005 Rob Tibshirani, Stanford 1 Least Angle Regression, Forward Stagewise and the Lasso Brad Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani Stanford University Annals of Statistics,
More informationA Least Squares Formulation for Canonical Correlation Analysis
A Least Squares Formulation for Canonical Correlation Analysis Liang Sun, Shuiwang Ji, and Jieping Ye Department of Computer Science and Engineering Arizona State University Motivation Canonical Correlation
More informationCompressed Sensing and Neural Networks
and Jan Vybíral (Charles University & Czech Technical University Prague, Czech Republic) NOMAD Summer Berlin, September 25-29, 2017 1 / 31 Outline Lasso & Introduction Notation Training the network Applications
More informationLecture 14: Variable Selection - Beyond LASSO
Fall, 2017 Extension of LASSO To achieve oracle properties, L q penalty with 0 < q < 1, SCAD penalty (Fan and Li 2001; Zhang et al. 2007). Adaptive LASSO (Zou 2006; Zhang and Lu 2007; Wang et al. 2007)
More informationMultimodal Deep Learning for Predicting Survival from Breast Cancer
Multimodal Deep Learning for Predicting Survival from Breast Cancer Heather Couture Deep Learning Journal Club Nov. 16, 2016 Outline Background on tumor histology & genetic data Background on survival
More informationDimension Reduction Methods
Dimension Reduction Methods And Bayesian Machine Learning Marek Petrik 2/28 Previously in Machine Learning How to choose the right features if we have (too) many options Methods: 1. Subset selection 2.
More informationPredicting Graph Labels using Perceptron. Shuang Song
Predicting Graph Labels using Perceptron Shuang Song shs037@eng.ucsd.edu Online learning over graphs M. Herbster, M. Pontil, and L. Wainer, Proc. 22nd Int. Conf. Machine Learning (ICML'05), 2005 Prediction
More informationWavelets on Graphs, an Introduction
Wavelets on Graphs, an Introduction Pierre Vandergheynst and David Shuman Ecole Polytechnique Fédérale de Lausanne (EPFL) Signal Processing Laboratory {pierre.vandergheynst,david.shuman}@epfl.ch Université
More informationA Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression
A Blockwise Descent Algorithm for Group-penalized Multiresponse and Multinomial Regression Noah Simon Jerome Friedman Trevor Hastie November 5, 013 Abstract In this paper we purpose a blockwise descent
More informationSparse regularization for functional logistic regression models
Sparse regularization for functional logistic regression models Hidetoshi Matsui The Center for Data Science Education and Research, Shiga University 1-1-1 Banba, Hikone, Shiga, 5-85, Japan. hmatsui@biwako.shiga-u.ac.jp
More informationStatistical Analysis of Data Generated by a Mixture of Two Parametric Distributions
UDC 519.2 Statistical Analysis of Data Generated by a Mixture of Two Parametric Distributions Yu. K. Belyaev, D. Källberg, P. Rydén Department of Mathematics and Mathematical Statistics, Umeå University,
More informationClassification. The goal: map from input X to a label Y. Y has a discrete set of possible values. We focused on binary Y (values 0 or 1).
Regression and PCA Classification The goal: map from input X to a label Y. Y has a discrete set of possible values We focused on binary Y (values 0 or 1). But we also discussed larger number of classes
More informationSVRG++ with Non-uniform Sampling
SVRG++ with Non-uniform Sampling Tamás Kern András György Department of Electrical and Electronic Engineering Imperial College London, London, UK, SW7 2BT {tamas.kern15,a.gyorgy}@imperial.ac.uk Abstract
More informationVariable Selection in Restricted Linear Regression Models. Y. Tuaç 1 and O. Arslan 1
Variable Selection in Restricted Linear Regression Models Y. Tuaç 1 and O. Arslan 1 Ankara University, Faculty of Science, Department of Statistics, 06100 Ankara/Turkey ytuac@ankara.edu.tr, oarslan@ankara.edu.tr
More informationMachine Learning for Biomedical Engineering. Enrico Grisan
Machine Learning for Biomedical Engineering Enrico Grisan enrico.grisan@dei.unipd.it Curse of dimensionality Why are more features bad? Redundant features (useless or confounding) Hard to interpret and
More informationFused elastic net EPoC
Fused elastic net EPoC Tobias Abenius November 28, 2013 Data Set and Method Use mrna and copy number aberration (CNA) to build a network using an extension of our EPoC method (Jörnsten et al., 2011; Abenius
More informationAn efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss
An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss arxiv:1811.04545v1 [stat.co] 12 Nov 2018 Cheng Wang School of Mathematical Sciences, Shanghai Jiao
More informationGrouping Pursuit in regression. Xiaotong Shen
Grouping Pursuit in regression Xiaotong Shen School of Statistics University of Minnesota Email xshen@stat.umn.edu Joint with Hsin-Cheng Huang (Sinica, Taiwan) Workshop in honor of John Hartigan Innovation
More informationRegularization Path Algorithms for Detecting Gene Interactions
Regularization Path Algorithms for Detecting Gene Interactions Mee Young Park Trevor Hastie July 16, 2006 Abstract In this study, we consider several regularization path algorithms with grouped variable
More informationRegression.
Regression www.biostat.wisc.edu/~dpage/cs760/ Goals for the lecture you should understand the following concepts linear regression RMSE, MAE, and R-square logistic regression convex functions and sets
More informationParameter Norm Penalties. Sargur N. Srihari
Parameter Norm Penalties Sargur N. srihari@cedar.buffalo.edu 1 Regularization Strategies 1. Parameter Norm Penalties 2. Norm Penalties as Constrained Optimization 3. Regularization and Underconstrained
More informationProperties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation
Properties of optimizations used in penalized Gaussian likelihood inverse covariance matrix estimation Adam J. Rothman School of Statistics University of Minnesota October 8, 2014, joint work with Liliana
More informationAutomatic Response Category Combination. in Multinomial Logistic Regression
Automatic Response Category Combination in Multinomial Logistic Regression Bradley S. Price, Charles J. Geyer, and Adam J. Rothman arxiv:1705.03594v1 [stat.me] 10 May 2017 Abstract We propose a penalized
More informationAccepted author version posted online: 28 Nov 2012.
This article was downloaded by: [University of Minnesota Libraries, Twin Cities] On: 15 May 2013, At: 12:23 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954
More informationHigh-dimensional regression with unknown variance
High-dimensional regression with unknown variance Christophe Giraud Ecole Polytechnique march 2012 Setting Gaussian regression with unknown variance: Y i = f i + ε i with ε i i.i.d. N (0, σ 2 ) f = (f
More informationSimultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR
Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR Howard D. Bondell and Brian J. Reich Department of Statistics, North Carolina State University,
More informationPredicting Protein Functions and Domain Interactions from Protein Interactions
Predicting Protein Functions and Domain Interactions from Protein Interactions Fengzhu Sun, PhD Center for Computational and Experimental Genomics University of Southern California Outline High-throughput
More informationBi-level feature selection with applications to genetic association
Bi-level feature selection with applications to genetic association studies October 15, 2008 Motivation In many applications, biological features possess a grouping structure Categorical variables may
More informationIntroduction to the genlasso package
Introduction to the genlasso package Taylor B. Arnold, Ryan Tibshirani Abstract We present a short tutorial and introduction to using the R package genlasso, which is used for computing the solution path
More information