Regularized non-negative matrix factorization for latent component discovery in heterogeneous methylomes
N. Vedeneev¹, P. Lutsik², M. Slawski³, J. Walter¹, M. Hein¹
¹ Saarland University, Germany; ² DKFZ, Germany; ³ George Mason University, USA

Problem Setting: DNA methylation is one of the most important and best-studied marks in modern epigenetics. It plays a central role in numerous biological processes, such as stem cell differentiation and embryonic development, and in diseases like cancer [1]. Each cell (sub)type has a characteristic methylation profile. For some tissues, e.g. blood, individual cell types can be isolated and their methylation profiles measured directly. For other tissues, e.g. brain, such cell separation is difficult, and in many cases only a methylation measurement of the full tissue is available, which is a mixture of cell types. Reliable and accurate identification of the cell-type-specific methylation patterns and their mixture proportions is of great benefit for any subsequent biological analysis. This can be seen as a blind deconvolution problem, which we tackle using a problem-adapted form of non-negative matrix factorization.

In mammalian genomes, including human, the DNA methylation mark occurs predominantly in the context of CpG dinucleotide sequence motifs (CpGs). While state-of-the-art sequencing-based methods can recover the complete DNA methylation profile, cost- and labour-optimized approaches, such as the Infinium 450k and EPIC microarrays, cover a small yet representative subset of CpGs in the human genome (m = [...] and m = [...], respectively). For each CpG $i = 1, \dots, m$ and each subject $j = 1, \dots, n$, Infinium microarrays output two intensity measurements $M_{ij}$ and $U_{ij}$, proportional to the number of DNA molecules in which this CpG is methylated and unmethylated, respectively. We define $N_{ij} = M_{ij} + U_{ij}$ to be the total microarray intensity at each CpG. The data matrix $D \in \mathbb{R}^{m \times n}$ is the ratio $D_{ij} = M_{ij} / N_{ij}$.
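As an illustration of the definition of the data matrix, the following is a minimal sketch of how D and N could be computed from the two intensity matrices. It assumes NumPy arrays of equal shape; the function name `methylation_ratio` and the choice to mark positions with zero total intensity as NaN are assumptions made for this example, not part of the original method.

```python
import numpy as np

def methylation_ratio(M, U):
    """Return D = M / (M + U) and the total intensity N = M + U."""
    M = np.asarray(M, dtype=float)
    U = np.asarray(U, dtype=float)
    N = M + U
    # Assumption: positions with zero total intensity carry no
    # information, so they are marked as NaN instead of dividing by 0.
    D = np.full(N.shape, np.nan)
    np.divide(M, N, out=D, where=N > 0)
    return D, N

# Toy intensities for m = 3 CpGs (rows) and n = 2 samples (columns).
M = np.array([[900.0, 100.0], [50.0, 0.0], [400.0, 600.0]])
U = np.array([[100.0, 900.0], [950.0, 0.0], [400.0, 200.0]])
D, N = methylation_ratio(M, U)
```

Every defined entry of D lies in [0, 1] by construction, which is what the later box constraint on T relies on.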
We denote by $r$ the total number of latent components (typically associated with cell types). In practice, as a rule, $m \gg n > r$.

(29th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.)

Regularized NMF: Let $T \in \mathbb{R}_+^{m \times r}$ be a matrix representing $r$ latent components, or pure methylation profiles, and $A \in \mathbb{R}_+^{r \times n}$ a matrix of their proportions, or mixtures. Our experiments [6] suggest that a linear mixture model is plausible, i.e. $D \approx TA$, and that the columns of $T$ are affinely independent. Since methylation is binary (on or off) at the single-cell level, in previous work [7] we modeled $T$ as a binary matrix in $\{0, 1\}^{m \times r}$. While we have shown [7] that such a constraint implies uniqueness of the factorization beyond conditions such as separability [3], the constraint turns out to be too rigid: even cells of the same cell type can differ in methylation at a small fraction of the sites, so each profile has to be seen as a statistical mixture. In Figure 1 we plot the histogram of measured methylation values of an isolated cell type. While most of the measurements are concentrated close to zero and one, a certain fraction shows intermediate DNA methylation. This motivated us in the current work [6], where the method is called MeDeCom, to relax the constraint to $T \in [0, 1]^{m \times r}$ and solve the following regularized NMF problem:

$$
\begin{aligned}
\underset{T \in \mathbb{R}^{m \times r},\ A \in \mathbb{R}^{r \times n}}{\text{minimize}} \quad & \frac{1}{2} \| D - TA \|_F^2 + \lambda \sum_{i=1}^{m} \sum_{j=1}^{r} T_{ij} (1 - T_{ij}) \\
\text{subject to} \quad & T \in [0, 1]^{m \times r}, \quad A \in \mathbb{R}_+^{r \times n}, \quad \sum_{i=1}^{r} A_{ij} = 1, \quad j = 1, \dots, n.
\end{aligned}
\tag{1}
$$

The constraint on $A$ is added so that $A_{ij}$ can be interpreted as the proportion of latent component $i$ in sample $j$. As we illustrate below, the factorization without the regularizer is highly non-unique, since in this application we are far from the separability condition. However, we know from the properties of single cells that most methylation sites of a given cell type should be close to 0 or 1. The non-convex regularization term therefore pushes the entries of $T$ towards these two values. As Figure 2 shows, the regularizer is the key to a more accurate estimation of both the matrix $T$ and the proportions $A$, as it biases the estimated latent components $T$ towards 0 or 1. Note that the factorization of the data (blue dots) is highly non-unique, as there exist infinitely many solutions with zero fit error. However, the strong prior resolves this non-uniqueness.

This example, although artificial, demonstrates a biologically relevant, extremely hard case in which the proportions of the cell types vary only slightly across samples and the methylation profiles of the cell types in $T$ are highly correlated. Geometrically speaking, the observations are compactly concentrated inside the simplex spanned by the columns of $T$ and far away from some or all of its vertices and facets (which is exactly the opposite of the separability condition [3]). In instances like the one in Figure 2 we outperform [5], which is essentially based on standard NMF, by a large margin.

[Figure 1: Histogram of T values.]
[Figure 2: Hard toy case: observations are deep inside the simplex and away from its boundaries.]
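To make the structure of problem (1) concrete, here is a minimal sketch that evaluates the objective, read here as half the squared Frobenius fit error plus λ times the sum of T_ij(1 − T_ij), and checks the constraints. The function names `regularized_nmf_objective` and `is_feasible` are hypothetical and are not taken from the MeDeCom software; the sketch only evaluates the objective and does not implement the alternating optimization.

```python
import numpy as np

def regularized_nmf_objective(D, T, A, lam):
    """0.5 * ||D - T A||_F^2 + lam * sum_ij T_ij * (1 - T_ij)."""
    residual = D - T @ A
    fit = 0.5 * np.sum(residual ** 2)
    # The penalty vanishes exactly when every entry of T is 0 or 1.
    penalty = lam * np.sum(T * (1.0 - T))
    return fit + penalty

def is_feasible(T, A, tol=1e-9):
    """Check T in [0, 1], A >= 0 and unit column sums of A."""
    in_box = np.all((T >= -tol) & (T <= 1.0 + tol))
    nonneg = np.all(A >= -tol)
    unit_columns = np.allclose(A.sum(axis=0), 1.0, atol=1e-8)
    return bool(in_box and nonneg and unit_columns)

# Toy instance: m = 3 sites, r = 2 components, n = 2 samples.
T_bin = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
A = np.array([[0.3, 0.6], [0.7, 0.4]])
D = T_bin @ A

# Shrinking T away from {0, 1} worsens both the fit and the penalty.
T_soft = np.clip(T_bin, 0.1, 0.9)
obj_bin = regularized_nmf_objective(D, T_bin, A, lam=0.1)
obj_soft = regularized_nmf_objective(D, T_soft, A, lam=0.1)
```

On this noise-free toy instance the binary profiles attain an objective value of zero, while the softened profiles are penalized both by the fit term and by the regularizer.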
(Figure 2 uses m = 2, r = 3.)

Contribution: In the workshop submission we build upon [6] and address two issues: we study families of regularizers that push the entries of $T$ towards zero or one, and we examine how the highly varying number of cells measured at each site should influence the loss used in the NMF model. We solve the optimization problem (and the variants discussed below) via alternating optimization of $T$ and $A$, where we use DCA [8] for the non-convex problem in $T$.

Regularizer: In a number of experiments we solved problem (1) and compared different regularizers:

1. Regularizers enforcing a bias towards $\{0, 1\}$, represented by a function family $S_{\alpha,\beta}$ modeled as concave two-piece cubic splines with a junction at $1/2$:

$$
S_{\alpha,\beta} = \{\, s : [0, 1] \to \mathbb{R} \mid s(0) = s(1) = 0,\ s(1/2) = 1/4,\ s'(1/2) = 0,\ s'(0) = \alpha,\ s'(1) = \beta \,\},
$$

where $\alpha, \beta$ are user-defined parameters controlling the slopes at 0 and 1. The value $1/4$ at the junction is chosen to make regularizers from $S_{\alpha,\beta}$ comparable with the one in (1). Note that once $\alpha$ and $\beta$ are specified, the regularizer $s \in S_{\alpha,\beta}$ is uniquely defined (each cubic piece has 4 parameters, which are fixed by the 2+2 equalities on the values and derivatives at its end-points).

2. Regularizers derived from a prior based on the distribution of the methylation values in isolated cell types. Here we use the known correspondence between MAP estimation under a prior and a corresponding regularizer, and fit a parametric model to the measured distribution. In our experiments the prior distribution was first fit using kernel methods; the values of the negative log of the density estimate were then fit with polynomials of different degrees. The resulting functions were used as regularizers.

We observed that the sensitivity of the regularization parameter, selected by cross-validation, is influenced by the choice of the regularizer. In the experiments, however, we concentrate on the effect of the loss, since different regularizers yield similar results provided that the grid of λ-values is fine enough.

Modified Loss for NMF: The squared loss [SL] in (1) is associated with a Gaussian noise model. However, as the data matrix consists of ratios of discrete measurements, this noise model might not be appropriate. In particular, the squared loss does not take into account that the number of measured cells $N_{ij}$ for each site $i$ and sample $j$ varies by orders of magnitude ([...]; this is a consequence of the measurement technique of the Infinium HumanMethylation450 BeadChip), nor does it consider the discrete nature of the measurements. Recall that $M_{ij}$ is the number of methylated cells among $N_{ij}$ measurements, and each cell is measured with the same method. We therefore interpret $D_{ij}$ as the sample mean of $N_{ij}$ independent Bernoulli trials, with the sample variance estimate

$$
\hat{\sigma}_{ij}^2 = \frac{D_{ij} (1 - D_{ij})}{N_{ij}}.
$$

Since $N_{ij}$ is relatively large, $\hat{\sigma}_{ij}^2$ is a good estimate of the true variance $\sigma_{ij}^2$.
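The claim that the plug-in estimate D(1 − D)/N tracks the true variance of the sample mean for large N can be checked with a small simulation. The sketch below is illustrative only: the values of `p_true`, `N` and `n_rep` are assumptions chosen for the example, and the binomial sampling model is the Bernoulli interpretation stated above rather than real array data.

```python
import numpy as np

rng = np.random.default_rng(0)

p_true = 0.2    # assumed true methylation level at one site
N = 2000        # assumed number of measurements (large, as in the text)
n_rep = 20000   # number of simulated replicates

# M counts methylated measurements out of N; D is the sample mean.
M = rng.binomial(N, p_true, size=n_rep)
D = M / N

# Plug-in variance estimate computed separately for each replicate.
sigma2_hat = D * (1.0 - D) / N

true_var = p_true * (1.0 - p_true) / N   # variance of a binomial proportion
empirical_var = D.var()                  # spread of D across replicates

rel_err_plugin = abs(sigma2_hat.mean() - true_var) / true_var
rel_err_empirical = abs(empirical_var - true_var) / true_var
```

Both relative errors are small for N of this size; the plug-in estimate becomes less reliable when N is small or when D sits exactly at 0 or 1, where it collapses to zero.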
Furthermore, for large $N_{ij}$ the distribution of the sample mean $D_{ij}$ is well approximated by a Gaussian distribution, and thus we arrive at the noise model

$$
D_{ij} \sim \mathcal{N}\Big( \sum_{k=1}^{r} T_{ik} A_{kj},\ \sigma_{ij}^2 \Big), \quad i = 1, \dots, m, \quad j = 1, \dots, n.
$$

The new loss (the negative log-likelihood of the noise distribution, up to constants) is a weighted squared loss [WSL] given by $\| \Lambda \circ (D - TA) \|_F^2$, where $\Lambda_{ij} = \hat{\sigma}_{ij}^{-1}$ and $\circ$ is the Hadamard product. To avoid numerical problems caused by very small variance estimates, we truncate $\Lambda$ at the 0.95 quantile of the distribution of its values. We discuss the effect of this adapted loss compared to the standard squared loss.

Real data experiment: We use well-known annotated data from the study [4], where brain cell nuclei from a single brain sample were sorted by the neuron-specific marker NeuN using fluorescence-activated cell sorting (FACS), and the obtained NeuN$^+$ (neuronal) and NeuN$^-$ (non-neuronal) fractions were subsequently mixed in proportions 0.0 : 0.1 : 1.0. The mixtures were then profiled using the Infinium 450K array. We use this dataset to recover the source NeuN$^{\pm}$ methylomes and their mixing proportions. To make the problem harder and more realistic, we only use the five mixtures corresponding to proportions 0.3 : 0.1 : 0.7. While the ground-truth mixing proportions are known, the reference profiles are not available, and thus we use $T_{\text{ref}}$, the average of reference profiles from 30 different patients. Our estimated error in the $T$ matrix is therefore only an approximation.

The Infinium 450k microarray includes slightly more than [...] primer extension assays of two different design types, known as type I and type II probes. Methylation levels from the two probe types were shown to have different properties [2]. To exclude a bias in the data, the analysis is performed on the homogeneous subset of type I probes (comprising approximately one third of the 450k array).
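Returning to the weighted squared loss [WSL] defined above, the following is a minimal sketch of the weights and the loss. It assumes that the weights are the inverse standard-deviation estimates, which is what a Gaussian negative log-likelihood implies, and that truncation means capping the weights at their 0.95 quantile. The function names and the small `eps` guard against zero variance are assumptions of this example.

```python
import numpy as np

def wsl_weights(D, N, q=0.95, eps=1e-12):
    """Weights Lambda_ij = 1 / sigma_hat_ij, capped at their q-quantile."""
    sigma2_hat = D * (1.0 - D) / N
    # eps guard (assumption): avoids division by zero when D is 0 or 1.
    Lam = 1.0 / np.sqrt(np.maximum(sigma2_hat, eps))
    cap = np.quantile(Lam, q)
    return np.minimum(Lam, cap)

def weighted_squared_loss(D, T, A, Lam):
    """|| Lam o (D - T A) ||_F^2 with o the Hadamard product."""
    return np.sum((Lam * (D - T @ A)) ** 2)

# Synthetic example: m = 50 sites, r = 2 components, n = 5 samples.
rng = np.random.default_rng(1)
T = rng.uniform(size=(50, 2))
A = rng.dirichlet(np.ones(2), size=5).T      # columns sum to one
D_clean = T @ A
D_noisy = np.clip(D_clean + rng.normal(scale=0.01, size=D_clean.shape), 0.0, 1.0)
N = rng.integers(10, 5000, size=D_clean.shape)

Lam = wsl_weights(D_noisy, N)
loss_exact = weighted_squared_loss(D_clean, T, A, Lam)
loss_noisy = weighted_squared_loss(D_noisy, T, A, Lam)
```

Entries with a large total intensity N and a methylation ratio near 0 or 1 receive the largest weights, which is why the cap matters: without it a handful of near-zero variance estimates would dominate the loss.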
CpGs with probes that overlapped annotated SNP positions (dbSNP132 entries with MAF > 0.05, as defined in the RnBeads.hg19 annotation) anywhere along the complete probe sequence were also discarded to eliminate confounding genetic variability. This way the data was reduced to the matrix $D \in \mathbb{R}^{[\ldots]}$. Furthermore, in a refined setting several additional layers of quality filtering were applied: each methylation value was required to be supported by at least 5 Infinium beads, and CpGs with extreme intensities (below the 0.1 or above the 0.95 quantile of all intensity values) were also removed as potentially erroneous. This is how the filtered matrix $\tilde{D} \in \mathbb{R}^{[\ldots]}$ is obtained. Both matrices $D$ and $\tilde{D}$ are factorized in order to assess the robustness of the different loss functions with respect to noise.

We use a leave-one-out cross-validation scheme across samples to determine the optimal regularization parameter. The set of candidate λ-values is $\{0,\ \alpha \cdot 10^{k} \mid \alpha \in \{1, 2.5, 5, 7.5\},\ k \in \{-5, -4, -3, -2\}\}$. For the weighted squared loss, λ was further scaled by the median of the distribution of the values of $\Lambda$ in order to make the ranges of the employed regularization parameters comparable. Figure 3 shows how the estimate $\hat{A}$ of the true mixture matrix $A$ is affected by the choice of the loss. Table 1 below provides a quantitative evaluation.

[Figure 3: $A$ (ground truth) versus the estimates $\hat{A}$ returned by different variants of the regularized algorithm after matching with $T_{\text{ref}}$. x-axis: samples; y-axis: proportion of NeuN$^+$. Legend: ground truth; SL for raw Type I, lambda = 1e-03; WSL for raw Type I, lambda = [...]; SL for filtered Type I, lambda = 5e-04; WSL for filtered Type I, lambda = [...]. The NeuN$^+$ proportions of all $\hat{A}$ matrices are shown in the ascending order of those of $A$.]

As we can see, SL underestimates the NeuN$^+$ proportions for all samples on the noisy raw Type I data, whereas WSL is not seriously affected by the noise and provides a reasonable estimate of the proportions. The reason for the failure of SL in this case is that the cross-validation routine selects the wrong parameter, which might be due to the noisier error measure. Both SL and WSL produce very good estimates (2.5% and 1.3%, respectively) in the filtered case; again WSL slightly outperforms SL. This experiment thus shows that integrating information about the total intensity level and the variance at each CpG site leads both to more robustness against noise and to a more accurate estimation of the proportions.

We also compared the $\hat{T}$ estimates against $T_{\text{ref}}$. However, as we do not have ground-truth information on $T$ but only the average of the reference profiles $T_{\text{ref}}$, this comparison should be seen as a sanity check. The results are summarized in Table 1.
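The grid of candidate λ-values used in the cross-validation above can be enumerated explicitly. The sketch below assumes the grid is read as zero together with all products α·10^k for α in {1, 2.5, 5, 7.5} and k in {−5, −4, −3, −2}; this reading is consistent with the values 1e-03 and 5e-04 that appear in the legend of Figure 3. The additional rescaling by the median of Λ for the weighted loss is not shown, because the text does not specify its direction.

```python
import numpy as np

alphas = [1.0, 2.5, 5.0, 7.5]
exponents = [-5, -4, -3, -2]

# Zero (no regularization) plus every product alpha * 10^k.
values = {0.0} | {a * 10.0 ** k for a in alphas for k in exponents}
lambda_grid = np.array(sorted(values))
```

The grid has 17 entries and spans from 1e-05 to 7.5e-02 on a roughly logarithmic scale, with four points per decade.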
We see that on average the estimation of the matrix $T$ is good in all cases. As expected, both WSL and SL estimate $T$ better on the filtered data. The large maximal absolute error can be explained by the fact that $T_{\text{ref}}$ is not the true ground truth and is thus correct only on average, not at the level of single CpG sites.

[Table 1: Comparison of the estimates ($\hat{T}$, $\hat{A}$) against $A$ and $T_{\text{ref}}$ on the raw and filtered Type I datasets. Columns: $\frac{1}{rn} \| \hat{A} - A \|$, $\| \hat{A} - A \|$, $\frac{1}{rm} \| \hat{T} - T_{\text{ref}} \|$, $\| \hat{T} - T_{\text{ref}} \|$. Rows: SL, raw Type I; WSL, raw Type I; SL, filtered Type I; WSL, filtered Type I.]

Conclusions: The Infinium microarray is an efficient tool for measuring DNA methylation on the genome scale, yet the data it produces are contaminated by measurement noise and biases of various nature and origin. We addressed these issues by assessing the importance of biologically motivated regularizers and adequate noise modeling. In future work we will examine on more datasets whether the good results of WSL on the neural data [4] carry over.
References

[1] C. Bock. Analysing and interpreting DNA methylation data. Nature Reviews Genetics, 13:705–719, 2012.
[2] S. Dedeurwaerder, M. Defrance, M. Bizet, E. Calonne, G. Bontempi, and F. Fuks. A comprehensive overview of Infinium HumanMethylation450 data processing. Brief. Bioinformatics, 15(6):929–941, Nov 2014.
[3] D. Donoho and V. Stodden. When does non-negative matrix factorization give a correct decomposition into parts? In NIPS.
[4] J. Guintivano, M. J. Aryee, and Z. A. Kaminsky. A cell epigenotype specific model for the correction of brain cellular heterogeneity bias and its application to age, brain region and major depression. Epigenetics, 8(3), 2013.
[5] E. A. Houseman, M. L. Kile, D. C. Christiani, T. A. Ince, K. T. Kelsey, and C. J. Marsit. Reference-free deconvolution of DNA methylation data and mediation by cell composition effects. BMC Bioinformatics, 17(259), 2016.
[6] P. Lutsik, M. Slawski, G. Gasparoni, M. Hein, and J. Walter. MeDeCom discovers and quantifies latent components of heterogeneous methylomes. Submitted.
[7] M. Slawski, M. Hein, and P. Lutsik. Matrix factorization with binary components. In NIPS, 2013.
[8] P. D. Tao and L. T. H. An. Difference of Convex Functions Optimization Algorithms (DCA) for Globally Minimizing Nonconvex Quadratic Forms on Euclidean Balls and Spheres. Oper. Res. Lett., 19(5):207–216, November.
More informationLecture 1: Systems of linear equations and their solutions
Lecture 1: Systems of linear equations and their solutions Course overview Topics to be covered this semester: Systems of linear equations and Gaussian elimination: Solving linear equations and applications
More informationCS168: The Modern Algorithmic Toolbox Lecture #8: PCA and the Power Iteration Method
CS168: The Modern Algorithmic Toolbox Lecture #8: PCA and the Power Iteration Method Tim Roughgarden & Gregory Valiant April 15, 015 This lecture began with an extended recap of Lecture 7. Recall that
More informationINTRODUCTION TO BAYESIAN INFERENCE PART 2 CHRIS BISHOP
INTRODUCTION TO BAYESIAN INFERENCE PART 2 CHRIS BISHOP Personal Healthcare Revolution Electronic health records (CFH) Personal genomics (DeCode, Navigenics, 23andMe) X-prize: first $10k human genome technology
More informationMidterm. Introduction to Machine Learning. CS 189 Spring Please do not open the exam before you are instructed to do so.
CS 89 Spring 07 Introduction to Machine Learning Midterm Please do not open the exam before you are instructed to do so. The exam is closed book, closed notes except your one-page cheat sheet. Electronic
More informationApplied Machine Learning Annalisa Marsico
Applied Machine Learning Annalisa Marsico OWL RNA Bionformatics group Max Planck Institute for Molecular Genetics Free University of Berlin 29 April, SoSe 2015 Support Vector Machines (SVMs) 1. One of
More informationHidden Markov Models with Applications in Cell Adhesion Experiments. Ying Hung Department of Statistics and Biostatistics Rutgers University
Hidden Markov Models with Applications in Cell Adhesion Experiments Ying Hung Department of Statistics and Biostatistics Rutgers University 1 Outline Introduction to cell adhesion experiments Challenges
More informationComputational Genomics. Systems biology. Putting it together: Data integration using graphical models
02-710 Computational Genomics Systems biology Putting it together: Data integration using graphical models High throughput data So far in this class we discussed several different types of high throughput
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationProbabilistic Graphical Models
School of Computer Science Probabilistic Graphical Models Infinite Feature Models: The Indian Buffet Process Eric Xing Lecture 21, April 2, 214 Acknowledgement: slides first drafted by Sinead Williamson
More informationLarge-Scale Feature Learning with Spike-and-Slab Sparse Coding
Large-Scale Feature Learning with Spike-and-Slab Sparse Coding Ian J. Goodfellow, Aaron Courville, Yoshua Bengio ICML 2012 Presented by Xin Yuan January 17, 2013 1 Outline Contributions Spike-and-Slab
More informationSTA414/2104. Lecture 11: Gaussian Processes. Department of Statistics
STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations
More informationA short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie
A short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie Computational Biology Program Memorial Sloan-Kettering Cancer Center http://cbio.mskcc.org/leslielab
More informationSingle gene analysis of differential expression. Giorgio Valentini
Single gene analysis of differential expression Giorgio Valentini valenti@disi.unige.it Comparing two conditions Each condition may be represented by one or more RNA samples. Using cdna microarrays, samples
More informationLecture 2: Linear Algebra Review
EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1
More informationDeep Feedforward Networks
Deep Feedforward Networks Liu Yang March 30, 2017 Liu Yang Short title March 30, 2017 1 / 24 Overview 1 Background A general introduction Example 2 Gradient based learning Cost functions Output Units 3
More informationMicroarray Data Analysis: Discovery
Microarray Data Analysis: Discovery Lecture 5 Classification Classification vs. Clustering Classification: Goal: Placing objects (e.g. genes) into meaningful classes Supervised Clustering: Goal: Discover
More informationProtein Expression Molecular Pattern Discovery by Nonnegative Principal Component Analysis
Protein Expression Molecular Pattern Discovery by Nonnegative Principal Component Analysis Xiaoxu Han and Joseph Scazzero Department of Mathematics and Bioinformatics Program Department of Accounting and
More informationPractical Applications and Properties of the Exponentially. Modified Gaussian (EMG) Distribution. A Thesis. Submitted to the Faculty
Practical Applications and Properties of the Exponentially Modified Gaussian (EMG) Distribution A Thesis Submitted to the Faculty of Drexel University by Scott Haney in partial fulfillment of the requirements
More informationUnsupervised Learning Techniques Class 07, 1 March 2006 Andrea Caponnetto
Unsupervised Learning Techniques 9.520 Class 07, 1 March 2006 Andrea Caponnetto About this class Goal To introduce some methods for unsupervised learning: Gaussian Mixtures, K-Means, ISOMAP, HLLE, Laplacian
More informationSub-Gaussian Estimators of the Mean of a Random Matrix with Entries Possessing Only Two Moments
Sub-Gaussian Estimators of the Mean of a Random Matrix with Entries Possessing Only Two Moments Stas Minsker University of Southern California July 21, 2016 ICERM Workshop Simple question: how to estimate
More informationSemiparametric Mixed Effects Models with Flexible Random Effects Distribution
Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,
More informationMulti Omics Clustering. ABDBM Ron Shamir
Multi Omics Clustering ABDBM Ron Shamir 1 Outline Introduction Cluster of Clusters (COCA) icluster Nonnegative Matrix Factorization (NMF) Similarity Network Fusion (SNF) Multiple Kernel Learning (MKL)
More informationUsing Multiple Kernel-based Regularization for Linear System Identification
Using Multiple Kernel-based Regularization for Linear System Identification What are the Structure Issues in System Identification? with coworkers; see last slide Reglerteknik, ISY, Linköpings Universitet
More informationV003 How Reliable Is Statistical Wavelet Estimation?
V003 How Reliable Is Statistical Wavelet Estimation? J.A. Edgar* (BG Group) & M. van der Baan (University of Alberta) SUMMARY Well logs are often used for the estimation of seismic wavelets. The phase
More informationLatent Variable models for GWAs
Latent Variable models for GWAs Oliver Stegle Machine Learning and Computational Biology Research Group Max-Planck-Institutes Tübingen, Germany September 2011 O. Stegle Latent variable models for GWAs
More informationExpression Data Exploration: Association, Patterns, Factors & Regression Modelling
Expression Data Exploration: Association, Patterns, Factors & Regression Modelling Exploring gene expression data Scale factors, median chip correlation on gene subsets for crude data quality investigation
More informationTheory of Maximum Likelihood Estimation. Konstantin Kashin
Gov 2001 Section 5: Theory of Maximum Likelihood Estimation Konstantin Kashin February 28, 2013 Outline Introduction Likelihood Examples of MLE Variance of MLE Asymptotic Properties What is Statistical
More informationCausal Discovery by Computer
Causal Discovery by Computer Clark Glymour Carnegie Mellon University 1 Outline 1. A century of mistakes about causation and discovery: 1. Fisher 2. Yule 3. Spearman/Thurstone 2. Search for causes is statistical
More informationNonconvex penalties: Signal-to-noise ratio and algorithms
Nonconvex penalties: Signal-to-noise ratio and algorithms Patrick Breheny March 21 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/22 Introduction In today s lecture, we will return to nonconvex
More informationThe Learning Problem and Regularization Class 03, 11 February 2004 Tomaso Poggio and Sayan Mukherjee
The Learning Problem and Regularization 9.520 Class 03, 11 February 2004 Tomaso Poggio and Sayan Mukherjee About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing
More informationDecision-making, inference, and learning theory. ECE 830 & CS 761, Spring 2016
Decision-making, inference, and learning theory ECE 830 & CS 761, Spring 2016 1 / 22 What do we have here? Given measurements or observations of some physical process, we ask the simple question what do
More information1 Differentiable manifolds and smooth maps. (Solutions)
1 Differentiable manifolds and smooth maps Solutions Last updated: March 17 2011 Problem 1 The state of the planar pendulum is entirely defined by the position of its moving end in the plane R 2 Since
More information1.1 Basis of Statistical Decision Theory
ECE598: Information-theoretic methods in high-dimensional statistics Spring 2016 Lecture 1: Introduction Lecturer: Yihong Wu Scribe: AmirEmad Ghassami, Jan 21, 2016 [Ed. Jan 31] Outline: Introduction of
More information