Machine learning methods to infer drug-target interaction network
|
|
- Abraham Ryan
- 5 years ago
- Views:
Transcription
1 Machine learning methods to infer drug-target interaction network Yoshihiro Yamanishi Medical Institute of Bioregulation Kyushu University
2 Outline n Background Drug-target interaction network Chemical, genomic, and pharmacological spaces n Methods to predict drug-target interactions Chemogenomic approach Pharmacogenomic approach Algorithm n Results n Concluding remarks
3 Drug-target interaction Phenotypic effect Drug interaction Primary target protein Off-target protein Efficacy Side-effect Phenotypic scale Identification of interactions between drugs and target proteins is crucial in the drug development
4 Possible data sources Genomic space etc. Target proteins etc. Drug chemical structures Chemical space etc. Phenotypic effects Pharmacological space
5 Objective n Prediction of unknown drug-target interactions on a large scale from chemical, genomic and pharmacological data Known interactions Drug Target protein Unknown interactions (to be predicted)
6 Chemogenomic approach Genomic space etc. Target proteins Chemogenomics Pharmacogenomics etc. Drug chemical structures Chemical space etc. Phenotypic effects Pharmacological space
7 Examples of the data structures Drug Target protein MAHAAQVGLQDATSPIMEELITFHDHALMIIFLICFLVLYA LFLTLTTKLTNTNISDAQEMETVWTILPAIILVLIALPSLRIL YMTDEVNDPSLTIKSIGHQWYWTYEYTDYGGLIFNSYML PPLFLEPGDLRLLDVDNRVVLPIEAPIRMMITSQDVLHSW AVPTLGLKTDAIPGRLNQTTFTATRPGVYYGQCSEICGAN HSFMPIVLELIPLKIFEMGPVFTL Chemical graph structure Amino acid sequence
8 Chemogenomic approach Strategy: Chemically similar drugs are predicted to interact with simlar target proteins (Yamanishi et al, Bioifnormatics, 2008; Faulon et al., Bioinformatics, 2008; Jacob et al, Bioinformatics, 2008, Yabuuchi et al, Mol Sys Bio, 2011) Chemical structure similarity for drugs is evaluated by a graph kernel: (K x ) ij = k x (x i, x j ) for i, j =1, 2,..., n x (Mahe et al, J Chem Inf Model, 2005) Genomic sequence similarity for target proteins is evaluated by a string kernel: (K z ) ij = k x (z i, z j ) for i, j =1, 2,..., n z (Saigo et al, Bioinformatics, 2004)
9 Binary classification approach n Classification of drug-target pairs into the interaction class or non-interaction class Support Vector Machine (SVM) with pairwise kernels Faulon et al., Bioinformatics, 24: , 2008 Jacob and Vert, Bioinformatics, 24: , 2008 Yabuuchi et al, Mol Sys Bio, 2011
10 SVM with pairwise kernels (pairwise SVM) ordinary SVM : n i=1 f (x) = a i k(x i, x ") + b where x : an object pairwise SVM : n x n z i=1 f (x,z) = a i k((x,z) i,( x ", z ")) + b where (x,z) : a compound protein pair
11 Supervised bipartie graph inference Chemical space Interaction space Genomic space known drug Compounds with similar structures are close to each other known interaction Interacting drugs and targets are connected on the graph known target Proteins with similar sequences are close to each other
12 Step 1: embedding drugs and targets on the known graph into a unified feature space R d Chemical space Interaction space Genomic space known drug Compounds with similar structures are close to each other known interaction Interacting drugs and targets are close to each other known target Proteins with similar sequences are close to each other
13 Step 2. Learning a model between the chemical/ genomic space and the interaction space Chemical space Interaction space Genomic space f x f z known drug Compounds with similar structures are close to each other known interaction Interacting drugs and targets are close to each other known target Proteins with similar sequences are close to each other
14 Step 3. Predicting unknown interactions involving new compounds/proteins after the projection Chemical space Interaction space Genomic space f x f z new compound New compounds are mapped with fx new protein New proteins are mapped with fz
15 Step 3. Predicting unknown interactions involving new compounds/proteins after the projection Chemical space Interaction space Genomic space f x f z new compound New compounds are mapped with fx predicted interaction Connect compound-protein pairs which are closer than a threshold new protein New proteins are mapped with fz
16 Mappting to a unified space Let us consider two functions to map each compound x and each protein z onto a unified Euclidian space f x (x) = ( f (1) (d % x (x),, f ) x (x)) T R d f z (z) = ( f (1) (d % z (z),, f ) z (z)) T R d We find f x and f z which minimize R( f x, f z ) = ( f x (x i ) f z (z j )) 2 + λ 1 f x 2 +λ 2 f z 2 (x i,z j ) E (xi,z j ) V x V z ( f x (x i ) f z (z j )) 2 where V x (resp. V z ) is a set of drugs (resp. target proteins), E is a set of interactions, and λ 1 and λ 2 are regularization parameters (Yamanishi, Adv Neural Inf Process Syst 21, 2009)
17 Extraction of multiple features Succesive features f x (q) and f z (q ) (q =1,2,,d) are obtained by " f x (q) f z (q) % ' = argmin (x i,z j ) E (x i,z j ) V x V z ( f x (x i ) f z (z j )) 2 + λ 1 f x 2 +λ 2 f z 2 ( f x (x i ) f z (z j )) 2 under the following orthogonality constraints: f x f x (1),, f x (q 1), f z f z (1),, f z (q 1) Prediction score for a given pair of compound x " and protein z " : g( x ", " d q =1 z ) = f (q) (q x ( x ") f ) z ( z ")
18 Algorithm By the representer theorem, features can be expanded as n x f x (x) = α j k x (x j, x), f z (z) = β j k z (z j, z) j=1 Kernel Gram matrices: n z j=1 (K x ) ij = k x (x i, x j ), i, j =1, 2,, n x (K z ) ij = k z (z i, z j ), i, j =1, 2,, n z Norms of features: f x 2 = α T K x α, f z 2 = β T K z β, where α = (α 1,,α nx ) T R n x, β = (β 1,, β n z ) T R n z
19 It is reduced to the generalized eigenvalue problem: K x D x K x + λ 1 K x K x AK z K x A T K x K z D z K z + λ 2 K z " % ' ' α β " % ' ' = ρ K x K z 2 " % ' ' α β " % ' ' The solution can be obtained by finding α q and β q which minimizes R(α, β) = α β! " % T K x D x K x K x AK z K z A T K x K z D z K z! " % α β! " % + λ 1α T K x α + λ 2 β T K z β α β! " % T K x K z 2! " % α β! " % under the following constraints: α T K x α 1 = = α T K x α q 1 = 0, β T K z β 1 = = β T K z β q 1 = 0 where D x (resp. D z ) : degree matrix of drugs (resp. target proteins), A : adjacency matrix of drug-target interactions
20 Drug-target interaction data for human: Gold standard data Statistics Number of drugs 1874 Number of target proteins (Total in human genome) Number of drug-target interactions 436 (23196) 6769 KEGG DRUG (December, 2011)
21 Cross-validation (CV) i) Pairwise CV (Missing interaction detection) ii) Blockwise CV I (new drug identification) Target protein i)????? D???? r??? u? g??????? ii) D r u g Target protein Training set? Test set iii) Blockwise CV II (new target identification) iii) D r u g Target protein Training set? Test set
22 Pharmacogenomic approach Genomic space etc. Target proteins Chemogenomics Pharmacogenomics etc. Drug chemical structures Chemical space etc. Phenotypic effects Pharmacological space
23 Pharmacogenomic approach Strategy: Phenotypically similar drugs are predicted to interact with simialr target proteins. (Campillos et al, Science, 2008; Yamamishi et al, Bioinformatics (ISMB2010), 2010; Atilas et al, J Comp Bio (RECOMB2010), 2011) Drug phenotypes can be represented by a profile of side-effects such as headache, hypertention, astriction, aortic stenosis, impotence, cardiac infarction, dyspnea, and many more.
24 Pharmacological similarity Each drug is represented by a profile y = (y 1, y 2,, y S ) T in which side-effect terms in the JAPIC database are coded as 1 or 0, respectively (S=17109). Pharmacological similarity: k phar (y i, y j ) = S w k y ik y k=1 jk S 2 w k y k=1 ik where w k = exp( d k 2 /σ 2 ), S 2 w k y k=1 jk, for i, j =1, 2,..., n y d k : frequency of the k-th side-effect
25 Chemical similarity vs. Pharmacological similarity Pharmacological effect similarity 1.0 Drugs targeting enzymes Chemical structure similarity
26 Chemical similarity vs Pharmacological similarity Chemical similarity Pharmacological similarity Chemical structure similarity Pharmacological effect similarity Interaction Not-interaction Interaction Not-interaction Interaction: drug pairs share the same target Non-interaction: drug pairs do not share the same target
27 Limitation of pharmacogenomic approach Problem: It is applicable only to marketed drugs for which side-effect information is available Proposed procedure: 1. Predict unknown pharmacological similarity from chemical structure similarity 2. Apply a bipartite graph inference method with pharmacological similarity for compounds and genomic sequence similarity for proteins
28 Step 1: Prediction of unknown pharmacological similarity from chemical structure similarity! Chemical similarity matrix: C = " A varient of regression model to predict missing parts: k y (y, y!) = f (k x (x, x!))+ε = u(x) T u( x!)+ε where u(x) = (u (1) (x),...,u (m) (x)) T : the underlying features C tt C pt! Pharmacological similarity matrix: P = " T C pt C pp where t : drugs with side-effect information, % P tt??? p : drugs with no side-effect information %
29 Step 2: Bipartite graph inference to predict unknown interactions A predictive model is learned based on pharmacological similarity for drugs genomic sequence similarity for targets proteins Prediction score for a given pair of compound y " and protein z " : g( y ", " d q =1 z ) = f (q) (q y ( y ") f ) z ( z ")
30 Comprehensive prediction of unknown drug-target interactions n Test drugs: all compounds in KEGG LIGAND and all drugs in KEGG DRUG n Test target proteins: all human proteins in KEGG GENES n All gold standard interaction data are used in the training Chemogenomic approach: 140 out of top 1000 predictions were confirmed in the literature Pharmacogenomic approach: 223 out of top 1000 predictions were confirmed in the literature
31 Conclusion n Drug-target interactions are more correlated with pharmacological similarity than with chemical structure similarity n The proposed method can predict unknown drugtarget interactions on a large scale the lack of need for 3D structure information of the target proteins the use of chemical, genomic, and pharmacological data in an integrated framework
32 Machine learning in chemoinformatics Lodhi, H. and Yamanishi, Y., Chemoinformatics and Advanced Machine Learning Perspectives: Complex Computational Methods and Collaborative Techniques, IGI Global, 2010.
33 Acknowledgements n Curie Institute, Inserm U900, Mines ParisTech Jean-Philippe Vert Véronique Stoven Kevin Bleakley Edouard Pauwels n Kyoto University Minoru Kanehisa Susumu Goto Michihiro Araki Masaaki Kotera
Machine learning for ligand-based virtual screening and chemogenomics!
Machine learning for ligand-based virtual screening and chemogenomics! Jean-Philippe Vert Institut Curie - INSERM U900 - Mines ParisTech In silico discovery of molecular probes and drug-like compounds:
More informationSupport Vector Machines (SVM) in bioinformatics. Day 1: Introduction to SVM
1 Support Vector Machines (SVM) in bioinformatics Day 1: Introduction to SVM Jean-Philippe Vert Bioinformatics Center, Kyoto University, Japan Jean-Philippe.Vert@mines.org Human Genome Center, University
More informationSupport Vector Machines: Kernels
Support Vector Machines: Kernels CS6780 Advanced Machine Learning Spring 2015 Thorsten Joachims Cornell University Reading: Murphy 14.1, 14.2, 14.4 Schoelkopf/Smola Chapter 7.4, 7.6, 7.8 Non-Linear Problems
More informationDrug-Target Interaction Prediction by Learning From Local Information and Neighbors
Bioinformatics Advance Access published November 7, 22 Drug-Target Interaction Prediction by Learning From Local Information and Neighbors Jian-Ping Mei, Chee-Keong Kwoh, Peng Yang, Xiao-Li Li,2, Jie Zheng
More informationCISC 636 Computational Biology & Bioinformatics (Fall 2016)
CISC 636 Computational Biology & Bioinformatics (Fall 2016) Predicting Protein-Protein Interactions CISC636, F16, Lec22, Liao 1 Background Proteins do not function as isolated entities. Protein-Protein
More informationSupport vector machines, Kernel methods, and Applications in bioinformatics
1 Support vector machines, Kernel methods, and Applications in bioinformatics Jean-Philippe.Vert@mines.org Ecole des Mines de Paris Computational Biology group Machine Learning in Bioinformatics conference,
More informationLink Mining for Kernel-based Compound-Protein Interaction Predictions Using a Chemogenomics Approach
Link Mining for Kernel-based Compound-Protein Interaction Predictions Using a Chemogenomics Approach Masahito Ohue 1,2,3,4*, akuro Yamazaki 3, omohiro Ban 4, and Yutaka Akiyama 1,2,3,4* 1 Department of
More informationFlaws in evaluation schemes for pair-input computational predictions
Supplementary Information Flaws in evaluation schemes for pair-input computational predictions Yungki Park 1 & Edward M. Marcotte 1 1 Center for Systems and Synthetic Biology, Institute of Cellular and
More informationOn learning with kernels for unordered pairs
Martial Hue MartialHue@mines-paristechfr Jean-Philippe Vert Jean-PhilippeVert@mines-paristechfr Mines ParisTech, Centre for Computational Biology, 35 rue Saint Honoré, F-77300 Fontainebleau, France Institut
More informationStatistical learning theory, Support vector machines, and Bioinformatics
1 Statistical learning theory, Support vector machines, and Bioinformatics Jean-Philippe.Vert@mines.org Ecole des Mines de Paris Computational Biology group ENS Paris, november 25, 2003. 2 Overview 1.
More informationMetabolic networks: Activity detection and Inference
1 Metabolic networks: Activity detection and Inference Jean-Philippe.Vert@mines.org Ecole des Mines de Paris Computational Biology group Advanced microarray analysis course, Elsinore, Denmark, May 21th,
More informationhsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference
CS 229 Project Report (TR# MSB2010) Submitted 12/10/2010 hsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference Muhammad Shoaib Sehgal Computer Science
More informationStatistical learning on graphs
Statistical learning on graphs Jean-Philippe Vert Jean-Philippe.Vert@ensmp.fr ParisTech, Ecole des Mines de Paris Institut Curie INSERM U900 Seminar of probabilities, Institut Joseph Fourier, Grenoble,
More informationA Least Squares Formulation for Canonical Correlation Analysis
A Least Squares Formulation for Canonical Correlation Analysis Liang Sun, Shuiwang Ji, and Jieping Ye Department of Computer Science and Engineering Arizona State University Motivation Canonical Correlation
More informationAbstract. 1 Introduction 1 INTRODUCTION 1
1 INTRODUCTION 1 Predicting drug-target interactions for new drug compounds using a weighted nearest neighbor profile Twan van Laarhoven 1, Elena Marchiori 1, 1 Institute for Computing and Information
More informationXia Ning,*, Huzefa Rangwala, and George Karypis
J. Chem. Inf. Model. XXXX, xxx, 000 A Multi-Assay-Based Structure-Activity Relationship Models: Improving Structure-Activity Relationship Models by Incorporating Activity Information from Related Targets
More informationReducing Multiclass to Binary: A Unifying Approach for Margin Classifiers
Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers Erin Allwein, Robert Schapire and Yoram Singer Journal of Machine Learning Research, 1:113-141, 000 CSE 54: Seminar on Learning
More informationEach new feature uses a pair of the original features. Problem: Mapping usually leads to the number of features blow up!
Feature Mapping Consider the following mapping φ for an example x = {x 1,...,x D } φ : x {x1,x 2 2,...,x 2 D,,x 2 1 x 2,x 1 x 2,...,x 1 x D,...,x D 1 x D } It s an example of a quadratic mapping Each new
More informationClassification. The goal: map from input X to a label Y. Y has a discrete set of possible values. We focused on binary Y (values 0 or 1).
Regression and PCA Classification The goal: map from input X to a label Y. Y has a discrete set of possible values We focused on binary Y (values 0 or 1). But we also discussed larger number of classes
More informationCIS 520: Machine Learning Oct 09, Kernel Methods
CIS 520: Machine Learning Oct 09, 207 Kernel Methods Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture They may or may not cover all the material discussed
More informationNonlinear Dimensionality Reduction
Nonlinear Dimensionality Reduction Piyush Rai CS5350/6350: Machine Learning October 25, 2011 Recap: Linear Dimensionality Reduction Linear Dimensionality Reduction: Based on a linear projection of the
More informationMachine Learning Linear Models
Machine Learning Linear Models Outline II - Linear Models 1. Linear Regression (a) Linear regression: History (b) Linear regression with Least Squares (c) Matrix representation and Normal Equation Method
More informationExplainable AI that Can be Used for Judgment with Responsibility
Explain AI@Imperial Workshop Explainable AI that Can be Used for Judgment with Responsibility FUJITSU LABORATORIES LTD. 5th April 08 About me Name: Hajime Morita Research interests: Natural Language Processing
More informationPrediction and Classif ication of Human G-protein Coupled Receptors Based on Support Vector Machines
Article Prediction and Classif ication of Human G-protein Coupled Receptors Based on Support Vector Machines Yun-Fei Wang, Huan Chen, and Yan-Hong Zhou* Hubei Bioinformatics and Molecular Imaging Key Laboratory,
More informationKernel-based Machine Learning for Virtual Screening
Kernel-based Machine Learning for Virtual Screening Dipl.-Inf. Matthias Rupp Beilstein Endowed Chair for Chemoinformatics Johann Wolfgang Goethe-University Frankfurt am Main, Germany 2008-04-11, Helmholtz
More informationKernels for small molecules
Kernels for small molecules Günter Klambauer June 23, 2015 Contents 1 Citation and Reference 1 2 Graph kernels 2 2.1 Implementation............................ 2 2.2 The Spectrum Kernel........................
More informationBayesian Data Fusion with Gaussian Process Priors : An Application to Protein Fold Recognition
Bayesian Data Fusion with Gaussian Process Priors : An Application to Protein Fold Recognition Mar Girolami 1 Department of Computing Science University of Glasgow girolami@dcs.gla.ac.u 1 Introduction
More informationPackage Rchemcpp. August 14, 2018
Type Package Title Similarity measures for chemical compounds Version 2.19.0 Date 2015-06-23 Author Michael Mahr, Guenter Klambauer Package Rchemcpp August 14, 2018 Maintainer Guenter Klambauer
More informationEfficient Complex Output Prediction
Efficient Complex Output Prediction Florence d Alché-Buc Joint work with Romain Brault, Alex Lambert, Maxime Sangnier October 12, 2017 LTCI, Télécom ParisTech, Institut-Mines Télécom, Université Paris-Saclay
More informationEE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015
EE613 Machine Learning for Engineers Kernel methods Support Vector Machines jean-marc odobez 2015 overview Kernel methods introductions and main elements defining kernels Kernelization of k-nn, K-Means,
More informationKernels and the Kernel Trick. Machine Learning Fall 2017
Kernels and the Kernel Trick Machine Learning Fall 2017 1 Support vector machines Training by maximizing margin The SVM objective Solving the SVM optimization problem Support vectors, duals and kernels
More informationStatistical Methods for SVM
Statistical Methods for SVM Support Vector Machines Here we approach the two-class classification problem in a direct way: We try and find a plane that separates the classes in feature space. If we cannot,
More informationThe Kernel Trick, Gram Matrices, and Feature Extraction. CS6787 Lecture 4 Fall 2017
The Kernel Trick, Gram Matrices, and Feature Extraction CS6787 Lecture 4 Fall 2017 Momentum for Principle Component Analysis CS6787 Lecture 3.1 Fall 2017 Principle Component Analysis Setting: find the
More informationKernel Methods. Outline
Kernel Methods Quang Nguyen University of Pittsburgh CS 3750, Fall 2011 Outline Motivation Examples Kernels Definitions Kernel trick Basic properties Mercer condition Constructing feature space Hilbert
More informationMachine Learning. B. Unsupervised Learning B.2 Dimensionality Reduction. Lars Schmidt-Thieme, Nicolas Schilling
Machine Learning B. Unsupervised Learning B.2 Dimensionality Reduction Lars Schmidt-Thieme, Nicolas Schilling Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University
More informationUniversal Learning Technology: Support Vector Machines
Special Issue on Information Utilizing Technologies for Value Creation Universal Learning Technology: Support Vector Machines By Vladimir VAPNIK* This paper describes the Support Vector Machine (SVM) technology,
More informationKaggle.
Administrivia Mini-project 2 due April 7, in class implement multi-class reductions, naive bayes, kernel perceptron, multi-class logistic regression and two layer neural networks training set: Project
More informationSupport'Vector'Machines. Machine(Learning(Spring(2018 March(5(2018 Kasthuri Kannan
Support'Vector'Machines Machine(Learning(Spring(2018 March(5(2018 Kasthuri Kannan kasthuri.kannan@nyumc.org Overview Support Vector Machines for Classification Linear Discrimination Nonlinear Discrimination
More informationSVMs: Non-Separable Data, Convex Surrogate Loss, Multi-Class Classification, Kernels
SVMs: Non-Separable Data, Convex Surrogate Loss, Multi-Class Classification, Kernels Karl Stratos June 21, 2018 1 / 33 Tangent: Some Loose Ends in Logistic Regression Polynomial feature expansion in logistic
More informationCanonical Correlation Analysis with Kernels
Canonical Correlation Analysis with Kernels Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Computational Diagnostics Group Seminar 2003 Mar 10 1 Overview
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 18, 2016 Outline One versus all/one versus one Ranking loss for multiclass/multilabel classification Scaling to millions of labels Multiclass
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationOutline. Motivation. Mapping the input space to the feature space Calculating the dot product in the feature space
to The The A s s in to Fabio A. González Ph.D. Depto. de Ing. de Sistemas e Industrial Universidad Nacional de Colombia, Bogotá April 2, 2009 to The The A s s in 1 Motivation Outline 2 The Mapping the
More informationFeature gene selection method based on logistic and correlation information entropy
Bio-Medical Materials and Engineering 26 (2015) S1953 S1959 DOI 10.3233/BME-151498 IOS Press S1953 Feature gene selection method based on logistic and correlation information entropy Jiucheng Xu a,b,,
More informationKernel Methods. Machine Learning A W VO
Kernel Methods Machine Learning A 708.063 07W VO Outline 1. Dual representation 2. The kernel concept 3. Properties of kernels 4. Examples of kernel machines Kernel PCA Support vector regression (Relevance
More informationUniversität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Linear Classifiers. Blaine Nelson, Tobias Scheffer
Universität Potsdam Institut für Informatik Lehrstuhl Linear Classifiers Blaine Nelson, Tobias Scheffer Contents Classification Problem Bayesian Classifier Decision Linear Classifiers, MAP Models Logistic
More informationPerceptron Revisited: Linear Separators. Support Vector Machines
Support Vector Machines Perceptron Revisited: Linear Separators Binary classification can be viewed as the task of separating classes in feature space: w T x + b > 0 w T x + b = 0 w T x + b < 0 Department
More informationAnalysis of N-terminal Acetylation data with Kernel-Based Clustering
Analysis of N-terminal Acetylation data with Kernel-Based Clustering Ying Liu Department of Computational Biology, School of Medicine University of Pittsburgh yil43@pitt.edu 1 Introduction N-terminal acetylation
More informationSupport Vector Machines
Support Vector Machines Reading: Ben-Hur & Weston, A User s Guide to Support Vector Machines (linked from class web page) Notation Assume a binary classification problem. Instances are represented by vector
More informationLearning SVM Classifiers with Indefinite Kernels
Learning SVM Classifiers with Indefinite Kernels Suicheng Gu and Yuhong Guo Dept. of Computer and Information Sciences Temple University Support Vector Machines (SVMs) (Kernel) SVMs are widely used in
More informationChemogenomic: Approaches to Rational Drug Design. Jonas Skjødt Møller
Chemogenomic: Approaches to Rational Drug Design Jonas Skjødt Møller Chemogenomic Chemistry Biology Chemical biology Medical chemistry Chemical genetics Chemoinformatics Bioinformatics Chemoproteomics
More informationMachine learning for pervasive systems Classification in high-dimensional spaces
Machine learning for pervasive systems Classification in high-dimensional spaces Department of Communications and Networking Aalto University, School of Electrical Engineering stephan.sigg@aalto.fi Version
More information6.036 midterm review. Wednesday, March 18, 15
6.036 midterm review 1 Topics covered supervised learning labels available unsupervised learning no labels available semi-supervised learning some labels available - what algorithms have you learned that
More informationNon-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines
Non-Bayesian Classifiers Part II: Linear Discriminants and Support Vector Machines Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2018 CS 551, Fall
More informationMath for Machine Learning Open Doors to Data Science and Artificial Intelligence. Richard Han
Math for Machine Learning Open Doors to Data Science and Artificial Intelligence Richard Han Copyright 05 Richard Han All rights reserved. CONTENTS PREFACE... - INTRODUCTION... LINEAR REGRESSION... 4 LINEAR
More informationLinear & nonlinear classifiers
Linear & nonlinear classifiers Machine Learning Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Linear & nonlinear classifiers Fall 1394 1 / 34 Table
More informationWe choose parameter values that will minimize the difference between the model outputs & the true function values.
CSE 4502/5717 Big Data Analytics Lecture #16, 4/2/2018 with Dr Sanguthevar Rajasekaran Notes from Yenhsiang Lai Machine learning is the task of inferring a function, eg, f : R " R This inference has to
More informationMotif Extraction and Protein Classification
Motif Extraction and Protein Classification Vered Kunik 1 Zach Solan 2 Shimon Edelman 3 Eytan Ruppin 1 David Horn 2 1 School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel {kunikver,ruppin}@tau.ac.il
More informationGraph Wavelets to Analyze Genomic Data with Biological Networks
Graph Wavelets to Analyze Genomic Data with Biological Networks Yunlong Jiao and Jean-Philippe Vert "Emerging Topics in Biological Networks and Systems Biology" symposium, Swedish Collegium for Advanced
More informationLearning with kernels and SVM
Learning with kernels and SVM Šámalova chata, 23. května, 2006 Petra Kudová Outline Introduction Binary classification Learning with Kernels Support Vector Machines Demo Conclusion Learning from data find
More informationCSC2545 Topics in Machine Learning: Kernel Methods and Support Vector Machines
CSC2545 Topics in Machine Learning: Kernel Methods and Support Vector Machines A comprehensive introduc@on to SVMs and other kernel methods, including theory, algorithms and applica@ons. Instructor: Anthony
More informationMultiple Kernel Learning
CS 678A Course Project Vivek Gupta, 1 Anurendra Kumar 2 Sup: Prof. Harish Karnick 1 1 Department of Computer Science and Engineering 2 Department of Electrical Engineering Indian Institute of Technology,
More informationKernel Methods in Machine Learning
Kernel Methods in Machine Learning Autumn 2015 Lecture 1: Introduction Juho Rousu ICS-E4030 Kernel Methods in Machine Learning 9. September, 2015 uho Rousu (ICS-E4030 Kernel Methods in Machine Learning)
More information8.1 Concentration inequality for Gaussian random matrix (cont d)
MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration
More informationMachine Learning Practice Page 2 of 2 10/28/13
Machine Learning 10-701 Practice Page 2 of 2 10/28/13 1. True or False Please give an explanation for your answer, this is worth 1 pt/question. (a) (2 points) No classifier can do better than a naive Bayes
More informationGene Expression Data Classification with Revised Kernel Partial Least Squares Algorithm
Gene Expression Data Classification with Revised Kernel Partial Least Squares Algorithm Zhenqiu Liu, Dechang Chen 2 Department of Computer Science Wayne State University, Market Street, Frederick, MD 273,
More information10-701/ Recitation : Kernels
10-701/15-781 Recitation : Kernels Manojit Nandi February 27, 2014 Outline Mathematical Theory Banach Space and Hilbert Spaces Kernels Commonly Used Kernels Kernel Theory One Weird Kernel Trick Representer
More informationComputational Biology From The Perspective Of A Physical Scientist
Computational Biology From The Perspective Of A Physical Scientist Dr. Arthur Dong PP1@TUM 26 November 2013 Bioinformatics Education Curriculum Math, Physics, Computer Science (Statistics and Programming)
More informationFACTORIZATION MACHINES AS A TOOL FOR HEALTHCARE CASE STUDY ON TYPE 2 DIABETES DETECTION
SunLab Enlighten the World FACTORIZATION MACHINES AS A TOOL FOR HEALTHCARE CASE STUDY ON TYPE 2 DIABETES DETECTION Ioakeim (Kimis) Perros and Jimeng Sun perros@gatech.edu, jsun@cc.gatech.edu COMPUTATIONAL
More informationKernel Methods & Support Vector Machines
Kernel Methods & Support Vector Machines Mahdi pakdaman Naeini PhD Candidate, University of Tehran Senior Researcher, TOSAN Intelligent Data Miners Outline Motivation Introduction to pattern recognition
More informationi=1 cosn (x 2 i y2 i ) over RN R N. cos y sin x
Mehryar Mohri Foundations of Machine Learning Courant Institute of Mathematical Sciences Homework assignment 3 November 16, 017 Due: Dec 01, 017 A. Kernels Show that the following kernels K are PDS: 1.
More informationLearning Binary Classifiers for Multi-Class Problem
Research Memorandum No. 1010 September 28, 2006 Learning Binary Classifiers for Multi-Class Problem Shiro Ikeda The Institute of Statistical Mathematics 4-6-7 Minami-Azabu, Minato-ku, Tokyo, 106-8569,
More informationHYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH
HYPERGRAPH BASED SEMI-SUPERVISED LEARNING ALGORITHMS APPLIED TO SPEECH RECOGNITION PROBLEM: A NOVEL APPROACH Hoang Trang 1, Tran Hoang Loc 1 1 Ho Chi Minh City University of Technology-VNU HCM, Ho Chi
More informationSupport Vector Machines: Maximum Margin Classifiers
Support Vector Machines: Maximum Margin Classifiers Machine Learning and Pattern Recognition: September 16, 2008 Piotr Mirowski Based on slides by Sumit Chopra and Fu-Jie Huang 1 Outline What is behind
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2014 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin
More informationA GENERAL FORMULATION FOR SUPPORT VECTOR MACHINES. Wei Chu, S. Sathiya Keerthi, Chong Jin Ong
A GENERAL FORMULATION FOR SUPPORT VECTOR MACHINES Wei Chu, S. Sathiya Keerthi, Chong Jin Ong Control Division, Department of Mechanical Engineering, National University of Singapore 0 Kent Ridge Crescent,
More informationSVMs: nonlinearity through kernels
Non-separable data e-8. Support Vector Machines 8.. The Optimal Hyperplane Consider the following two datasets: SVMs: nonlinearity through kernels ER Chapter 3.4, e-8 (a) Few noisy data. (b) Nonlinearly
More informationDesigning Kernel Functions Using the Karhunen-Loève Expansion
July 7, 2004. Designing Kernel Functions Using the Karhunen-Loève Expansion 2 1 Fraunhofer FIRST, Germany Tokyo Institute of Technology, Japan 1,2 2 Masashi Sugiyama and Hidemitsu Ogawa Learning with Kernels
More informationLearning Classification with Auxiliary Probabilistic Information Quang Nguyen Hamed Valizadegan Milos Hauskrecht
Learning Classification with Auxiliary Probabilistic Information Quang Nguyen Hamed Valizadegan Milos Hauskrecht Computer Science Department University of Pittsburgh Outline Introduction Learning with
More informationRchemcpp Similarity measures for chemical compounds. Michael Mahr and Günter Klambauer. Institute of Bioinformatics, Johannes Kepler University Linz
Software Manual Institute of Bioinformatics, Johannes Kepler University Linz Rchemcpp Similarity measures for chemical compounds Michael Mahr and Günter Klambauer Institute of Bioinformatics, Johannes
More informationComputational Genomics. Reconstructing dynamic regulatory networks in multiple species
02-710 Computational Genomics Reconstructing dynamic regulatory networks in multiple species Methods for reconstructing networks in cells CRH1 SLT2 SLR3 YPS3 YPS1 Amit et al Science 2009 Pe er et al Recomb
More informationSupport Vector Machines
Support Vector Machines Here we approach the two-class classification problem in a direct way: We try and find a plane that separates the classes in feature space. If we cannot, we get creative in two
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationPredicting Protein Functions and Domain Interactions from Protein Interactions
Predicting Protein Functions and Domain Interactions from Protein Interactions Fengzhu Sun, PhD Center for Computational and Experimental Genomics University of Southern California Outline High-throughput
More informationarxiv: v1 [q-bio.qm] 21 Nov 2018
Structure-Based Networks for Drug Validation Cătălina Cangea 1, Arturas Grauslys 2, Pietro Liò 1 and Francesco Falciani 2 1 University of Cambridge 2 University of Liverpool arxiv:1811.09714v1 [q-bio.qm]
More informationStatistical Pattern Recognition
Statistical Pattern Recognition Feature Extraction Hamid R. Rabiee Jafar Muhammadi, Alireza Ghasemi, Payam Siyari Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Dimensionality Reduction
More informationA short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie
A short introduction to supervised learning, with applications to cancer pathway analysis Dr. Christina Leslie Computational Biology Program Memorial Sloan-Kettering Cancer Center http://cbio.mskcc.org/leslielab
More informationSupport Vector Machines.
Support Vector Machines www.cs.wisc.edu/~dpage 1 Goals for the lecture you should understand the following concepts the margin slack variables the linear support vector machine nonlinear SVMs the kernel
More informationspecies, if their corresponding mrnas share similar expression patterns, or if the proteins interact with one another. It seems natural that, while al
KERNEL-BASED DATA FUSION AND ITS APPLICATION TO PROTEIN FUNCTION PREDICTION IN YEAST GERT R. G. LANCKRIET Division of Electrical Engineering, University of California, Berkeley MINGHUA DENG Department
More informationStructural interpretation of QSAR models a universal approach
Methods and Applications of Computational Chemistry - 5 Kharkiv, Ukraine, 1 5 July 2013 Structural interpretation of QSAR models a universal approach Victor Kuz min, Pavel Polishchuk, Anatoly Artemenko,
More informationPredicting Unknown Interactions Between Known Drugs and Targets via Matrix Completion
Predicting Unknown Interactions Between Known Drugs and Targets via Matrix Completion Qing Liao 1(B), Naiyang Guan 2, Chengkun Wu 2, and Qian Zhang 1 1 Department of Computer Science and Engineering, Hong
More informationA Linear Programming Approach for Molecular QSAR analysis
Linear Programming pproach for Molecular QSR analysis Hiroto Saigo 1, Tadashi Kadowaki 2, and Koji Tsuda 1 1 Max Planck Institute for iological Cybernetics, Spemannstr. 38, 72076 Tübingen, Germany {hiroto.saigo,koji.tsuda}@tuebingen.mpg.de,
More informationStatistical Methods for Data Mining
Statistical Methods for Data Mining Kuangnan Fang Xiamen University Email: xmufkn@xmu.edu.cn Support Vector Machines Here we approach the two-class classification problem in a direct way: We try and find
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More informationPattern Recognition and Machine Learning. Perceptrons and Support Vector machines
Pattern Recognition and Machine Learning James L. Crowley ENSIMAG 3 - MMIS Fall Semester 2016 Lessons 6 10 Jan 2017 Outline Perceptrons and Support Vector machines Notation... 2 Perceptrons... 3 History...3
More informationMachine Learning and Data Mining. Support Vector Machines. Kalev Kask
Machine Learning and Data Mining Support Vector Machines Kalev Kask Linear classifiers Which decision boundary is better? Both have zero training error (perfect training accuracy) But, one of them seems
More informationPrediction in kernelized output spaces: output kernel trees and ensemble methods
Prediction in kernelized output spaces: output kernel trees and ensemble methods Pierre Geurts Florence d Alché Buc IBISC CNRS, Université d Evry, GENOPOLE, Evry, France Department of EE and CS, University
More informationLogistic Regression. COMP 527 Danushka Bollegala
Logistic Regression COMP 527 Danushka Bollegala Binary Classification Given an instance x we must classify it to either positive (1) or negative (0) class We can use {1,-1} instead of {1,0} but we will
More informationBasis Expansion and Nonlinear SVM. Kai Yu
Basis Expansion and Nonlinear SVM Kai Yu Linear Classifiers f(x) =w > x + b z(x) = sign(f(x)) Help to learn more general cases, e.g., nonlinear models 8/7/12 2 Nonlinear Classifiers via Basis Expansion
More informationSupport Vector Machines
Support Vector Machines Stephan Dreiseitl University of Applied Sciences Upper Austria at Hagenberg Harvard-MIT Division of Health Sciences and Technology HST.951J: Medical Decision Support Overview Motivation
More information