Machine learning methods to infer drug-target interaction network

Transcription:

Machine learning methods to infer drug-target interaction network. Yoshihiro Yamanishi, Medical Institute of Bioregulation, Kyushu University.

Outline
- Background: drug-target interaction network; chemical, genomic, and pharmacological spaces
- Methods to predict drug-target interactions: chemogenomic approach, pharmacogenomic approach, algorithm
- Results
- Concluding remarks

Drug-target interaction [diagram: a drug interacts with its primary target protein, producing efficacy, and with off-target proteins, producing side-effects; both are phenotypic effects observed at the phenotypic scale]. Identification of interactions between drugs and target proteins is crucial in drug development.

Possible data sources [diagram]: target proteins define a genomic space, drug chemical structures define a chemical space, and phenotypic effects define a pharmacological space.

Objective: prediction of unknown drug-target interactions on a large scale from chemical, genomic, and pharmacological data. [diagram: a bipartite graph of drugs and target proteins, with known interactions drawn as edges and unknown interactions to be predicted]

Chemogenomic approach [diagram: chemogenomics links the chemical space of drug structures with the genomic space of target proteins; pharmacogenomics links the pharmacological space of phenotypic effects with the genomic space].

Examples of the data structures: a drug is represented by its chemical graph structure; a target protein is represented by its amino acid sequence, e.g. MAHAAQVGLQDATSPIMEELITFHDHALMIIFLICFLVLYALFLTLTTKLTNTNISDAQEMETVWTILPAIILVLIALPSLRILYMTDEVNDPSLTIKSIGHQWYWTYEYTDYGGLIFNSYMLPPLFLEPGDLRLLDVDNRVVLPIEAPIRMMITSQDVLHSWAVPTLGLKTDAIPGRLNQTTFTATRPGVYYGQCSEICGANHSFMPIVLELIPLKIFEMGPVFTL.

Chemogenomic approach. Strategy: chemically similar drugs are predicted to interact with similar target proteins (Yamanishi et al., Bioinformatics, 2008; Faulon et al., Bioinformatics, 2008; Jacob et al., Bioinformatics, 2008; Yabuuchi et al., Mol Syst Biol, 2011). Chemical structure similarity for drugs is evaluated by a graph kernel: $(K_x)_{ij} = k_x(x_i, x_j)$ for $i, j = 1, 2, \ldots, n_x$ (Mahe et al., J Chem Inf Model, 2005). Genomic sequence similarity for target proteins is evaluated by a string kernel: $(K_z)_{ij} = k_z(z_i, z_j)$ for $i, j = 1, 2, \ldots, n_z$ (Saigo et al., Bioinformatics, 2004).
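
As a rough illustration of building the two Gram matrices, the sketch below uses two common stand-ins rather than the kernels cited in the talk (it is not the marginalized graph kernel of Mahe et al. or the local alignment kernel of Saigo et al.): Tanimoto similarity on binary fingerprints for drugs and a normalized k-mer spectrum kernel for protein sequences. The fingerprint and sequence inputs are hypothetical toy data.

```python
import numpy as np
from collections import Counter

def tanimoto_gram(fps):
    """Gram matrix of Tanimoto similarities between binary fingerprint vectors."""
    F = np.asarray(fps, dtype=float)
    inter = F @ F.T                                   # shared "on" bits
    counts = F.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    return inter / np.maximum(union, 1.0)

def spectrum_gram(seqs, k=3):
    """Gram matrix of a normalized k-mer spectrum kernel between sequences."""
    counters = [Counter(s[i:i + k] for i in range(len(s) - k + 1)) for s in seqs]
    def dot(a, b):
        return sum(v * b.get(kmer, 0) for kmer, v in a.items())
    n = len(seqs)
    K = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = dot(counters[i], counters[j])
    d = np.sqrt(np.diag(K))
    return K / np.outer(d, d)

# Hypothetical toy inputs: 3 drugs as 8-bit fingerprints, 3 proteins as sequences.
K_x = tanimoto_gram([[1, 0, 1, 1, 0, 0, 1, 0],
                     [1, 1, 1, 0, 0, 0, 1, 0],
                     [0, 0, 1, 1, 1, 1, 0, 0]])
K_z = spectrum_gram(["MAHAAQVGLQDATSPIM", "MAHAAQVGLKDATAPIM", "MKTAYIAKQRQISFVKS"])
```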

Binary classification approach: classification of drug-target pairs into the interaction class or the non-interaction class, using a Support Vector Machine (SVM) with pairwise kernels (Faulon et al., Bioinformatics, 24:225-233, 2008; Jacob and Vert, Bioinformatics, 24:2149-2156, 2008; Yabuuchi et al., Mol Syst Biol, 2011).

SVM with pairwise kernels (pairwise SVM). Ordinary SVM: $f(x') = \sum_{i=1}^{n} a_i k(x_i, x') + b$, where $x'$ is an object. Pairwise SVM: $f(x', z') = \sum_{i=1}^{n_x n_z} a_i k((x, z)_i, (x', z')) + b$, where $(x', z')$ is a compound-protein pair.
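
A minimal sketch of a pairwise SVM, assuming the tensor-product (Kronecker) pairwise kernel $k((x,z),(x',z')) = k_x(x,x')\,k_z(z,z')$ often used in chemogenomic SVMs; the drug kernel, protein kernel, and interaction labels below are hypothetical. It uses scikit-learn's SVC with a precomputed kernel.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical kernels and labels: 5 drugs, 4 proteins.
rng = np.random.default_rng(0)
A = rng.random((5, 5)); K_x = A @ A.T          # drug kernel (positive semi-definite)
B = rng.random((4, 4)); K_z = B @ B.T          # protein kernel
Y = np.array([[1, 0, 0, 1],                    # 0/1 interaction matrix (toy)
              [0, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])

# Kronecker product gives the pairwise Gram matrix over all drug-protein pairs,
# in the same (row-major) order as the flattened label matrix.
K_pair = np.kron(K_x, K_z)                     # shape (5*4, 5*4)
y = Y.reshape(-1)                              # one label per (drug, protein) pair

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(K_pair, y)                             # f(x,z) = sum_i a_i k((x,z)_i,(x,z)) + b
scores = clf.decision_function(K_pair)         # decision values for the training pairs
```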

Supervised bipartite graph inference [diagram of the chemical space, interaction space, and genomic space]: compounds with similar structures are close to each other in the chemical space; proteins with similar sequences are close to each other in the genomic space; known interacting drugs and targets are connected on the interaction graph.

Step 1: embed the drugs and targets of the known graph into a unified feature space $\mathbb{R}^d$ (the interaction space), in which compounds with similar structures are close to each other, proteins with similar sequences are close to each other, and interacting drugs and targets are close to each other.

Step 2: learn mappings $f_x$ (from the chemical space) and $f_z$ (from the genomic space) into the interaction space, such that interacting drugs and targets are close to each other.

Step 3: predict unknown interactions involving new compounds and proteins after the projection. New compounds are mapped into the interaction space with $f_x$ and new proteins with $f_z$; compound-protein pairs that are closer than a threshold are connected as predicted interactions.

Mapping to a unified space. Consider two functions that map each compound $x$ and each protein $z$ onto a unified Euclidean space: $f_x(x) = (f_x^{(1)}(x), \ldots, f_x^{(d)}(x))^T \in \mathbb{R}^d$ and $f_z(z) = (f_z^{(1)}(z), \ldots, f_z^{(d)}(z))^T \in \mathbb{R}^d$. We find $f_x$ and $f_z$ that minimize
$$R(f_x, f_z) = \frac{\sum_{(x_i, z_j) \in E} \big(f_x(x_i) - f_z(z_j)\big)^2 + \lambda_1 \|f_x\|^2 + \lambda_2 \|f_z\|^2}{\sum_{(x_i, z_j) \in V_x \times V_z} \big(f_x(x_i) - f_z(z_j)\big)^2},$$
where $V_x$ (resp. $V_z$) is the set of drugs (resp. target proteins), $E$ is the set of interactions, and $\lambda_1$ and $\lambda_2$ are regularization parameters (Yamanishi, Adv Neural Inf Process Syst 21, 2009).

Extraction of multiple features. Successive features $f_x^{(q)}$ and $f_z^{(q)}$ ($q = 1, 2, \ldots, d$) are obtained by
$$\big(f_x^{(q)}, f_z^{(q)}\big) = \arg\min \frac{\sum_{(x_i, z_j) \in E} \big(f_x(x_i) - f_z(z_j)\big)^2 + \lambda_1 \|f_x\|^2 + \lambda_2 \|f_z\|^2}{\sum_{(x_i, z_j) \in V_x \times V_z} \big(f_x(x_i) - f_z(z_j)\big)^2}$$
under the orthogonality constraints $f_x \perp f_x^{(1)}, \ldots, f_x^{(q-1)}$ and $f_z \perp f_z^{(1)}, \ldots, f_z^{(q-1)}$. The prediction score for a given pair of compound $x'$ and protein $z'$ is $g(x', z') = \sum_{q=1}^{d} f_x^{(q)}(x') f_z^{(q)}(z')$.

Algorithm. By the representer theorem, the features can be expanded as $f_x(x) = \sum_{j=1}^{n_x} \alpha_j k_x(x_j, x)$ and $f_z(z) = \sum_{j=1}^{n_z} \beta_j k_z(z_j, z)$. Kernel Gram matrices: $(K_x)_{ij} = k_x(x_i, x_j)$ for $i, j = 1, 2, \ldots, n_x$ and $(K_z)_{ij} = k_z(z_i, z_j)$ for $i, j = 1, 2, \ldots, n_z$. Norms of the features: $\|f_x\|^2 = \alpha^T K_x \alpha$ and $\|f_z\|^2 = \beta^T K_z \beta$, where $\alpha = (\alpha_1, \ldots, \alpha_{n_x})^T \in \mathbb{R}^{n_x}$ and $\beta = (\beta_1, \ldots, \beta_{n_z})^T \in \mathbb{R}^{n_z}$.

The problem reduces to the generalized eigenvalue problem
$$\begin{pmatrix} K_x D_x K_x + \lambda_1 K_x & -K_x A K_z \\ -K_z A^T K_x & K_z D_z K_z + \lambda_2 K_z \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = \rho \begin{pmatrix} K_x^2 & 0 \\ 0 & K_z^2 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}.$$
The solution is obtained by finding $\alpha_q$ and $\beta_q$ that minimize
$$R(\alpha, \beta) = \frac{\begin{pmatrix} \alpha \\ \beta \end{pmatrix}^T \begin{pmatrix} K_x D_x K_x & -K_x A K_z \\ -K_z A^T K_x & K_z D_z K_z \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \lambda_1 \alpha^T K_x \alpha + \lambda_2 \beta^T K_z \beta}{\begin{pmatrix} \alpha \\ \beta \end{pmatrix}^T \begin{pmatrix} K_x^2 & 0 \\ 0 & K_z^2 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix}}$$
under the constraints $\alpha^T K_x \alpha_1 = \cdots = \alpha^T K_x \alpha_{q-1} = 0$ and $\beta^T K_z \beta_1 = \cdots = \beta^T K_z \beta_{q-1} = 0$, where $D_x$ (resp. $D_z$) is the degree matrix of drugs (resp. target proteins) and $A$ is the adjacency matrix of drug-target interactions.
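
A minimal numerical sketch of this step, assuming the kernel matrices $K_x$, $K_z$ and the adjacency matrix $A$ are given: it assembles the block matrices from the slide, solves the generalized eigenproblem with scipy, and forms the prediction score $g$. The small ridge `eps` and the function names are assumptions added for numerical stability and readability, not part of the original method.

```python
import numpy as np
from scipy.linalg import eigh

def fit_bipartite_embedding(K_x, K_z, A, lam1=0.1, lam2=0.1, d=5, eps=1e-8):
    """Solve the generalized eigenproblem above and return alpha, beta.

    K_x: (n_x, n_x) drug kernel, K_z: (n_z, n_z) target kernel,
    A: (n_x, n_z) 0/1 drug-target adjacency matrix.
    """
    D_x = np.diag(A.sum(axis=1))                  # degree matrix of drugs
    D_z = np.diag(A.sum(axis=0))                  # degree matrix of targets
    top = np.hstack([K_x @ D_x @ K_x + lam1 * K_x, -K_x @ A @ K_z])
    bot = np.hstack([-K_z @ A.T @ K_x, K_z @ D_z @ K_z + lam2 * K_z])
    L = np.vstack([top, bot])                     # left-hand block matrix
    M = np.block([[K_x @ K_x, np.zeros((K_x.shape[0], K_z.shape[0]))],
                  [np.zeros((K_z.shape[0], K_x.shape[0])), K_z @ K_z]])
    n = L.shape[0]
    vals, vecs = eigh(L + eps * np.eye(n), M + eps * np.eye(n))
    W = vecs[:, :d]                               # d smallest generalized eigenvectors
    return W[:K_x.shape[0], :], W[K_x.shape[0]:, :]

def predict_scores(k_x_new, k_z_new, alpha, beta):
    """g(x', z') = sum_q f_x^(q)(x') f_z^(q)(z') for new kernel profiles."""
    f_x = k_x_new @ alpha                         # (m_x, d) features of new compounds
    f_z = k_z_new @ beta                          # (m_z, d) features of new proteins
    return f_x @ f_z.T                            # (m_x, m_z) prediction scores
```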

Drug-target interaction data for human: gold standard data (KEGG DRUG, December 2011).
Number of drugs: 1,874
Number of target proteins: 436 (out of 23,196 in the human genome)
Number of drug-target interactions: 6,769

Cross-validation (CV) schemes [diagram: the drug x target-protein interaction matrix split into training and test cells]:
i) Pairwise CV: randomly chosen drug-target pairs are held out (missing-interaction detection).
ii) Blockwise CV I: all pairs involving held-out drugs form the test set (new-drug identification).
iii) Blockwise CV II: all pairs involving held-out target proteins form the test set (new-target identification).
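
A minimal sketch of how the three split schemes can be generated over the interaction matrix; the function name, fold count, and matrix sizes are assumptions for illustration.

```python
import numpy as np

def cv_masks(n_x, n_z, scheme, n_folds=5, seed=0):
    """Yield boolean test masks over the (n_x, n_z) interaction matrix.

    scheme = "pair":   pairwise CV, random drug-target cells held out
    scheme = "drug":   blockwise CV I, whole drug rows held out (new drugs)
    scheme = "target": blockwise CV II, whole target columns held out (new targets)
    """
    rng = np.random.default_rng(seed)
    if scheme == "pair":
        for fold in np.array_split(rng.permutation(n_x * n_z), n_folds):
            mask = np.zeros(n_x * n_z, dtype=bool)
            mask[fold] = True
            yield mask.reshape(n_x, n_z)
    elif scheme == "drug":
        for fold in np.array_split(rng.permutation(n_x), n_folds):
            mask = np.zeros((n_x, n_z), dtype=bool)
            mask[fold, :] = True
            yield mask
    elif scheme == "target":
        for fold in np.array_split(rng.permutation(n_z), n_folds):
            mask = np.zeros((n_x, n_z), dtype=bool)
            mask[:, fold] = True
            yield mask

# Example: five blockwise-CV-I folds over a 1874 x 436 interaction matrix.
for test_mask in cv_masks(1874, 436, scheme="drug"):
    pass  # train on interactions outside test_mask; evaluate on held-out drug rows
```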

Pharmacogenomic approach [same diagram as above: pharmacogenomics links the pharmacological space of phenotypic effects with the genomic space of target proteins].

Pharmacogenomic approach. Strategy: phenotypically similar drugs are predicted to interact with similar target proteins (Campillos et al., Science, 2008; Yamanishi et al., Bioinformatics (ISMB2010), 2010; Atias et al., J Comp Biol (RECOMB2010), 2011). Drug phenotypes can be represented by a profile of side-effects such as headache, hypertension, astriction, aortic stenosis, impotence, cardiac infarction, dyspnea, and many more.

Pharmacological similarity. Each drug is represented by a profile $y = (y_1, y_2, \ldots, y_S)^T$ in which the 17,109 side-effect terms in the JAPIC database are coded as 1 (present) or 0 (absent), so $S = 17109$. Pharmacological similarity:
$$k_{\mathrm{phar}}(y_i, y_j) = \frac{\sum_{k=1}^{S} w_k\, y_{ik}\, y_{jk}}{\sqrt{\sum_{k=1}^{S} w_k\, y_{ik}^2}\; \sqrt{\sum_{k=1}^{S} w_k\, y_{jk}^2}}, \quad i, j = 1, 2, \ldots, n_y,$$
where $w_k = \exp(-d_k^2 / \sigma^2)$ and $d_k$ is the frequency of the $k$-th side-effect.
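
A minimal numpy sketch of this weighted similarity, assuming the weighted cosine form reconstructed above; the toy side-effect profiles, the use of term frequency within the dataset for $d_k$, and the value of $\sigma$ are assumptions.

```python
import numpy as np

def pharmacological_similarity(Y, sigma=1.0):
    """Weighted cosine-type similarity between binary side-effect profiles.

    Y: (n_drugs, S) 0/1 matrix of side-effect annotations. Frequent terms are
    down-weighted via w_k = exp(-d_k^2 / sigma^2), where d_k is taken here as
    the frequency of the k-th term in the dataset (a sketch of the slide's formula).
    """
    Y = np.asarray(Y, dtype=float)
    d = Y.mean(axis=0)                       # frequency of each side-effect term
    w = np.exp(-(d ** 2) / sigma ** 2)
    G = (Y * w) @ Y.T                        # weighted inner products sum_k w_k y_ik y_jk
    norms = np.sqrt(np.diag(G))
    return G / np.outer(norms, norms)

# Hypothetical toy profiles over 6 side-effect terms for 3 drugs.
K_phar = pharmacological_similarity([[1, 0, 1, 0, 0, 1],
                                     [1, 1, 1, 0, 0, 0],
                                     [0, 0, 0, 1, 1, 1]])
```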

Chemical similarity vs. pharmacological similarity [scatter plot for drugs targeting enzymes: pharmacological effect similarity (y-axis, 0.0-1.0) against chemical structure similarity (x-axis, 0.0-1.0)].

Chemical similarity vs. pharmacological similarity [boxplots of chemical structure similarity and pharmacological effect similarity for interacting vs. non-interacting drug pairs]. Interaction: drug pairs that share the same target; non-interaction: drug pairs that do not share the same target.

Limitation of the pharmacogenomic approach. Problem: it is applicable only to marketed drugs for which side-effect information is available. Proposed procedure: 1. Predict unknown pharmacological similarity from chemical structure similarity. 2. Apply a bipartite graph inference method with pharmacological similarity for compounds and genomic sequence similarity for proteins.

Step 1: prediction of unknown pharmacological similarity from chemical structure similarity. Chemical similarity matrix: $C = \begin{pmatrix} C_{tt} & C_{pt}^T \\ C_{pt} & C_{pp} \end{pmatrix}$; pharmacological similarity matrix: $P = \begin{pmatrix} P_{tt} & ? \\ ? & ? \end{pmatrix}$, where $t$ denotes drugs with side-effect information and $p$ denotes drugs with no side-effect information. A variant of a regression model is used to predict the missing parts: $k_y(y, y') = f(k_x(x, x')) + \varepsilon = u(x)^T u(x') + \varepsilon$, where $u(x) = (u^{(1)}(x), \ldots, u^{(m)}(x))^T$ are the underlying features.
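
One way to sketch this step numerically: regress each known pharmacological similarity on the corresponding chemical similarity and use the fitted scalar function to fill the unknown blocks of $P$. This elementwise polynomial regression is a deliberate simplification of the feature-based model $u(x)^T u(x')$ on the slide, and the function name, polynomial degree, and toy matrices are assumptions.

```python
import numpy as np

def fill_pharmacological_similarity(C, P_tt, n_t, degree=2):
    """Fill the unknown blocks of the pharmacological similarity matrix P.

    C: (n, n) chemical similarity matrix over all drugs, the first n_t of which
    have known side-effect profiles; P_tt: (n_t, n_t) known pharmacological
    similarities. A scalar polynomial regression stands in for f(k_x(x, x')).
    """
    C = np.asarray(C, dtype=float)
    iu = np.triu_indices(n_t, k=1)
    coeffs = np.polyfit(C[:n_t, :n_t][iu], P_tt[iu], degree)  # fit f on known pairs
    f = np.poly1d(coeffs)

    P = f(C)                           # predict every entry from chemical similarity
    P[:n_t, :n_t] = P_tt               # keep the known (t, t) block as-is
    np.fill_diagonal(P, 1.0)           # self-similarity set to 1
    return (P + P.T) / 2               # symmetrize

# Hypothetical toy: 3 drugs with side-effect data (t) and 1 without (p).
C = np.array([[1.0, 0.8, 0.3, 0.6],
              [0.8, 1.0, 0.2, 0.7],
              [0.3, 0.2, 1.0, 0.4],
              [0.6, 0.7, 0.4, 1.0]])
P_tt = np.array([[1.0, 0.9, 0.1],
                 [0.9, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
P = fill_pharmacological_similarity(C, P_tt, n_t=3)
```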

Step 2: bipartite graph inference to predict unknown interactions. A predictive model is learned based on pharmacological similarity for drugs and genomic sequence similarity for target proteins. Prediction score for a given pair of compound $y'$ and protein $z'$: $g(y', z') = \sum_{q=1}^{d} f_y^{(q)}(y') f_z^{(q)}(z')$.

Comprehensive prediction of unknown drug-target interactions
- Test drugs: all compounds in KEGG LIGAND and all drugs in KEGG DRUG
- Test target proteins: all human proteins in KEGG GENES
- All gold standard interaction data are used for training
Results: with the chemogenomic approach, 140 of the top 1,000 predictions were confirmed in the literature; with the pharmacogenomic approach, 223 of the top 1,000 predictions were confirmed in the literature.

Conclusion
- Drug-target interactions are more correlated with pharmacological similarity than with chemical structure similarity.
- The proposed method can predict unknown drug-target interactions on a large scale: it does not require 3D structure information of the target proteins, and it uses chemical, genomic, and pharmacological data in an integrated framework.

Machine learning in chemoinformatics Lodhi, H. and Yamanishi, Y., Chemoinformatics and Advanced Machine Learning Perspectives: Complex Computational Methods and Collaborative Techniques, IGI Global, 2010.

Acknowledgements
- Curie Institute, Inserm U900, Mines ParisTech: Jean-Philippe Vert, Véronique Stoven, Kevin Bleakley, Edouard Pauwels
- Kyoto University: Minoru Kanehisa, Susumu Goto, Michihiro Araki, Masaaki Kotera