Kernel-based Machine Learning for Virtual Screening

1 Kernel-based Machine Learning for Virtual Screening
Dipl.-Inf. Matthias Rupp
Beilstein Endowed Chair for Chemoinformatics, Johann Wolfgang Goethe-University, Frankfurt am Main, Germany
Helmholtz Center, Munich

2 Outline
Virtual screening: setting, definition, aspects
Representation: descriptors, graphs, shape, densities
Methods: Gaussian process regression, novelty detection
Application: virtual screening for PPARγ agonists

3 Virtual screening: Drug development
Disease → Target → Screening → Optimization → Preclinical → Clinical Phases I, II, III → Market authorization → Clinical Phase IV

4 Virtual screening: Drug development
Pipeline: Disease → Target → Screening → Optimization → Preclinical → Clinical Phases I, II, III → Market authorization → Clinical Phase IV
Screening: systematic testing of compounds for activity
  Biochemical assay
  High-throughput screening
  Virtual screening: receptor-based versus ligand-based
[Figure: COX-2, Celecoxib]

5 Virtual screening: Ligand-based approach
Input:
  Known ligands (training samples)
  Compound library (test samples)
Output:
  Molecules with best predicted activity
Particularities:
  Small training sets (10^1 to 10^3)
  Large test sets (10^5 to 10^6)
  False positives worse than false negatives
  Only top predictions are of interest
  Available binding activity information varies
Key questions:
  How to represent (and compare) molecules?
  How to learn from the training data?
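Because only the top predictions matter, the output step amounts to ranking the library by predicted activity and keeping a short list. The following is a minimal sketch under that reading; the compound identifiers, the scores, and the function name `top_predictions` are invented for illustration and are not taken from the talk.

```python
import heapq

def top_predictions(scores, k):
    """Return the k (compound_id, score) pairs with the highest predicted activity."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])

# Hypothetical predicted activities for a tiny screening library.
predicted = {"cpd_a": 5.1, "cpd_b": 7.3, "cpd_c": 6.8, "cpd_d": 4.2}
shortlist = top_predictions(predicted, k=2)
```

`heapq.nlargest` avoids sorting the whole library, which is convenient when the test set holds 10^5 to 10^6 compounds and only a few dozen are inspected.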

6 Representation: Descriptors
Computable properties in vector form
Most frequently used representation
Comparison by metric, inner product or similarity coefficient
Example: 1-pentyl acetate
  Bonds in longest chain: 7
  Rotatable bonds: 4
  Negative partial charge surface fraction: 0.13
  Hydrogen bond acceptors: 1
  ...
Figure courtesy Dr. Michael Schmuker
M. Rupp, G. Schneider, P. Schneider: Distance phenomena in high-dimensional chemical descriptor spaces: consequences for similarity-based approaches, in preparation.
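The three ways of comparing descriptor vectors can be illustrated with a small sketch. The Euclidean distance stands in for "metric" and the Tanimoto coefficient for "similarity coefficient"; both are common choices but are assumptions here, since the slide does not name specific ones. The second molecule's descriptor values are invented.

```python
import math

def euclidean_distance(a, b):
    """Metric: straight-line distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Standard inner product of two descriptor vectors."""
    return sum(x * y for x, y in zip(a, b))

def tanimoto(a, b):
    """Tanimoto coefficient for real-valued vectors: <a,b> / (<a,a> + <b,b> - <a,b>)."""
    ab = inner_product(a, b)
    return ab / (inner_product(a, a) + inner_product(b, b) - ab)

# Descriptor values listed on the slide for 1-pentyl acetate: bonds in longest
# chain, rotatable bonds, negative partial charge surface fraction, H-bond acceptors.
pentyl_acetate = [7, 4, 0.13, 1]
# Hypothetical second molecule; the values are made up for illustration.
other = [5, 3, 0.20, 2]

distance = euclidean_distance(pentyl_acetate, other)
similarity = tanimoto(pentyl_acetate, other)
```

In practice descriptors live on very different scales, so some form of standardization usually precedes such comparisons; the sketch skips that step.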

7 Representation: Descriptors
Alternatives: structured data representations
  Graph models (structure graph)
  Surface models (molecular shape)
  Density models (spatial distribution)
  ...

8 Representation: ISOAK
Iterative similarity optimal assignment graph kernel
Iterative graph similarity
  $|V| \times |V'|$ matrix $X$ of pairwise vertex similarities
  Two vertices are similar if their neighbours are similar
  Recursive definition; iterative computation
$$X_{i,j} = (1-\alpha)\, k_v(v_i, v'_j) + \alpha \max_{\pi} \frac{1}{|v'_j|} \sum_{v \in n(v_i)} X_{v,\pi(v)}\, k_e\bigl(\{v_i, v\}, \{v'_j, \pi(v)\}\bigr)$$
Optimal assignment
  Find an assignment $\rho : V \to V'$ such that $\sum_{i=1}^{|V|} X_{i,\rho(i)}$ is maximal
M. Rupp, E. Proschak, G. Schneider: Kernel Approach to Molecular Similarity Based on Iterative Graph Similarity, Journal of Chemical Information and Modeling 47(6).
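The optimal assignment step can be sketched in isolation: given a matrix X of pairwise vertex similarities, find the injective assignment of rows to columns with the largest total. The sketch below brute-forces all assignments, which is only feasible for tiny matrices; realistic sizes call for a polynomial-time method such as the Hungarian algorithm. The matrix values are invented, and the final division by the square root of the product of the two vertex counts mirrors the arithmetic on the ISOAK example slide rather than a formula stated here.

```python
from itertools import permutations
import math

def optimal_assignment(X):
    """Brute-force the row-to-column assignment with maximal total similarity.

    X is a list of rows with len(X) <= len(X[0]); each row gets a distinct column.
    """
    n_rows, n_cols = len(X), len(X[0])
    best_score, best_rho = -math.inf, None
    for rho in permutations(range(n_cols), n_rows):
        score = sum(X[i][rho[i]] for i in range(n_rows))
        if score > best_score:
            best_score, best_rho = score, rho
    return best_score, best_rho

# Hypothetical similarity matrix for a 2-vertex and a 3-vertex graph.
X = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.4],
]
score, rho = optimal_assignment(X)
# Assumed normalization by sqrt(|V| * |V'|), as in the glycine/serine example.
normalized = score / math.sqrt(len(X) * len(X[0]))
```

The iterative update of X itself is not covered by this sketch.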

9 Representation: ISOAK example
ISOAK with $\alpha = 1/2$, Dirac vertex kernel using element types and Dirac edge kernel using bond types.
Overall similarity is $4.64/\sqrt{5 \cdot 7} \approx 0.78$.
[Figure: matrix $X_{ij}$ of pairwise atom similarities between glycine and serine]

11 Methods: Kernel-based machine learning
Linear algorithms and the kernel trick
1. Transformation into higher-dimensional space
   $x \mapsto (x, \sin(x))$: not linearly separable before the map, linearly separable after it
2. Implicit computation of inner products
3. Rewrite linear algorithms using only inner products
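The first step can be illustrated with the map from the slide. In the sketch below the sample points and their labels are invented: a point is labelled +1 where sin(x) is positive and -1 otherwise, so the classes alternate along the real line and no single threshold on x separates them. After the map, a horizontal line in the plane does.

```python
import math

def feature_map(x):
    """Map a scalar x to the 2-D point (x, sin(x))."""
    return (x, math.sin(x))

# Hypothetical 1-D samples, labelled +1 where sin(x) > 0 and -1 otherwise.
xs = [0.5, 2.0, 4.0, 5.5, 7.0, 8.5, 10.0, 11.5]
labels = [1 if math.sin(x) > 0 else -1 for x in xs]

# Along the line the labels flip several times, so one threshold cannot separate them.
label_flips = sum(1 for a, b in zip(labels, labels[1:]) if a != b)

# In feature space the linear rule w . p + b > 0 with w = (0, 1), b = 0 separates the classes.
w, b = (0.0, 1.0), 0.0
predictions = [
    1 if w[0] * p[0] + w[1] * p[1] + b > 0 else -1
    for p in map(feature_map, xs)
]
```

The labels were constructed to fit this particular map, which is the point of the illustration; real data rarely comes with a known separating transformation.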

12 Methods: Kernel-based machine learning
Linear algorithms and the kernel trick
1. Transformation into higher-dimensional space
2. Implicit computation of inner products
   Kernel $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, $k(x, x') = \langle \Phi(x), \Phi(x') \rangle$
   Example: quadratic kernel, $\Phi : \mathbb{R}^n \to \mathbb{R}^{n^2}$, $x \mapsto (x_i x_j)_{i,j=1}^{n}$
   $$k(x, x') = \langle \Phi(x), \Phi(x') \rangle = \sum_{i,j=1}^{n} x_i x_j x'_i x'_j = \sum_{i=1}^{n} x_i x'_i \sum_{j=1}^{n} x_j x'_j = \langle x, x' \rangle^2$$
3. Rewrite linear algorithms using only inner products
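The quadratic kernel identity can be checked numerically. The sketch below computes the inner product once through the explicit n^2-dimensional feature map and once directly in the input space; the two input vectors are arbitrary illustrative values.

```python
def phi(x):
    """Explicit quadratic feature map: all products x_i * x_j (n^2 entries)."""
    return [xi * xj for xi in x for xj in x]

def dot(a, b):
    """Standard inner product."""
    return sum(p * q for p, q in zip(a, b))

def quadratic_kernel(x, y):
    """The same inner product computed implicitly in input space: <x, y>^2."""
    return dot(x, y) ** 2

x, y = [1.0, 2.0], [3.0, 4.0]
explicit = dot(phi(x), phi(y))    # works in R^(n^2)
implicit = quadratic_kernel(x, y)  # works in R^n
```

For n input dimensions the explicit route costs on the order of n^2 multiplications and the implicit one on the order of n, which is the practical benefit of the kernel trick.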

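13 Methods: Kernel-based machine learning
Linear algorithms and the kernel trick
1. Transformation into higher-dimensional space
2. Implicit computation of inner products
3. Rewrite linear algorithms using only inner products
   Example: centering in feature space $\mathcal{H}$
   $$\begin{aligned}
   \tilde{k}(x, x') &= \Bigl\langle \Phi(x) - \frac{1}{n}\sum_{i=1}^{n}\Phi(x_i),\; \Phi(x') - \frac{1}{n}\sum_{i=1}^{n}\Phi(x_i) \Bigr\rangle \\
   &= \langle \Phi(x), \Phi(x') \rangle - \frac{1}{n}\sum_{i=1}^{n} \langle \Phi(x_i), \Phi(x') \rangle - \frac{1}{n}\sum_{i=1}^{n} \langle \Phi(x), \Phi(x_i) \rangle + \frac{1}{n^2}\sum_{i,j=1}^{n} \langle \Phi(x_i), \Phi(x_j) \rangle \\
   &= k(x, x') - \frac{1}{n}\sum_{i=1}^{n} k(x_i, x') - \frac{1}{n}\sum_{i=1}^{n} k(x, x_i) + \frac{1}{n^2}\sum_{i,j=1}^{n} k(x_i, x_j)
   \end{aligned}$$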
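Restricted to the training points, the centering formula becomes an operation on the kernel matrix alone. The sketch below applies it to a linear kernel, where the result can be checked against explicitly centered data; the three sample points and the function names are illustrative assumptions.

```python
def center_kernel_matrix(K):
    """Center a square kernel matrix using only its entries.

    K_c[a][b] = K[a][b] - mean_i K[i][b] - mean_i K[a][i] + mean_ij K[i][j]
    """
    n = len(K)
    col_means = [sum(K[i][b] for i in range(n)) / n for b in range(n)]
    row_means = [sum(K[a]) / n for a in range(n)]
    total_mean = sum(row_means) / n
    return [
        [K[a][b] - col_means[b] - row_means[a] + total_mean for b in range(n)]
        for a in range(n)
    ]

def linear_kernel_matrix(points):
    """Matrix of pairwise standard inner products."""
    return [[sum(p * q for p, q in zip(x, y)) for y in points] for x in points]

points = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
K_centered = center_kernel_matrix(linear_kernel_matrix(points))

# Reference: center the data explicitly, then take inner products.
mean = [sum(col) / len(points) for col in zip(*points)]
shifted = [[v - m for v, m in zip(p, mean)] for p in points]
K_reference = linear_kernel_matrix(shifted)
```

With a non-linear kernel the explicit reference is not available, since the feature vectors are never formed, but the matrix operation is unchanged.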

14 Methods: Gaussian process regression
Gaussian process as data model
Generalization of multivariate normal distribution to functions
Determined by mean and covariance
Kernel matrix as covariance matrix
Conditioning of prior on training data yields posterior distribution
Variance as confidence estimates for predictions
[Figure: two panels plotting target against input]
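Conditioning the prior on training data has a closed form for a zero-mean Gaussian process: the posterior mean at a query input is k_*^T (K + s^2 I)^{-1} y and the posterior variance is k(x_*, x_*) - k_*^T (K + s^2 I)^{-1} k_*. The following is a minimal sketch of those two formulas, assuming a squared-exponential kernel on scalar inputs, a small noise term, and toy sine data; none of these choices comes from the talk.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel for scalar inputs (an assumed covariance)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_predict(train_x, train_y, query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at one query input."""
    n = len(train_x)
    K = [[rbf(train_x[i], train_x[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, query) for x in train_x]
    alpha = solve(K, train_y)   # (K + noise * I)^-1 y
    v = solve(K, k_star)        # (K + noise * I)^-1 k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    variance = rbf(query, query) - sum(k * w for k, w in zip(k_star, v))
    return mean, variance

# Toy data: four samples of sin(x).
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [math.sin(x) for x in train_x]
mean_near, var_near = gp_predict(train_x, train_y, 1.0)   # at a training input
mean_far, var_far = gp_predict(train_x, train_y, 10.0)    # far from all training inputs
```

The variance is small at a training input and returns to the prior variance far from the data, which is what makes it usable as a confidence estimate. A production implementation would use a Cholesky factorization rather than plain elimination.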

19 Methods: Principal component analysis novelty detection
Orthogonal directions of maximum variance
Dimensionality reduction
Descriptive statistic
Non-linear variants recover underlying Riemannian manifolds
Novelty detection via projection error
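Novelty detection via projection error can be sketched with ordinary linear PCA: fit the leading principal direction to the training data, then score a new point by its squared distance from that direction. This is a simplified stand-in, since the slide points to non-linear variants; the training data, the single retained component, and the use of power iteration are all assumptions made to keep the example short.

```python
import math

def mean_vector(data):
    """Component-wise mean of a list of vectors."""
    return [sum(col) / len(data) for col in zip(*data)]

def covariance(data, mean):
    """Covariance matrix of the data around the given mean."""
    d, n = len(mean), len(data)
    return [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in data) / n
             for j in range(d)] for i in range(d)]

def leading_direction(C, iterations=200):
    """Power iteration for the unit eigenvector of C with the largest eigenvalue."""
    w = [1.0] * len(C)
    for _ in range(iterations):
        w = [sum(C[i][j] * w[j] for j in range(len(C))) for i in range(len(C))]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return w

def projection_error(x, mean, w):
    """Squared distance between x and its projection onto the line mean + t * w."""
    centered = [a - m for a, m in zip(x, mean)]
    along = sum(c * v for c, v in zip(centered, w))
    return sum(c * c for c in centered) - along ** 2

# Hypothetical training data lying close to the line y = 2x.
train = [[0.0, 0.1], [1.0, 1.9], [2.0, 4.1], [3.0, 5.9], [4.0, 8.1]]
mu = mean_vector(train)
w = leading_direction(covariance(train, mu))

familiar = projection_error([2.5, 5.0], mu, w)  # near the training line
novel = projection_error([4.0, 0.0], mu, w)     # far from the training line
```

A point is flagged as novel when its projection error exceeds a threshold; how to choose that threshold is left open here.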

20 Application: Material and methods
Target: PPARγ (peroxisome proliferator-activated receptor γ)
Dataset: 144 published ligands with pKi values
Screening library: Asinex Gold and Platinum ( cpds.)
Representation:
  Vectorial (CATS2D, MOE 2D, Ghose-Crippen fragments)
  ISOAK molecular graph kernel
Method:
  Gaussian process regression
  Multiple kernel learning
  Leave-one-cluster-out cross-validation
  Fraction of actives (FA20) as success measure
T. Schroeter, M. Rupp, K. Hansen, E. Proschak, K.-R. Müller, G. Schneider: Virtual screening for PPARγ ligands using ISOAK molecular graph kernel and Gaussian processes, 4th German Conference on Chemoinformatics, 2008.
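The two evaluation ingredients can be sketched as follows. Leave-one-cluster-out cross-validation holds out one whole cluster of compounds per fold, so a model is never tested on close analogues of its training compounds. FA20 is read here as the fraction of actives among the 20 best-ranked compounds; that reading is an assumption, as the slide does not define it. The cluster labels, scores, and activity flags are invented.

```python
def leave_one_cluster_out(cluster_ids):
    """Yield (held_out_cluster, train_indices, test_indices), one split per cluster."""
    for held_out in sorted(set(cluster_ids)):
        train = [i for i, c in enumerate(cluster_ids) if c != held_out]
        test = [i for i, c in enumerate(cluster_ids) if c == held_out]
        yield held_out, train, test

def fraction_of_actives(scores, is_active, k=20):
    """Fraction of active compounds among the k best-scored ones (assumed reading of FA20)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(1 for i in ranked if is_active[i]) / len(ranked)

# Hypothetical cluster membership for six compounds.
clusters = [0, 0, 1, 2, 2, 2]
splits = list(leave_one_cluster_out(clusters))

# Hypothetical predicted scores and measured activity flags.
scores = [0.9, 0.1, 0.8, 0.4, 0.7, 0.2]
actives = [True, False, False, True, True, False]
fa3 = fraction_of_actives(scores, actives, k=3)
```

How the clusters themselves are formed (for example by scaffold) is outside this sketch.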

21 Application: Results
Top 30 of three best performing models
16 cherry-picked compounds with novel scaffolds
  PPARγ selective activator (EC50 ± 0.3 µM), natural product related
  3 dual PPARα/γ activators (µM range, two 10 µM)
  4 selective PPARα activators (µM range, one 10 µM)
8 out of 16 compounds are active
4 out of 16 compounds with EC50 10 µM
Results preliminary since testing is still ongoing
M. Rupp, T. Schroeter, R. Steri, E. Proschak, K. Hansen, O. Rau, M. Schubert-Zsilavecz, K.-R. Müller, G. Schneider, in preparation.

22 Summary
Virtual screening as a machine learning problem
Importance of molecular representation
Virtual screening using only positive samples

23 Acknowledgements
Prof. Dr. Gisbert Schneider and modlab team (molecular design laboratory)
Prof. Dr. Klaus-Robert Müller, Timon Schroeter, Katja Hansen (TU Berlin and Fraunhofer FIRST)
Prof. Dr. Manfred Schubert-Zsilavecz, Ramona Steri (University of Frankfurt)
Beilstein-Institute for the advancement of chemical sciences
FIRST (Frankfurt international research graduate school on translational biomedicine)
Thank you for your attention
