Virtual screening in drug discovery

Similar documents
Structural biology and drug design: An overview

Computational chemical biology to address non-traditional drug targets. John Karanicolas

Introduction. OntoChem

Dr. Sander B. Nabuurs. Computational Drug Discovery group Center for Molecular and Biomolecular Informatics Radboud University Medical Centre

Chemical library design

Docking. GBCB 5874: Problem Solving in GBCB

Chemogenomic: Approaches to Rational Drug Design. Jonas Skjødt Møller

Using Phase for Pharmacophore Modelling. 5th European Life Science Bootcamp March, 2017

Biologically Relevant Molecular Comparisons. Mark Mackey

Hit Finding and Optimization Using BLAZE & FORGE

Structure-Activity Modeling - QSAR. Uwe Koch

Ignasi Belda, PhD CEO. HPC Advisory Council Spain Conference 2015

In silico pharmacology for drug discovery

MM-GBSA for Calculating Binding Affinity A rank-ordering study for the lead optimization of Fxa and COX-2 inhibitors

Similarity Search. Uwe Koch

Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME

Structure based drug design and LIE models for GPCRs

Protein Structure Prediction and Protein-Ligand Docking

Using AutoDock for Virtual Screening

De Novo molecular design with Deep Reinforcement Learning

The PhilOEsophy. There are only two fundamental molecular descriptors

Progress of Compound Library Design Using In-silico Approach for Collaborative Drug Discovery

Retrieving hits through in silico screening and expert assessment M. N. Drwal a,b and R. Griffith a

Interpretation of QSAR models

Softwares for Molecular Docking. Lokesh P. Tripathi NCBS 17 December 2007

Kd = koff/kon = [R][L]/[RL]

Virtual Screening: How Are We Doing?

A Tiered Screen Protocol for the Discovery of Structurally Diverse HIV Integrase Inhibitors

Introduction to Chemoinformatics and Drug Discovery

What is Protein-Ligand Docking?

Data Quality Issues That Can Impact Drug Discovery

Computational Chemistry in Drug Design. Xavier Fradera Barcelona, 17/4/2007

Exploring the black box: structural and functional interpretation of QSAR models.

Plan. Day 2: Exercise on MHC molecules.

LigandScout. Automated Structure-Based Pharmacophore Model Generation. Gerhard Wolber* and Thierry Langer

Ligand-receptor interactions

Design and Synthesis of the Comprehensive Fragment Library

Bioengineering & Bioinformatics Summer Institute, Dept. Computational Biology, University of Pittsburgh, PGH, PA

Cheminformatics analysis and learning in a data pipelining environment

Structural Bioinformatics (C3210) Molecular Docking

Chemoinformatics and information management. Peter Willett, University of Sheffield, UK

QSAR/QSPR modeling. Quantitative Structure-Activity Relationships Quantitative Structure-Property-Relationships

In Silico Investigation of Off-Target Effects

QSAR Modeling of ErbB1 Inhibitors Using Genetic Algorithm-Based Regression

Development of Pharmacophore Model for Indeno[1,2-b]indoles as Human Protein Kinase CK2 Inhibitors and Database Mining

Protein-Ligand Docking Evaluations

Receptor Based Drug Design (1)

Molecular Dynamics Graphical Visualization 3-D QSAR Pharmacophore QSAR, COMBINE, Scoring Functions, Homology Modeling,..

Cheminformatics platform for drug discovery application

Plan. Lecture: What is Chemoinformatics and Drug Design? Description of Support Vector Machine (SVM) and its used in Chemoinformatics.

An Integrated Approach to in-silico

Quantification of free ligand conformational preferences by NMR and their relationship to the bioactive conformation

DOCKING TUTORIAL. A. The docking Workflow

Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics

bcl::cheminfo Suite Enables Machine Learning-Based Drug Discovery Using GPUs Edward W. Lowe, Jr. Nils Woetzel May 17, 2012

Targeting protein-protein interactions: A hot topic in drug discovery

High Throughput In-Silico Screening Against Flexible Protein Receptors

Bridging the Dimensions:

Author Index Volume

Medicinal Chemistry/ CHEM 458/658 Chapter 4- Computer-Aided Drug Design

Dispensing Processes Profoundly Impact Biological, Computational and Statistical Analyses

COMPARISON OF SIMILARITY METHOD TO IMPROVE RETRIEVAL PERFORMANCE FOR CHEMICAL DATA

KNIME-based scoring functions in Muse 3.0. KNIME User Group Meeting 2013 Fabian Bös

BUDE. A General Purpose Molecular Docking Program Using OpenCL. Richard B Sessions

Early Stages of Drug Discovery in the Pharmaceutical Industry

Towards Physics-based Models for ADME/Tox. Tyler Day

Principles of Drug Design

Using Bayesian Statistics to Predict Water Affinity and Behavior in Protein Binding Sites. J. Andrew Surface

György M. Keserű H2020 FRAGNET Network Hungarian Academy of Sciences

Advanced in silico Drug Design KFC/ADD

Pharmacophore Approaches In Drug Discovery

The Schrödinger KNIME extensions

TRAINING REAXYS MEDICINAL CHEMISTRY

Notes of Dr. Anil Mishra at 1

Advanced Medicinal Chemistry SLIDES B

Machine Learning Concepts in Chemoinformatics

est Drive K20 GPUs! Experience The Acceleration Run Computational Chemistry Codes on Tesla K20 GPU today

Different conformations of the drugs within the virtual library of FDA approved drugs will be generated.

Studying the effect of noise on Laplacian-modified Bayesian Analysis and Tanimoto Similarity

User Guide for LeDock

Development and application of ligand-based computational methods for de-novo drug. design and virtual screening. Alexander Richard Geanes.

Data Mining in the Chemical Industry. Overview of presentation

Drug Design 2. Oliver Kohlbacher. Winter 2009/ QSAR Part 4: Selected Chapters

BioSolveIT. A Combinatorial Approach for Handling of Protonation and Tautomer Ambiguities in Docking Experiments

Ping-Chiang Lyu. Institute of Bioinformatics and Structural Biology, Department of Life Science, National Tsing Hua University.

An Enhancement of Bayesian Inference Network for Ligand-Based Virtual Screening using Features Selection

Kinome-wide Activity Models from Diverse High-Quality Datasets

Principles of Drug Design

Structural interpretation of QSAR models a universal approach

AMRI COMPOUND LIBRARY CONSORTIUM: A NOVEL WAY TO FILL YOUR DRUG PIPELINE

Integrated Cheminformatics to Guide Drug Discovery

October 6 University Faculty of pharmacy Computer Aided Drug Design Unit

Conformational Sampling of Druglike Molecules with MOE and Catalyst: Implications for Pharmacophore Modeling and Virtual Screening

Performing a Pharmacophore Search using CSD-CrossMiner

5.1. Hardwares, Softwares and Web server used in Molecular modeling

A COMPARATIVE STUDY OF MACHINE-LEARNING-BASED SCORING FUNCTIONS IN PREDICTING PROTEIN-LIGAND BINDING AFFINITY. Hossam Mohamed Farg Ashtawy A THESIS

Drug Informatics for Chemical Genomics...

MD Simulation in Pose Refinement and Scoring Using AMBER Workflows

11. IN SILICO DOCKING ANALYSIS

Universities of Leeds, Sheffield and York

Transcription:

Virtual screening in drug discovery Pavel Polishchuk Institute of Molecular and Translational Medicine Palacky University pavlo.polishchuk@upol.cz

Drug development workflow Vistoli G., et al., Drug Discovery Today, 2008, 13, 285-294

The challenge Chemical space Biological space ~10 36 drug-like compounds (1) ~10 4-10 5 human proteins {2} Drug discovery (1) Polishchuk, P. G.; Madzhidov, T. I.; Varnek, A., J Comput Aided Mol Des 2013, 27, 675-679. (2) http://www.uniprot.otg 3

Vastness of chemical space real datasets ~ 110 M compounds ~ 75 M compounds Commercial ~ 88 M compounds Free virtually enumerated dataset GDB-17 166 B compounds = 1.66x10 11 estimated size of drug-like chemical space 10 36 compounds 4

Screening High-throughput screening up to 10 6 of compounds can be tested expensive not all targets are suitable for HTS Virtual screening up to 10 9 of compounds can be tested 5

Virtual screening methods 3D structure of target unknown known Ligand-based Structure-based actives known actives and inactives known 1 or more actives few or more actives Similarity search Pharmacophore search Machine learning (QSAR) Docking capacity: ~ 10 9 ~ 10 7 ~ 10 7 ~ 10 6 complexity increases 6

Similarity search 3D structure of target unknown known Ligand-based Structure-based actives known actives and inactives known 1 or more actives few or more actives Similarity search Pharmacophore search Machine learning (QSAR) Docking capacity: ~ 10 9 ~ 10 7 ~ 10 7 ~ 10 6 complexity increases 7

Similarity principle Similar compounds have similar properties morphine codeine Rank and select similar compounds reference compound decrease similarity 8

Are they similar? morphine codeine S-thalidomide R-thalidomide tirofibane tirofibane ethyl ester eptifibatide

Similarity search Structure representation structural keys fingerprints Similarity measure Tanimoto Dice Euclidian morphine codeine Tanimoto Dice MACCS 0.94 0.969 Atom pairs 0.994 0.997 Morgan (ECFP4-like) 0.759 0.878 Morgan (FCFP4-like) 0.782 0.878 calculated with RDKit Similarity search output depends on descriptors and similarity measure selected 10

Similarity search: example agonists of CCR5 60 000 compounds 100 FCFP4 compounds IC 50 = 17 μm purchased & tested IC 50 = 5.8 μm IC 50 = 14.1 μm Kellenberger, E., et al., Identification of nonpeptide CCR5 receptor agonists by structure-based virtual screening. Journal of Medicinal Chemistry 2007, 50, 1294 1303. 11

Similarity search: conclusion + Little information is required to start searching + Different chemotypes can be retrieved + Ultra fast screening + Scaffold hopping - Hits will share common substructures with reference structures that may reduce their patentability - Results depend on chosen descriptors and similarity measure - Chemical similarity is not always followed by biological one 12

Pharmacophore search 3D structure of target unknown known Ligand-based Structure-based actives known actives and inactives known 1 or more actives few or more actives Similarity search Pharmacophore search Machine learning (QSAR) Docking capacity: ~ 10 9 ~ 10 7 ~ 10 7 ~ 10 6 complexity increases 13

Pharmacophore definition A pharmacophore is the ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interaction with a specific biological target structure and to trigger (or block) its biological response. Annu. Rep. Med. Chem. 1998, 33, 385 395 14

Early pharmacophore hypotheses 6.7 Å 6.9 Å 2.3 Å 2.4 Å π π interaction ionic interaction π π interaction X ionic interaction A H-bond B H-bond (R)-(-) Epinephrine (Adrenalin) (S)-(+) Epinephrine 15

Atom- and pharmacophore-based alignment Methotrexate Dihydrofolate Hydrogen bonding patterns Atom-based alignment Pharmacophore alignment 16

Feature-based pharmacophore models Features: Electrostatic interactions, H-bonding, aromatic interactions, hydrophobic regions, coordination to metal ions... Pharmacophore features H-bond donor H-bond acceptor Positive ionizable Negative ionizable Hydrophobic Aromatic ring 17

Feature-based pharmacophores (LigandScout) PDB code: 2VDM H-bonds formed by the ligand Hydrophobic interaction Pharmacophore features H-bond donor H-bond acceptor Positive ionizable Negative ionizable Hydrophobic 18

Typical ligand-based pharmacophore modeling workflow Clean structures of ligands Generate conformers Assign pharmacophore features Find common pharmacophores Score pharmacophore hypotheses Validation of pharmacophore models 19

Shared consensus pharmacophore (LigandScout) 2IVV 2IVU RET Kinase Inhibitors 20

Pharmacophore examples Shared model on 83 antagonists of fibrinogen receptor Pharmacophore models obtained for clusters of compounds Precision: 0.72 Recall: 0.77 Precision: 0.77 Recall: 0.27 Precision: 0.67 Recall: 0.29 Precision: 0.90 Recall: 0.04 Polishchuk, P. G. et al., Journal of Medicinal Chemistry 2015, 58, 7681-7694. 21

Pharmacophore example Ala scan NMR Urotensin II - ETPDc[CFWKYCV] potent vasoconstrictor IC 50 = 400 nm 500 hits Flohr S. et al., J. Med. Chem., 2002, 45 (9), pp 1799 1805 22

Pharmacophores: conclusion + Universal representation of binding pattern + Qualitative output + Very fast screening + Scaffold hopping - Structure-based models can be very specific - Ligand-based models depend on conformational sampling 23

Machine learning (QSAR) 3D structure of target unknown known Ligand-based Structure-based actives known actives and inactives known 1 or more actives few or more actives Similarity search Pharmacophore search Machine learning (QSAR) Docking capacity: ~ 10 9 ~ 10 7 ~ 10 7 ~ 10 6 complexity increases 24

Modeling of compounds properties Activity = F(structure) X 1 X 2 X 3 X 4 X 5 X 6 X N 1 0 9 0 11 1 1 4 0 1 0 0 0 1 0 0 0 0 0 4 6 0 2 3 6 0 0 3 4 0 0 0 1 2 1 Activity = M(E(structure)) M mapping function E encoding function 25

QSAR modeling workflow Structure Descriptors (features) End-point values Model X 1 X 2 X 3 X 4 X 5 X 6 X N 1 0 9 0 11 1 1 4 0 1 0 0 0 1 0 0 0 0 0 4 6 0 2 3 6 0 0 3 4 0 0 0 1 2 1 Y 1.1 1.4 6.8 3.0 1.5 Encoding (represent structure with numerical features) Mapping (machine learning) 26

Overall QSAR workflow Input data Preprocessing Feature engineering Model training Model validation Interpretation x ' i x i j x z j Bioassays Databases Data normalization & curation Feature extraction Feature selection Feature combination Classification Regression Clustering Cross-validation Bootstrap Test set Applicability Domain OECD principles for the validation, for regulatory purposes, of (Q)SAR models 1) a defined endpoint 2) an unambiguous algorithm 3) a defined domain of applicability 4) appropriate measures of goodness-of fit, robustness and predictivity 5) a mechanistic interpretation, if possible 27

QSAR model building plant growth inhibition activity of phenoxyacetic acids Hansch equation 1/C = 4.08π 2.14π 2 + 2.78σ + 3.38 π = logp X logp H σ - Hammet constant Free-Wilson models Inhibition activity of compounds against Staphylococcus aureus R is H or CH 3 ; X is Br, Cl, NO 2 and Y is NO 2, NH 2, NHC(=O)CH 3 Act = 75R H 112R CH3 + 84X Cl 16X Br 26X NO2 + 123Y NH2 + 18Y NHC(=O)CH3 218Y NO2

QSAR: example Antimalarial activity 7 hits, EC 50 < 2μM EC 50 = 95 nm Zhang L. et al., J. Chem. Inf. Model., 2013, 53 (2), pp 475 492 29

QSAR: conclusion + Qualitative and quantitative output + May work for compounds having different mechanisms of action + Fast screening - Very demanding to the quality of input data - Applicability limited by the training set structures - Hard to encode stereochemistry 30

Molecular docking 3D structure of target unknown known Ligand-based Structure-based actives known actives and inactives known 1 or more actives few or more actives Similarity search Pharmacophore search Machine learning (QSAR) Docking capacity: ~ 10 9 ~ 10 7 ~ 10 7 ~ 10 6 complexity increases 31

Docking is an in silico tool which predicts Bioorganic & Medicinal Chemistry, 20 (12), 2012, 3756 3767. Pose a possible relative orientation of a ligand and a receptor as well as conformation of a ligand and a receptor when they are form complex Score the strength of binding of the ligand and the receptor.

Why docking is a hard task Complex 3D jigsaw puzzle Conformational flexibility many degrees of freedom Mutual adaptation ( induced fit ) Solvation in aqueous media Complexity of thermodynamic contribution No easy route to evaluation of ΔG Simplification and heuristic approaches are necessary At its simplest level, this is a problem of subtraction of large numbers, inaccurately calculated, to arrive at a small number. (Leach A.R., Shoichet B.K., Peishoff C.E.. J. Med. Chem. 2006, 49, 5851-5855) 33

Sampling and scoring Protein-ligand docking software consists of two main components which work together: 1. Search algorithm (sampling) - generates a large number of poses of a molecule in the binding site. 2. Scoring function - calculates a score or binding affinity for a particular pose 34

Search algorithms (sampling) Ligand Rigid Receptor Rigid Fast & Simple Flexible Rigid Flexible Flexible Slow & Complex 35

Classes of scoring function Forcefield-based Based on terms from molecular mechanics forcefields GoldScore, DOCK, AutoDock Empirical Parameterised against experimental binding affinities ChemScore, PLP, Glide SP/XP Knowledge-based potentials Based on statistical analysis of observed pairwise distributions PMF, DrugScore, ASP 36

Docking quality assessment Energy reproducibility HIV-1 protease Test set size = 800 molecules r < 0.55 R. Wang J. Chem. Inf. Comput. Sci., 2004, 44 (6), pp 2114 2125 37

Docking quality assessment Antagonists of α IIb β 3 Activity, pic 50 PLANTS S4MPLE 6.77-77 -62 6.82-85 -69 7.96-88 -74 5.85-87 -76 5.89-93 -81 Krysko, A.A., et al., Bioorganic & Medicinal Chemistry Letters, 2016, 26, 1839-1843

Validation Self-docking reproducibility of a pose Docking of a set of ligands with known affinity reproducibility of affinity 39

Molecular docking: example β 2 -adrenoreceptor ligands 972 608 ZINC compounds DOCK diversity selection 500 compounds 25 compounds K i < 4μM 6 hits binding mode comparing to carazolol (X-ray ligand) K i = 9 nm Kolb, P. et al., PNAS of the United States of America, 2009, 106, 6843 6848. 40

Molecular docking: conclusion + Relatively fast + Determine binding poses & energy + Good in ranking ligands for virtual screening - Low accuracy of binding energy estimation - Require knowledge about binding site 41

Consensus virtual screening 2D Pharmacophore QSAR (ISIDA, SIRMS) 3D Pharmacophore (LigandScout) Field-similarity (Cresset) Docking (MOE, PLANTS, FlexX) Shape-similarity (ROCS) 42

Virtual screening BioinfoDB (curated ZINC) ~ 3 10 6 compounds 2D pharmacophore model 16Å + - 13 bonds (C sp3 C sp3 ) 210 compounds 3D pharmacophore (ligand-based) QSAR models 0 compounds 0 compounds No hits commercial libraries contain just a few number of zwitterionic compounds Polishchuk, P. G. et al., Journal of Medicinal Chemistry 2015, 58, 7681-7694.

Virtual screening of focused library 3D pharmacophore models ensemble QSAR 6930 (24066 stereisomers) 2064 Docking Shape- and field-based ADME/Tox, PASS, synthetic feasibility 141 75 20 2 Polishchuk, P. G. et al., Journal of Medicinal Chemistry 2015, 58, 7681-7694.

Biological activity of synthesized compounds Affinity for αiibβ3 Anti-aggregation 9.66 8.21 9.02 7.60 7.21 6.49 7.10 6.17 Tirofiban Polishchuk, P. G. et al., Journal of Medicinal Chemistry 2015, 58, 7681-7694. 8.62 7.48

Examples of CADD contribution to drug development Aliskiren Target Indication Role of CADD Renin Hypertension Structure-based optimization of the size of the hydrophobic group filling the large S1-S3 cavity. The methoxyalkoxy sidechain of the inhibitor is essential for strong binding Nilotinib Abl-kinase Chronic myeloid leukemia Replacement of imatinib N-methyl piperazine by a methylindole that optimally fits a subpocket of the kinase inactive structure Rilpivirine HIV reverse transcriptase AIDS Adaptation to the allosteric nonnucleoside binding site and targeting of the conserved Trp299 Rognan D., Pharmacology & Therapeutics 2017, 174, 47 66 46

PhD topic Computationally guided de novo design of compounds with desired properties Development of a platform for multi-objective optimization of compound properties (ADME, activity, selectivity, etc) based on computational models (QSAR, pharmacophore, docking, etc) Input structure(s) to be optimized QSAR model(s), docking Estimation of atoms/fragments contributions based on QSAR models, docking Transformation of structural motifs with negative influence on the target property Pharmacophore model(s), etc Property prediction Decision module: Is compound satisfy optimal criteria (profile)? Yes Selection, synthesis and testing No

Thank you for your attention 48