Chemical library design

Chemical library design Pavel Polishchuk Institute of Molecular and Translational Medicine Palacky University pavlo.polishchuk@upol.cz

Drug development workflow Vistoli G., et al., Drug Discovery Today, 2008, 13, 285-294

Leeson P.D. and Springthorpe B., Nature Reviews Drug Discovery, 2007, 6, 881-890

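Applicability of approaches
Unknown ligand structures, unknown protein structure: screening (brute force)
Unknown ligand structures, known protein structure: de novo design
Known ligand structures, unknown protein structure: ligand-based methods (similarity searching, pharmacophores, QSAR)
Known ligand structures, known protein structure: structure-based methods (molecular docking, pharmacophores)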

Size of explored chemical space
Real datasets (commercial and free): ~110 M compounds; ~75 M compounds; ~88 M compounds
Virtually enumerated dataset: GDB-17, 166 B compounds = 1.66 × 10^11

Estimated size of chemical space
Each entry: number of compounds; size limit; composition; other limitations; method; reference
6.2 × 10^13; 40 atoms*; C, H; acyclic alkanes without stereoisomers; exhaustive enumeration; H.R. Henze, C.M. Blair, 1931 [4]
1.3 × 10^15; 38 atoms*; C, H; acyclic stereoisomeric alkanes; exhaustive enumeration; C.M. Blair, H.R. Henze, 1932 [5]
10^21; < 7 Å; 40 functional groups; neurological drugs; combinatorial enumeration; D.F. Weaver, C.A. Weaver, 2011 [8]
10^23; 36 atoms; C, N, O, S, P, Se, Si, Hal; scaffold with 2 or 3 attachment points; combinatorial estimation; P. Ertl, 2002 [7]
10^26; 50 atoms; C, N, O, S, Cl; -; combinatorial enumeration; K. Ogata et al., 2007 [24]
10^33; 750 Da; C, N, O, F; heptanes and hexanes including stereoisomers; combinatorial enumeration
10^36; 36 atoms, 500 Da; C, N, O, S, Hal; stable compounds including stereoisomers; learning of exhaustively enumerated structures from GDB-17
10^60; 30 atoms; C, N, O, S; -; combinatorial enumeration
10^390; 300 amino acid residues; natural amino acids; proteins; possible number of combinations of amino acids
References for the last four entries: D. Weininger, 2002 [23]; This work*; R.S. Bohacek et al., 1996 [6]; C.M. Dobson, 2004 [28]
*Polishchuk, P. G.; Madzhidov, T. I.; Varnek, A. J Comput Aided Mol Des 2013, 27, 675-679
For comparison: 10^82 atoms in the Universe (http://www.universetoday.com/36302/atoms-in-the-universe/)

How to select compounds for screening? Human experts?

Human experts Lajiness M.S. et al., J. Med. Chem., 2004, 47, 4891-4896

How to select compounds for screening? Human experts? Physico-chemical filters?

Physico-chemical filters (rule of 5)
Lipinski C.A. et al. analyzed 2245 small molecules that had reached phase II clinical trials and set each threshold so that about 90% of the compounds fall within it.
Properties relevant for absorption and permeation:
Molecular weight (MW) ≤ 500
Lipophilicity (CLogP) ≤ 5
H-bond donors ≤ 5
H-bond acceptors ≤ 10
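A minimal sketch of how the four thresholds above could be applied to precomputed descriptors. The `Descriptors` class, the function names, the `max_violations` parameter and the two example compounds are hypothetical and only serve the illustration; in practice the descriptor values would come from a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one compound (hypothetical container)."""
    mw: float     # molecular weight, Da
    clogp: float  # calculated logP
    hbd: int      # number of H-bond donors
    hba: int      # number of H-bond acceptors


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-5 thresholds are exceeded."""
    return sum([d.mw > 500, d.clogp > 5, d.hbd > 5, d.hba > 10])


def passes_rule_of_five(d: Descriptors, max_violations: int = 0) -> bool:
    """max_violations is an assumption: strict filtering allows none."""
    return rule_of_five_violations(d) <= max_violations


# Illustrative (made-up) descriptor values
small = Descriptors(mw=350.0, clogp=2.5, hbd=2, hba=5)
large = Descriptors(mw=720.0, clogp=6.1, hbd=3, hba=12)
print(passes_rule_of_five(small), passes_rule_of_five(large))
```

Setting `max_violations=1` gives a more permissive filter; which setting is appropriate depends on the screening campaign.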

Physico-chemical filters for drug-likeness
Lipinski, 1997: MW ≤ 500; MlogP ≤ 4.15; HBD ≤ 5; HBA ≤ 10
Oprea, 2000: MW 200-450; MlogP -2.0 to 4.5; HBD ≤ 5; HBA 1-8; RTB 1-9; nrings ≤ 5
Walters, 2002 (1998): MW 200-500; MlogP -5.0 to 5.0; HBD ≤ 5; HBA ≤ 10; RTB ≤ 8; formal charge -2 to +2
Veber, 2002: RTB ≤ 10; TPSA ≤ 140 Å^2
Rishton, 2003: MW ≤ 500; MlogP ≤ 5; HBD ≤ 5; HBA ≤ 10; RTB ≤ 10; TPSA ≤ 140 Å^2
C.A. Lipinski et al., Advanced Drug Delivery Reviews, 1997, 23, 3-25
T.I. Oprea et al., J. Comp.-Aided Mol. Design, 2000, 14, 251-264
W.P. Walters and M.A. Murcko, Adv. Drug Deliv. Rev., 2002, 54, 255-271
D.F. Veber et al., J. Med. Chem., 2002, 45, 2615-2623
G.M. Rishton, Drug Discov. Today, 2003, 8, 86-96

Lead-likeness filters
Property changes from lead to drug: ΔMW +69; ΔCMR +1.8; ΔnRings +1; ΔRTB +2; ΔHBD 0; ΔHBA +1; ΔCLogP +0.43; ΔLogD7.4 +0.97
Lead-like compounds: MW ≤ 350; CLogP ≤ 3
Oprea T.I. et al., J. Chem. Inf. Comput. Sci., 2001, 41, 1308-1315
Teague S.J. et al., Angew. Chem. Int. Ed., 1999, 38, 3743-3748

How to select compounds for screening? Human experts? Physico-chemical filters? Structural filters?

Structural filters
Remove potentially toxic and mutagenic compounds
Remove metabolically labile compounds
Remove false positives:
- compounds interfering with signal detection (e.g. dyes in fluorescence assays)
- reactive groups
- aggregates
- non-specific binders (sticky compounds)

Structural filters: AstraZeneca filters
reactive structures: Michael acceptors (C=C-C=O, C=C-CN, C=C-NO2); anhydride; alpha-haloketone; peroxide
frequent hitters: more than two nitro groups; dihydroxybenzene
dye-like structures: two nitro groups on the same aromatic ring
unlikely drug candidates: large rings (> C9); crown ethers; conjugated alkenes (C=CC=CC=C)
'ugly' halogens: 2- or 3-valent halogens; triflates (SO2CX3)
'ugly' oxygen: 5 or more OH groups; formic acid esters
'ugly' nitrogen: hydrazines (not in ring); oxime; carbodiimide; 3 or more guanidines
'ugly' sulfur: 5 or more S atoms; disulfide; thiocyanate; thiol
Cumming J.G. et al., Nature Reviews Drug Discovery, 2013, 948-962

Promiscuous compounds
Two inhibitors compared (first compound / second compound):
malarial protease, IC50 (μM): --- / 8
β-lactamase, IC50 (μM): 0.2 / 10
chymotrypsin, IC50 (μM): --- / 55
IC50 with incubation: no change / 22-fold
IC50 with 10x β-lactamase: no change / 40-fold
DLS concentration, μM: 100 / 10
particle diameter, nm: no particles / 394.6 ± 12.5
IC50 in presence of guanidine: --- / 6-fold
IC50 in presence of urea: --- / 4-fold
IC50 in presence of BSA: --- / > 50-fold
K3PO4, 5 mM: 0.2 / 4
K3PO4, 50 mM: 0.2 / 10
K3PO4, 500 mM: 0.3 / 15
McGovern S.L. et al., J. Med. Chem., 2002, 1712-1722

Promiscuity as a function of structure Leeson P.D. and Springthorpe B., Nature Reviews Drug Discovery, 2007, 6, 881-890

Promiscuity as a function of structure Hopkins A.L. et al., Current Opinion in Structural Biology, 2006, 16, 127-136

Pan-assay interference compounds (PAINS) Baell J. and Walters M., Nature, 2014, 513, 481-483

Pan-assay interference compounds (PAINS) Baell J.B. and Holloway G.A., J. Med. Chem., 2010, 53, 2719-2740

Pan-assay interference compounds (PAINS): figure panels "not passed" and "passed" Baell J.B. and Holloway G.A., J. Med. Chem., 2010, 53, 2719-2740

Promiscuous compounds: conclusion
Promiscuous compounds tend to:
- have higher lipophilicity
- have low complexity
- be reactive

Dark matter of HTS libraries Macarron R., Nature Chemical Biology, 2015, 11, 904-905

Dark matter of HTS libraries
Dark chemical matter: compounds that were inactive in 100 or more assays
Novartis: 234 assays; 803 990 compounds overall; 112 872 (14.0%) dark matter; activity threshold: z-score of 2 or more
PubChem: 429 assays; 363 598 compounds overall; 131 726 (36.2%) dark matter; activity threshold: standard (IC50 < 10 μM)
Quality control of Novartis dataset
Wassermann A.M. et al., Nature Chemical Biology, 2015, 11, 958-966
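The definition above can be sketched as a simple flag over per-compound assay outcomes. This is only an illustration: the function name `is_dark_matter`, the input layout (one boolean per assay in which the compound was tested, `True` meaning active) and the reading "tested in at least 100 assays and never active" are assumptions, not the published protocol.

```python
def is_dark_matter(outcomes, min_assays=100):
    """Return True if a compound was tested in at least `min_assays`
    assays and was active in none of them.

    outcomes: list of booleans, one per assay (True = active).
    """
    return len(outcomes) >= min_assays and not any(outcomes)


def dark_fraction_percent(n_dark, n_total):
    """Share of dark matter in a collection, in percent."""
    return 100.0 * n_dark / n_total


# Fractions recomputed from the counts quoted on the slide
print(round(dark_fraction_percent(112872, 803990), 1))  # Novartis
print(round(dark_fraction_percent(131726, 363598), 1))  # PubChem
```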

Dark matter of HTS libraries Compounds that are centers of clusters with actives (green) and dark matter (black) Wassermann A.M. et al., Nature Chemical Biology, 2015, 11, 958-966

Dark matter of HTS libraries Figure panels: active vs. dark compounds Wassermann A.M. et al., Nature Chemical Biology, 2015, 11, 958-966

Dark matter of HTS libraries Structural rules to discriminate dark chemical matter Wassermann A.M. et al., Nature Chemical Biology, 2015, 11, 958-966

Dark matter of HTS libraries Compounds were tested in 34 additional assays: hit rates and selectivity Wassermann A.M. et al., Nature Chemical Biology, 2015, 11, 958-966

Dark matter of HTS libraries: conclusion
Dark matter compounds are less potent but more selective.
A compound that was inactive in 100 assays can still be active in the next one; there is no correlation.
Dark chemical matter is a valuable source of potent and selective compounds, which should be tested at higher concentrations.

How to select compounds for screening? Human experts? Physico-chemical filters? Structural filters? Chemoinformatics?

Similarity and diversity
Similar property principle: structurally similar compounds tend to exhibit similar properties.
A subset of size N can be selected from a dataset of size M in $\frac{M!}{N!\,(M-N)!}$ ways.
This means that there are ~10^13 ways to select 10 compounds from 100.
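The count quoted above can be checked directly. A short sketch using only the Python standard library; the helper name `n_subsets` is hypothetical.

```python
import math


def n_subsets(m: int, n: int) -> int:
    """Number of ways to choose n compounds from m: M! / (N! (M - N)!)."""
    return math.factorial(m) // (math.factorial(n) * math.factorial(m - n))


ways = n_subsets(100, 10)
print(ways)           # exact integer
print(f"{ways:.2e}")  # order of magnitude, about 1.7e13
```

Exhaustively scoring every possible subset is therefore impractical even for small libraries, which is why heuristic selection algorithms are used.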

Diverse libraries
Similarity/dissimilarity measures:
- Dissimilarity = 1 - Similarity
- Euclidean distance
- Tanimoto
Diversity measures:
- Sum of pairwise dissimilarities/distances
Similarity is a property of a pair of compounds.
Diversity is a property of a library of compounds.
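A minimal sketch of these measures, assuming each compound is represented as a set of "on" fingerprint bits. The representation, the function names and the toy fingerprints are assumptions made for the illustration.

```python
from itertools import combinations


def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def dissimilarity(a: set, b: set) -> float:
    """Dissimilarity = 1 - Similarity (a property of a pair)."""
    return 1.0 - tanimoto(a, b)


def diversity(library: list) -> float:
    """Sum of pairwise dissimilarities (a property of the whole library)."""
    return sum(dissimilarity(a, b) for a, b in combinations(library, 2))


# Toy fingerprints: two identical compounds and one unrelated compound
library = [{1, 2}, {1, 2}, {3, 4}]
print(tanimoto({1, 2, 3}, {2, 3, 4}))
print(diversity(library))
```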

Diverse libraries: basic algorithm
1. Select a compound and place it in the subset (randomly, least similar, etc.)
2. Calculate the dissimilarity between each remaining compound and the compounds in the subset
3. Choose as the next compound the one that is most dissimilar to the compounds in the subset
4. If fewer than N compounds have been selected, return to step 2
Properties:
- fast
- can be used with high-dimensional data
- tends to select outliers
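The steps above can be sketched as follows. "Most dissimilar to the compounds in the subset" is interpreted here as the largest minimum Tanimoto distance to any already selected compound (the MaxMin variant); that choice, the set-of-bits fingerprints and all names are assumptions of this sketch.

```python
def tanimoto_distance(a: set, b: set) -> float:
    """1 - Tanimoto similarity for fingerprints given as sets of bits."""
    union = len(a | b)
    return 1.0 - (len(a & b) / union if union else 1.0)


def maxmin_select(fps: list, n: int, first: int = 0) -> list:
    """Pick up to n indices from fps, starting with index `first`."""
    selected = [first]
    # Step 2: distance from every compound to its nearest selected compound
    min_dist = [tanimoto_distance(fp, fps[first]) for fp in fps]
    while len(selected) < min(n, len(fps)):
        candidates = [i for i in range(len(fps)) if i not in selected]
        # Step 3: take the candidate farthest from the current subset
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        # Update nearest-selected distances with the new member
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fps[nxt]))
    return selected


fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}, {7, 8, 10}]
print(maxmin_select(fps, 3))
```

Keeping a running minimum distance avoids recomputing all pairwise distances at every step, which is what makes this family of algorithms fast.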

Diverse libraries: clustering
Group compounds so that:
- compounds within a cluster are similar
- compounds from different clusters are dissimilar
Properties:
- good for high-dimensional data
- reveals natural clustering
- not suitable for big datasets

Diverse libraries: sphere exclusion
1. Define a dissimilarity threshold T
2. Select a compound from the dataset and place it in the subset
3. Remove from the dataset all compounds whose dissimilarity to the selected compound is < T
4. If compounds are left in the dataset, return to step 2
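A minimal sketch of sphere exclusion, assuming Tanimoto distance on set-of-bits fingerprints and taking the first remaining compound as the next center. Both choices, and all names, are assumptions; random or otherwise ordered selection of centers is equally possible.

```python
def tanimoto_distance(a: set, b: set) -> float:
    """1 - Tanimoto similarity for fingerprints given as sets of bits."""
    union = len(a | b)
    return 1.0 - (len(a & b) / union if union else 1.0)


def sphere_exclusion(fps: list, threshold: float) -> list:
    """Return indices of selected compounds (sphere centers)."""
    remaining = list(range(len(fps)))
    selected = []
    while remaining:
        center = remaining[0]  # step 2: pick a compound
        selected.append(center)
        # Step 3: drop everything inside the sphere of radius `threshold`
        remaining = [
            i for i in remaining[1:]
            if tanimoto_distance(fps[i], fps[center]) >= threshold
        ]
    return selected


fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {1, 2, 3, 4}, {7, 8, 10}]
print(sphere_exclusion(fps, 0.6))
print(sphere_exclusion(fps, 0.4))
```

Unlike the basic algorithm, the subset size is not fixed in advance: a larger T gives fewer, more widely spaced compounds.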

Diverse libraries: cell-based
1. Split the whole space into cells
2. Select one (or more) compounds from each cell
Properties:
- fast
- works only in low-dimensional space: with 100 dimensions (descriptors) and 2 splits in each there will be 2^100 ≈ 10^30 cells
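A minimal sketch of cell-based selection in a low-dimensional descriptor space, assuming equal-width bins per dimension and keeping the first compound found in each occupied cell. The binning scheme, the function names and the toy points are assumptions.

```python
def cell_index(point, mins, maxs, bins):
    """Map a descriptor vector to the tuple of bin numbers of its cell."""
    idx = []
    for x, lo, hi in zip(point, mins, maxs):
        width = (hi - lo) / bins
        k = int((x - lo) / width) if width > 0 else 0
        idx.append(min(k, bins - 1))  # keep the maximum value in the last bin
    return tuple(idx)


def cell_based_select(points, bins):
    """Return the index of one compound per occupied cell."""
    dims = range(len(points[0]))
    mins = [min(p[d] for p in points) for d in dims]
    maxs = [max(p[d] for p in points) for d in dims]
    cells = {}
    for i, p in enumerate(points):
        cells.setdefault(cell_index(p, mins, maxs, bins), i)
    return sorted(cells.values())


# Toy 2D descriptor space with 2 bins per dimension (4 cells)
points = [(0.0, 0.0), (0.1, 0.2), (0.9, 0.1), (1.0, 1.0), (0.8, 0.9)]
print(cell_based_select(points, 2))

# Number of cells for 100 descriptors with 2 bins each
print(f"{2 ** 100:.2e}")
```

The last line shows why the approach breaks down in high dimensions: almost all cells would be empty.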

Library filters
Physico-chemical filters: H-bond acceptors, H-bond donors, lipophilicity, MW
Structural filters: metabolically labile, reactive, toxic, PAINS
Diversity filters: diversity

PhD topic
Computationally guided de novo design of compounds with desired properties
Development of a platform for multi-objective optimization of compound properties (ADME, activity, selectivity, etc.) based on computational models (QSAR, pharmacophore, docking, etc.)
Workflow elements:
- Input: structure(s) to be optimized
- Estimation of atom/fragment contributions based on QSAR model(s), docking
- Transformation of structural motifs with a negative influence on the target property
- Property prediction: QSAR model(s), docking, pharmacophore model(s), etc.
- Decision module: does the compound satisfy the optimal criteria (profile)? Yes / No
- If yes: selection, synthesis and testing

Thank you for your attention

Promiscuous compounds / frequent hitters