Chemical Space. Space, Diversity, and Synthesis. Jeremy Henle, 4/23/2013

Outline: Computational Modeling. Chemical Space as a Diversity Construct. Quantifying Diversity. Diversity Oriented Synthesis.

Computational Modeling Structural/Transition States. Statistical Modeling. Wolf and Denmark, JACS, 2013, 135, 4743. Sigman, Science, 2011, 333, 1875.

Drug Discovery/Catalyst Design Traditionally driven by intuition, experience. Empirical Screening/HTS: investigate large numbers of compounds for activity. Limitation: Are the compounds in large libraries actually different? In silico screening has often not lived up to its promise. ~10^62 estimated small organic molecules. Larger than the number of atoms in the Earth! (~10^50 atoms) Combinatorial libraries often contain ~10^6 different molecules. Goal: Maximize information gained with minimal number of compounds. How is the minimum number of compounds determined? Shoichet, Nature. 2004, 432, 862. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138

Typical Compound Library Skeletal Diversity Reymond, J. Chem. Inf. Model. 2012, 52, 2864

Consider a Catalyst Library 51-77% of molecule identical across the library Is this a diverse subset of catalysts? Xiang, Org. Proc. Res. Dev. 2010, 14, 692

Diversity What is diversity? Contain different # of atoms Aromatic and Nonaromatic Different polarities Cyclic and Acyclic No nitrogen atoms! No stereocenters! Need a method of quantifying chemical diversity

Describing Chemicals Quantitative Computational Representation of Compounds: Fragments and Fingerprints (Chemical Graph Theory); Descriptor Based. Define a chemical space that compounds of interest inhabit. For a review of chemical graph theory see Chem. Rev. 2008, 108, 1127. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138

Molecular Descriptors Abstract Representation of Molecules. 2D: based on connectivity of atoms. 3D: based on spatial arrangement of atoms. Physicochemical (electronic): whole molecule properties. For a review of chemical graph theory see Chem. Rev. 2008, 108, 1127. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138

2D Descriptors Dependent upon the atom connectivity Reduced computational demand Can be calculated quickly Relevance Can illustrate differences between compounds, but do these descriptors have the information needed for asymmetric catalysis? Rothenberg, Int. J. Mol. Sci. 2006, 7, 375. Calculated with MOE 2011

3D Descriptors Based on the spatial arrangements of atoms Rothenberg, Int. J. Mol. Sci. 2006, 7, 375. Calculated with MOE 2011 Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138

CoMFA Comparative Molecular Field Analysis Calculate steric and electrostatic interaction energies as molecular fields Can be compared between similar, aligned structures Rothenberg, Int. J. Mol. Sci. 2006, 7, 375.

Custom Descriptors Anything That Represents a Chemical Property 8000 gridpoints, 2 energies, 16000 descriptors per catalyst!
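The descriptor count on this slide (8000 grid points x 2 energies = 16000 values per catalyst) can be illustrated with a rough sketch of a CoMFA-style field calculation. Everything specific here is an assumption made for illustration and is not taken from the talk: the 20 x 20 x 20 lattice (chosen only because 20^3 = 8000), the 1 Å spacing, the toy 12-6 steric term, the +1 probe charge, and the 30 kcal/mol cap. The function names are hypothetical.

```python
import itertools
import math


def grid_points(n_per_axis=20, spacing=1.0, origin=(-9.5, -9.5, -9.5)):
    """Return the coordinates of a regular cubic lattice (n_per_axis**3 points)."""
    axis = range(n_per_axis)
    return [
        (origin[0] + i * spacing, origin[1] + j * spacing, origin[2] + k * spacing)
        for i, j, k in itertools.product(axis, repeat=3)
    ]


def field_descriptors(atoms, points, cap=30.0, min_dist=1e-3):
    """Return steric energies followed by electrostatic energies, one pair per grid point.

    atoms: list of (x, y, z, partial_charge) tuples.
    The energy terms are toy expressions with illustrative parameters, not a force field.
    """
    steric, electrostatic = [], []
    for p in points:
        e_steric = 0.0
        e_elec = 0.0
        for x, y, z, charge in atoms:
            r = max(math.dist(p, (x, y, z)), min_dist)
            sr6 = (3.0 / r) ** 6                      # assumed sigma = 3.0 Å
            e_steric += 4.0 * 0.1 * (sr6 * sr6 - sr6)  # assumed epsilon = 0.1
            e_elec += 332.0 * charge / r               # Coulomb term, probe charge +1
        # Cap both fields so points close to nuclei do not dominate.
        steric.append(max(-cap, min(cap, e_steric)))
        electrostatic.append(max(-cap, min(cap, e_elec)))
    return steric + electrostatic


# Hypothetical two-atom "molecule" with opposite partial charges.
toy_atoms = [(0.0, 0.0, 0.0, -0.4), (1.2, 0.0, 0.0, 0.4)]
descriptors = field_descriptors(toy_atoms, grid_points())
print(len(descriptors))
```

Because each value is tied to a fixed grid position, vectors built this way are only comparable between structures that share the same alignment and grid, which is the alignment requirement noted on the CoMFA slide.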

Chemical Space Multidimensional Descriptor Space Projection from 42 dimensional chemical space of PubChem Space spanned by all possible molecules and chemical compounds. Stoichiometric combinations of electrons and atomic nuclei in all possible topology isomers Not practical to look at EVERY descriptor for every application Method to graphically illustrate quantified chemical differences Beware of bias! Rothenberg, Int. J. Mol. Sci. 2006, 7, 375. Reymond, J. Chem. Inf. Model. 2013, 53, 509.

Chemical Space Practical Definition A certain set of properties denotes an abstract chemical space. Molecular properties can be shared by multiple species, meaning multiple compounds can share locations in a defined chemical space. Can be any number of dimensions! Rothenberg, Int. J. Mol. Sci. 2006, 7, 375. Hopkins and Lipinski, Nature. 2004, 432, 855.

Chemical Reactions Moving Through Chemical Space Reactions allow movement across chemical space Hopkins and Lipinski, Nature. 2004, 432, 855.

Chemical Reactions Moving Through Chemical Space Calculated with MOE 2011

Chemical Space As a Diversity Construct Usually desirable to examine compounds that encompass a wide region of chemical space. Examine compounds in a chemical space that illustrates diversity within the compound subset. Diversity is a measure of spread within chemical space. Diversity is a relative measure of a subset to the whole. Hopkins and Lipinski, Nature. 2004, 432, 855. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138

Chemical Space Relevant Chemical Space Highly problem dependent Must answer the question: Is this a meaningful measure?

Examples of Chemical Space Biological Activity Qualitative demonstration of the activity of many drug-like compounds Hopkins and Lipinski, Nature. 2004, 432, 855.

Examples of Chemical Space 42 Dimensional Projections 42 Molecular Quantum Number (MQN) Descriptor Space. Axes: Size vs Rigidity. Color: Blue (lower value) to magenta (higher value). Axes: Rigidity vs Polarity. Color: Blue (lower value) to magenta (higher value). 166 billion compounds! http://130.92.134.166:8080/pcbrowser2/ Reymond, J. Chem. Inf. Model. 2013, 53, 509

Examples of Chemical Space Topological Shape Encoded utilizing principal moments of inertia: (X, Y) = (I_small/I_large, I_medium/I_large). Of particular interest to drug screening. Many drugs are rod-like: flat aromatic compounds, no stereocenters. Hopkins and Lipinski, Nature. 2004, 432, 855. Schreiber, Org. Lett. 2010, 12, 2822.

Examples of Chemical Space Topological Shape Calculated with MOE

Dimension Reduction Any number of descriptors can be used to define chemical space. Hard to manipulate data in n > 3 dimensions. At least, visualization is harder. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138

Dimension Reduction Principal Component Analysis Take multivariate statistical information, determine orthogonal eigenvectors that maximize variance of original data. Coefficients indicate what each component represents. 5-dimensional data reduced to 2-dimensional data. PC1: steric energies and X/Y coord. PC2: electrostatic energies and Z coord.
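The procedure described on this slide can be sketched as follows: center the descriptor columns, form the covariance matrix, and extract its leading eigenvectors. This is a minimal standard-library sketch, not the software used in the talk. Power iteration with deflation is swapped in here as a simple eigenvector method (it converges slowly when two eigenvalues are close), and the function names and the four-row toy dataset are assumptions.

```python
import math


def center(rows):
    """Subtract the column mean from every descriptor column."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def principal_components(cov, k, iterations=1000, tol=1e-12):
    """Return the k leading (eigenvalue, eigenvector) pairs via power iteration with deflation."""
    d = len(cov)
    work = [row[:] for row in cov]
    result = []
    for _ in range(k):
        v = [1.0 / (j + 1) for j in range(d)]  # deterministic starting vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < tol:  # no variance left to explain
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * work[i][j] * v[j] for i in range(d) for j in range(d))
        result.append((eigenvalue, v))
        for i in range(d):  # deflate: remove the component just found
            for j in range(d):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return result


def project(centered, components):
    """Coordinates of each row along the chosen components (the PCA scores)."""
    return [[sum(a * b for a, b in zip(row, v)) for _, v in components]
            for row in centered]


# Toy data: two perfectly correlated descriptors, so one component carries all the variance.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
centered = center(data)
cov = covariance(centered)
components = principal_components(cov, k=1)
total_variance = sum(cov[i][i] for i in range(len(cov)))
print("loadings:", components[0][1])
print("explained variance fraction:", components[0][0] / total_variance)
```

The loadings (eigenvector coefficients) are what the slide refers to when it says the coefficients indicate what each component represents, and the eigenvalue divided by the total variance is the fraction of the original variance a component accounts for.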

Quantification of Diversity Diverse Subsets: Distance Metrics Four requirements: distance from an object to itself is zero; distance values must be symmetric; distance values must obey the triangle inequality; distances between non-identical objects must be greater than zero. Most commonly used: Euclidean Distance, $d_{ij} = \sqrt{\sum_{k=1}^{K} (x_{ik} - x_{jk})^2}$, and Tanimoto Coefficient, $T = \mathrm{AND}(x, y) / \mathrm{OR}(x, y)$. Once distance defined: Minimum Intermolecular Dissimilarity (Shortest Overall Distance); Average Nearest Neighbor Distance. Hopkins and Lipinski, Nature. 2004, 432, 855. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
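The two distance measures and the two summary statistics named on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions: fingerprints are represented as Python sets of "on" bit positions, the Tanimoto coefficient (a similarity) is turned into a dissimilarity as 1 - T, and two empty fingerprints are treated as identical. The function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(x, y):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def tanimoto(bits_x, bits_y):
    """Tanimoto coefficient: shared on-bits (AND) divided by total on-bits (OR)."""
    union = len(bits_x | bits_y)
    return len(bits_x & bits_y) / union if union else 1.0  # assumption for empty sets


def tanimoto_distance(bits_x, bits_y):
    """Dissimilarity derived from the Tanimoto similarity."""
    return 1.0 - tanimoto(bits_x, bits_y)


def minimum_intermolecular_dissimilarity(items, dist):
    """Shortest distance between any pair in the set."""
    return min(dist(a, b) for a, b in combinations(items, 2))


def average_nearest_neighbor_distance(items, dist):
    """Mean, over all items, of the distance to the closest other item."""
    total = 0.0
    for i, a in enumerate(items):
        total += min(dist(a, b) for j, b in enumerate(items) if j != i)
    return total / len(items)


# Toy descriptor vectors and toy fingerprints (hypothetical data).
vectors = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
fingerprints = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
print(minimum_intermolecular_dissimilarity(vectors, euclidean))
print(average_nearest_neighbor_distance(vectors, euclidean))
print(average_nearest_neighbor_distance(fingerprints, tanimoto_distance))
```

Both statistics accept any distance function, so the same subset can be scored in descriptor space and in fingerprint space; larger values indicate a more spread-out subset.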

Quantification of Diversity Other Metrics Cell Based (# compounds per cell). Variance Based: statistical analysis of descriptors; algorithm to maximize statistical variance. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
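A cell-based measure can be sketched by cutting each descriptor axis into equal-width bins and counting the compounds that fall in each resulting cell. The equal-width binning, the default bin count, and the function names below are assumptions for illustration; the talk does not specify a partitioning scheme.

```python
from collections import Counter


def cell_index(point, mins, widths, bins):
    """Map a descriptor vector to the tuple of bin indices of the cell containing it."""
    index = []
    for value, low, width in zip(point, mins, widths):
        i = int((value - low) / width) if width > 0 else 0
        index.append(min(i, bins - 1))  # the maximum value belongs to the last bin
    return tuple(index)


def cell_occupancy(points, bins=4):
    """Count compounds per cell on an equal-width grid spanning the data range."""
    dims = len(points[0])
    mins = [min(p[j] for p in points) for j in range(dims)]
    maxs = [max(p[j] for p in points) for j in range(dims)]
    widths = [(hi - lo) / bins for lo, hi in zip(mins, maxs)]
    return Counter(cell_index(p, mins, widths, bins) for p in points)


def occupied_fraction(counts, bins, dims):
    """Fraction of all cells that contain at least one compound."""
    return len(counts) / bins ** dims


# Toy 2D descriptor values (hypothetical data).
library = [(0.0, 0.0), (0.1, 0.1), (1.0, 1.0), (0.9, 0.2)]
counts = cell_occupancy(library, bins=2)
print(dict(counts))
print(occupied_fraction(counts, bins=2, dims=2))
```

The number of cells grows as bins^dims, so this kind of count is usually only practical in a low-dimensional space, for example after the dimension reduction discussed earlier.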

Perils of Chemical Space Must be careful of bias. Statistics can be used to manipulate numbers to show trends that may not be reliable. Descriptors that indicate diversity may not be reliable in any subsequent QSAR/QSSR modeling. Cannot necessarily depend on original descriptor set in following calculations. Different descriptors have different value ranges. These ranges can affect bias. Scaling is important. How much is different? Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
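The point about value ranges can be made concrete with a small sketch. Autoscaling (subtract the column mean, divide by the column standard deviation) is used here as one common choice of scaling; the talk does not say which scaling was applied, and the descriptor values and function names are hypothetical.

```python
import math
import statistics


def autoscale(rows):
    """Scale every descriptor column to zero mean and unit sample standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    stdevs = [statistics.stdev(c) for c in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stdevs)]
        for row in rows
    ]


def euclidean(x, y):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical descriptors: column 0 spans hundreds of units, column 1 spans tenths.
raw = [(100.0, 0.1), (200.0, 0.3), (300.0, 0.2)]
scaled = autoscale(raw)

# Before scaling, the distance is almost entirely the wide-range column.
print(euclidean(raw[0], raw[1]))
# After scaling, both columns contribute on a comparable footing.
print(euclidean(scaled[0], scaled[1]))
```

Without scaling, any distance-based diversity measure in this toy set would effectively rank compounds by the first descriptor alone, which is one way an unintended bias enters the analysis.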

Example Meaningless Diversity Xiang, Org. Proc. Res. Dev. 2010, 14, 692 Calculated with MOE 2011 Geom: PM3/MMFF

Chemical Space and Catalysis Sigman, Science, 2011, 333, 1875

Chemical Space and Catalysis Compound for the peak may not exist Sigman, Science, 2011, 333, 1875

Chemical Space and Diversity Summary Large number of ways to represent compounds in the computer Choice of descriptors not trivial Diversity oriented descriptors may not be those used for further statistical analysis Structural diversity vs. functional diversity Easy to bias the results Decorrelate data Manipulate data

Diversity Oriented Synthesis Moving beyond traditional chemical libraries Part of chemical genetics Probing gene products (proteins) with small molecules Spring, Org. Biomol. Chem., 2008, 6, 1149

Target-Oriented Synthesis Retrosynthetic analysis important Spring, Org. Biomol. Chem., 2008, 6, 1149

Combinatorial Libraries Useful for expanding singular hits Spring, Org. Biomol. Chem., 2008, 6, 1149

Traditional Combinatorial Library Very little skeletal diversity

Diversity Oriented Synthesis Expanding into New Chemical Space Forward synthetic analysis Requires efficient chemistry across diverse substrates Spring, Org. Biomol. Chem., 2008, 6, 1149

Diversity Oriented Synthesis Strategies Spring, Org. Biomol. Chem., 2008, 6, 1149

Diversity Oriented Synthesis Skeletal Diversity Simple starting materials to complex scaffolds a: C6H6, Rh2(O2CCF3)4, 70% b: RCCH, Rh2(OAc)4 (BuCCH, 57%) c: RNH2, NaOH, then MeOH, H2SO4 (MeNH2, 35%) d: dienophile (dimethyl acetylenedicarboxylate, 59%) e: cyclopentadiene, 92% Spring, Chem. Commun., 2006, 3296. Spring, Org. Biomol. Chem., 2008, 6, 1149

Multiple Group Strategy for DOS endo (10:1) Pauson-Khand Reaction exo Cross-metathesis Exo enyne Schreiber, Org. Lett. 2010, 12, 2822.

Demonstrating 3D Shape Diversity Schreiber, Org. Lett. 2010, 12, 2822.

DOS of Macrocycles Common Reagent Approach Tan, D.S., Nat. Chem. Biol. 2012, 8, 358

Macrocycle Diversity Analysis Tan, D.S., Nat. Chem. Biol. 2012, 8, 358

Oxidative Ring Expansion Extension PhI(OAc)2; Cu(BF4)2 or Tf2O/TsOH. Yields 70-90%. Tan, D.S., Nat. Chem. Biol. 2013, 9, 21

DOS Macrocycle Products Evaluating Diversity Original 20-dimensional dataset reduced to 3 principal components that account for 74% of original variance. Tan, D.S., Nat. Chem. Biol. 2013, 9, 21

Complexity to Diversity DOS from Privileged Scaffolds Hergenrother, Nat. Chem. 2013, 5, 195

New Chemical Space? Hergenrother, Nat. Chem. 2013, 5, 195

Tanimoto Analysis Hergenrother, Nat. Chem. 2013, 5, 195

Diversity Oriented Catalyst Design 3.3% of possible CPP catalysts actually synthesized

Catalyst Skeletal Diversity? Future Directions Many catalyst screens only change small portions. Ligands may have more variability. Compare multiple scaffolds to one another in chemical space. Right now, this is not really done. How does one align different, skeletally diverse compounds?

Conclusions Chemical space allows for the visualization and quantification of chemical diversity. User-defined space may not encompass needed information. Unintentional bias can influence interpretations. Chemical diversity is still an abstract concept, without very much normalization or standardization. Many methods exist for these types of investigations.

Inspiration from Nature Schreiber, J. Comb. Chem. 2007, 9, 1028.

Skeletal DOS Reagent Based DOS Schreiber, J. Comb. Chem. 2007, 9, 1028.

Cyclohexadienone Synthesis Gram scale prep, yields generally >90%

Ring Expansion Mechanism Tan, D.S., Nat. Chem. Biol. 2013, 9, 21

Pauson-Khand Rxn