Chemical Space: Space, Diversity, and Synthesis. Jeremy Henle, 4/23/2013
Outline: Computational Modeling; Chemical Space as a Diversity Construct; Quantifying Diversity; Diversity Oriented Synthesis
Wolf and Denmark, JACS, 2013, 135, 4743. Sigman, Science, 2011, 333, 1875 Computational Modeling Structural/Transition States Statistical Modeling
Drug Discovery/Catalyst Design Traditionally driven by intuition and experience. Empirical Screening/HTS: investigate large numbers of compounds for activity. Limitation: Are the compounds in large libraries actually different? In silico screening has often not lived up to its promise. ~10^62 estimated small organic molecules: larger than the number of atoms in the Earth (~10^50 atoms)! Combinatorial libraries often contain ~10^6 different molecules. Goal: maximize information gained with a minimal number of compounds. How is the minimum number of compounds determined? Shoichet, Nature 2004, 432, 862. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
Typical Compound Library Skeletal Diversity Reymond, J. Chem. Inf. Model. 2012, 52, 2864
Consider a Catalyst Library 51-77% of molecule identical across the library Is this a diverse subset of catalysts? Xiang, Org. Proc. Res. Dev. 2010, 14, 692
Diversity What is diversity? Contain different # of atoms Aromatic and Nonaromatic Different polarities Cyclic and Acyclic No nitrogen atoms! No stereocenters! Need a method of quantifying chemical diversity
Describing Chemicals Quantitative Computational Representation of Compounds: Fragments and Fingerprints (Chemical Graph Theory); Descriptor Based. For a review of chemical graph theory, see Chem. Rev. 2008, 108, 1127. Define a chemical space that the compounds of interest inhabit. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
Molecular Descriptors Abstract Representation of Molecules. 2D: based on connectivity of atoms. 3D: based on spatial arrangement of atoms. Physicochemical (electronic): whole-molecule properties. For a review of chemical graph theory, see Chem. Rev. 2008, 108, 1127. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
2D Descriptors Dependent upon the atom connectivity Reduced computational demand Can be calculated quickly Relevance Can illustrate differences between compounds, but do these descriptors have the information needed for asymmetric catalysis? Rothenberg, Int. J. Mol. Sci. 2006, 7, 375. Calculated with MOE 2011
3D Descriptors Based on the spatial arrangements of atoms Rothenberg, Int. J. Mol. Sci. 2006, 7, 375. Calculated with MOE 2011 Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
CoMFA Comparative Molecular Field Analysis Calculate steric and electrostatic interaction energies as molecular fields Can be compared between similar, aligned structures Rothenberg, Int. J. Mol. Sci. 2006, 7, 375.
Custom Descriptors Anything That Represents a Chemical Property 8000 gridpoints, 2 energies, 16000 descriptors per catalyst!
Chemical Space Multidimensional Descriptor Space Projection from 42 dimensional chemical space of PubChem Space spanned by all possible molecules and chemical compounds. Stoichiometric combinations of electrons and atomic nuclei in all possible topology isomers Not practical to look at EVERY descriptor for every application Method to graphically illustrate quantified chemical differences Beware of bias! Rothenberg, Int. J. Mol. Sci. 2006, 7, 375. Reymond, J. Chem. Inf. Model. 2013, 53, 509.
Chemical Space Practical Definition A certain set of properties denotes an abstract chemical space. Molecular properties can be shared by multiple species, meaning multiple compounds can share locations in a defined chemical space. Can be any number of dimensions! Rothenberg, Int. J. Mol. Sci. 2006, 7, 375. Hopkins and Lipinski, Nature. 2004, 432, 855.
Chemical Reactions Moving Through Chemical Space Reactions allow movement across chemical space Hopkins and Lipinski, Nature. 2004, 432, 855.
Chemical Reactions Moving Through Chemical Space Calculated with MOE 2011
Chemical Space As a Diversity Construct Usually desirable to examine compounds that encompass a wide region of chemical space. Examine compounds in a chemical space that illustrates diversity within the compound subset. Diversity is a measure of spread within chemical space. Diversity is a relative measure of a subset to the whole. Hopkins and Lipinski, Nature. 2004, 432, 855. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
Chemical Space Relevant Chemical Space Highly problem dependent Must answer the question: Is this a meaningful measure?
Examples of Chemical Space Biological Activity Qualitative demonstration of the activity of many drug-like compounds Hopkins and Lipinski, Nature. 2004, 432, 855.
Examples of Chemical Space 42-Dimensional Projections 42 Molecular Quantum Number (MQN) Descriptor Space. Axes: Size vs Rigidity; Color: blue (lower value) to magenta (higher value). Axes: Rigidity vs Polarity; Color: blue (lower value) to magenta (higher value). 166 billion compounds! http://130.92.134.166:8080/pcbrowser2/ Reymond, J. Chem. Inf. Model. 2013, 53, 509
Examples of Chemical Space Topological Shape Encoded utilizing principal moments of inertia: $(X, Y) = (I_{\text{small}}/I_{\text{large}},\ I_{\text{medium}}/I_{\text{large}})$. Of particular interest to drug screening. Many drugs are rod-like: flat aromatic compounds, no stereocenters. Hopkins and Lipinski, Nature. 2004, 432, 855. Schreiber, Org. Lett. 2010, 12, 2822.
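The normalized moment-of-inertia coordinates on this slide can be sketched in a few lines. This is a minimal illustration, not the procedure used in the cited work: the function name `npr`, the use of NumPy, and the three idealized point sets (with unit masses) are assumptions made for the example.

```python
import numpy as np

def npr(coords, masses=None):
    """Return (I_small / I_large, I_medium / I_large) for a set of atoms."""
    coords = np.asarray(coords, dtype=float)
    # Unit masses are an assumption when none are supplied.
    m = np.ones(len(coords)) if masses is None else np.asarray(masses, dtype=float)
    # Shift the origin to the center of mass.
    r = coords - np.average(coords, axis=0, weights=m)
    r2 = (r ** 2).sum(axis=1)
    # Inertia tensor: sum_i m_i (|r_i|^2 * identity - r_i r_i^T)
    inertia = (m * r2).sum() * np.eye(3) - (m[:, None] * r).T @ r
    # eigvalsh returns eigenvalues in ascending order.
    i_small, i_medium, i_large = np.linalg.eigvalsh(inertia)
    return i_small / i_large, i_medium / i_large

# Hypothetical idealized shapes, not real molecules.
rod = [(-1, 0, 0), (0, 0, 0), (1, 0, 0)]
disc = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
sphere = disc + [(0, 0, 1), (0, 0, -1)]

for name, shape in (("rod", rod), ("disc", disc), ("sphere", sphere)):
    print(name, npr(shape))
```

The three idealized shapes land near the corners of the familiar triangular plot: rod near (0, 1), disc near (0.5, 0.5), and sphere near (1, 1).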
Examples of Chemical Space Topological Shape Calculated with MOE
Dimension Reduction Any number of descriptors can be used to define chemical space. Hard to manipulate data in n > 3 dimensions. At least, visualization is harder. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
Dimension Reduction Principal Component Analysis Take multivariate statistical information, determine orthogonal eigenvectors that maximize the variance of the original data. Coefficients indicate what each component represents. Example: 5-dimensional data reduced to 2-dimensional data. PC1: steric energies and X/Y coordinates. PC2: electrostatic energies and Z coordinate.
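A minimal sketch of principal component analysis via the eigenvectors of the covariance matrix, assuming NumPy is available. The function name `pca` and the random 5-descriptor dataset are hypothetical stand-ins for a real descriptor table; in practice descriptors on different scales are usually autoscaled first.

```python
import numpy as np

def pca(X, n_components=2):
    """Return (scores, loadings, explained_variance_ratio) for the rows of X."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each descriptor column
    cov = np.cov(Xc, rowvar=False)             # descriptor covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    loadings = eigvecs[:, order]               # coefficients of each component
    scores = Xc @ loadings                     # coordinates in the reduced space
    ratio = eigvals[order] / eigvals.sum()     # fraction of variance retained
    return scores, loadings, ratio

# Hypothetical data: 50 compounds described by 5 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
scores, loadings, ratio = pca(X, n_components=2)
print(scores.shape, ratio)
```

The columns of `loadings` are the coefficients that indicate what each component represents, and `ratio` reports how much of the original variance each retained component accounts for.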
Quantification of Diversity Diverse Subsets: Distance Metrics Four requirements: (1) the distance from an object to itself is zero; (2) distance values must be symmetric; (3) distance values must obey the triangle inequality; (4) distances between non-identical objects must be greater than zero. Most commonly used: Euclidean distance, $d_{ij} = \sqrt{\sum_{k=1}^{K} (x_{ik} - x_{jk})^2}$, and Tanimoto coefficient, $T = \mathrm{AND}(x, y) / \mathrm{OR}(x, y)$. Once distance is defined: Minimum Intermolecular Dissimilarity (Shortest Overall Distance); Average Nearest Neighbor Distance. Hopkins and Lipinski, Nature. 2004, 432, 855. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
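The two distance measures and the two diversity scores named on this slide can be sketched with the standard library alone. The function names, the set-of-bits fingerprint representation, and the three-point toy library are assumptions for illustration. Note that the Tanimoto coefficient is a similarity; 1 - T is the corresponding dissimilarity.

```python
import math
from itertools import combinations

def euclidean(x, y):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def tanimoto(x, y):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    union = len(x | y)
    return len(x & y) / union if union else 1.0

def min_pairwise_distance(points, dist=euclidean):
    """Minimum intermolecular dissimilarity: the shortest pairwise distance."""
    return min(dist(a, b) for a, b in combinations(points, 2))

def mean_nearest_neighbor_distance(points, dist=euclidean):
    """Average, over all compounds, of the distance to the closest other compound."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)

# Hypothetical 2D descriptor vectors for a three-compound library.
library = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
print(min_pairwise_distance(library))
print(mean_nearest_neighbor_distance(library))
print(tanimoto({1, 2, 3}, {2, 3, 4}))
```

Both diversity scores accept any distance function through the `dist` argument, so a fingerprint-based dissimilarity such as `lambda a, b: 1 - tanimoto(a, b)` can be substituted for the Euclidean distance.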
Quantification of Diversity Other Metrics Cell Based (# compounds per cell). Variance Based: statistical analysis of descriptors; algorithm to maximize statistical variance. Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
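A cell-based measure can be sketched by laying a uniform grid over the descriptor space and counting compounds per cell. The function names, the grid bounds, and the four sample points are hypothetical, and the sketch assumes every point lies within the stated bounds.

```python
from collections import Counter

def cell_occupancy(points, lower, upper, bins):
    """Count compounds per cell of a uniform grid spanning [lower, upper] on each axis."""
    counts = Counter()
    for p in points:
        cell = tuple(
            # Clamp the upper edge so a value equal to `hi` falls in the last bin.
            min(int((v - lo) / (hi - lo) * bins), bins - 1)
            for v, lo, hi in zip(p, lower, upper)
        )
        counts[cell] += 1
    return counts

def coverage(counts, bins, n_dims):
    """Fraction of all grid cells that contain at least one compound."""
    return len(counts) / bins ** n_dims

# Hypothetical compounds in a 2D descriptor space scaled to [0, 1].
points = [(0.1, 0.1), (0.2, 0.15), (0.9, 0.9), (0.6, 0.1)]
counts = cell_occupancy(points, lower=(0.0, 0.0), upper=(1.0, 1.0), bins=2)
print(dict(counts), coverage(counts, bins=2, n_dims=2))
```

A library that fills many cells with few compounds each covers the space more evenly than one that crowds most compounds into a handful of cells. The number of cells grows as `bins ** n_dims`, which is one reason cell-based measures are usually applied to low-dimensional spaces.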
Perils of Chemical Space Must be careful of bias. Statistics can be used to manipulate numbers to show trends that may not be reliable. Descriptors that indicate diversity may not be reliable in any subsequent QSAR/QSSR modeling. Cannot necessarily depend on the original descriptor set in following calculations. Different descriptors have different value ranges, and these ranges can introduce bias. Scaling is important. How much is different? Gibbs, A.C. and Agrafiotis, D.K., Chemical Diversity: Definition and Quantification, Exploiting Chemical Diversity for Drug Discovery 2007, 138
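The effect of descriptor ranges on distance can be illustrated with a small sketch. The helper names and the three two-descriptor rows (one column with values in the hundreds, one with values of a few units) are hypothetical. Autoscaling here means a column-wise z-score, which is one common choice among several.

```python
import math
from statistics import mean, pstdev

def euclidean(x, y):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def autoscale(rows):
    """Column-wise z-score: subtract the mean, divide by the standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [
        # A constant column (sd == 0) carries no information; map it to 0.
        tuple((v - mu) / sd if sd else 0.0 for v, mu, sd in zip(row, mus, sds))
        for row in rows
    ]

# Hypothetical descriptors: a large-range column and a small-range column.
raw = [(150.0, 1.0), (450.0, 1.2), (160.0, 4.8)]
scaled = autoscale(raw)

# Ratio of distance(compound 0, 1) to distance(compound 0, 2).
raw_ratio = euclidean(raw[0], raw[1]) / euclidean(raw[0], raw[2])
scaled_ratio = euclidean(scaled[0], scaled[1]) / euclidean(scaled[0], scaled[2])
print(raw_ratio, scaled_ratio)
```

Before scaling, the large-range descriptor dominates and compound 2 appears far closer to compound 0 than compound 1 does. After scaling, both descriptors contribute comparably and the two distances become similar.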
Example Meaningless Diversity Xiang, Org. Proc. Res. Dev. 2010, 14, 692 Calculated with MOE 2011 Geom: PM3/MMFF
Chemical Space and Catalysis Sigman, Science, 2011, 333, 1875
Chemical Space and Catalysis Compound for the peak may not exist Sigman, Science, 2011, 333, 1875
Chemical Space and Diversity Summary Large number of ways to represent compounds in the computer Choice of descriptors not trivial Diversity oriented descriptors may not be those used for further statistical analysis Structural diversity vs. functional diversity Easy to bias the results Decorrelate data Manipulate data
Diversity Oriented Synthesis Moving beyond traditional chemical libraries Part of chemical genetics Probing gene products (proteins) with small molecules Spring, Org. Biomol. Chem., 2008, 6, 1149
Target-Oriented Synthesis Retrosynthetic analysis important Spring, Org. Biomol. Chem., 2008, 6, 1149
Combinatorial Libraries Useful for expanding singular hits Spring, Org. Biomol. Chem., 2008, 6, 1149
Traditional Combinatorial Library Very little skeletal diversity
Diversity Oriented Synthesis Expanding into New Chemical Space Forward synthetic analysis Requires efficient chemistry across diverse substrates Spring, Org. Biomol. Chem., 2008, 6, 1149
Diversity Oriented Synthesis Strategies Spring, Org. Biomol. Chem., 2008, 6, 1149
Spring, Chem. Commun., 2006, 3296 Spring, Org. Biomol. Chem., 2008, 6, 1149 Diversity Oriented Synthesis Skeletal Diversity Simple starting materials to complex scaffolds a: C6H6, Rh2(O2CCF3)4, 70% b: RCCH, Rh2(OAc)4 (BuCCH, 57%) c: RNH2, NaOH, then MeOH, H2SO4 (MeNH2, 35%) d: dienophile (dimethyl acetylenedicarboxylate, 59%) e: cyclopentadiene, 92%
Multiple Group Strategy for DOS endo (10:1) Pauson-Khand Reaction exo Cross-metathesis Exo enyne Schreiber, Org. Lett. 2010, 12, 2822.
Demonstrating 3D Shape Diversity Schreiber, Org. Lett. 2010, 12, 2822.
DOS of Macrocycles Common Reagent Approach Tan, D.S., Nat. Chem. Biol. 2012, 8, 358
Macrocycle Diversity Analysis Tan, D.S., Nat. Chem. Biol. 2012, 8, 358
Oxidative Ring Expansion Extension PhI(OAc)2 Cu(BF4)2 or Tf2O/TsOH Yields 70-90% Tan, D.S., Nat. Chem. Biol. 2013, 9, 21
DOS Macrocycle Products Evaluating Diversity Original 20-dimensional dataset reduced to 3 principal components that account for 74% of the original variance Tan, D.S., Nat. Chem. Biol. 2013, 9, 21
Complexity to Diversity DOS from Privileged Scaffolds Hergenrother, Nat. Chem. 2013, 5, 195
New Chemical Space? Hergenrother, Nat. Chem. 2013, 5, 195
Tanimoto Analysis Hergenrother, Nat. Chem. 2013, 5, 195
Diversity Oriented Catalyst Design 3.3% of possible CPP catalysts actually synthesized
Catalyst Skeletal Diversity? Future Directions Many catalyst screens only change small portions. Ligands may have more variability. Compare multiple scaffolds to one another in chemical space. Right now, this is not really done. How does one align different skeletally diverse compounds?
Conclusions Chemical space allows for the visualization and quantification of chemical diversity User defined space may not encompass needed information Unintentional bias can influence interpretations Chemical diversity is still an abstract concept, without very much normalization or standardization Many methods exist for these types of investigations
Inspiration from Nature Schreiber, J. Comb. Chem. 2007, 9, 1028.
Skeletal DOS Reagent Based DOS Schreiber, J. Comb. Chem. 2007, 9, 1028.
Cyclohexadienone Synthesis Gram-scale prep, yields generally >90%
Ring Expansion Mechanism Tan, D.S., Nat. Chem. Biol. 2013, 9, 21
Pauson-Khand Rxn