Visualization and manipulation of Matched Molecular Series for decision support

Similar documents
Large scale classification of chemical reactions from patent data

Using Matched Molecular Series as a Predictive Tool To Optimize Biological Activity

Tutorial. Getting started. Sample to Insight. March 31, 2016

Navigation in Chemical Space Towards Biological Activity. Peter Ertl Novartis Institutes for BioMedical Research Basel, Switzerland

Integrated Cheminformatics to Guide Drug Discovery

Similarity Search. Uwe Koch

JOHN MAYFIELD EGON WILLIGHAGEN CHEMISTRY DEVELOPMENT KIT V2.0

Open PHACTS Explorer: Compound by Name

Reaxys Medicinal Chemistry Fact Sheet

BHP BILLITON UNIVERSITY OF MELBOURNE SCHOOL MATHEMATICS COMPETITION, 2003: INTERMEDIATE DIVISION

MM-GBSA for Calculating Binding Affinity A rank-ordering study for the lead optimization of Fxa and COX-2 inhibitors

18.9 SUPPORT VECTOR MACHINES

The Case for Use Cases

cheminformatics toolkits: a personal perspective

To learn how to use molecular modeling software, a commonly used tool in the chemical and pharmaceutical industry.

QSAR Modeling of ErbB1 Inhibitors Using Genetic Algorithm-Based Regression

Structural biology and drug design: An overview

Empirical Application of Simple Regression (Chapter 2)

Chemoinformatics and information management. Peter Willett, University of Sheffield, UK

For the analytical solution, again use the Method of Undetermined Coefficients. Guess:

Important Aspects of Fragment Screening Collection Design

Hit Finding and Optimization Using BLAZE & FORGE

An Integrated Approach to in-silico

Computational chemical biology to address non-traditional drug targets. John Karanicolas

Use of data mining and chemoinformatics in the identification and optimization of high-throughput screening hits for NTDs

Hydrogen Bonding & Molecular Design Peter

Introduction to Karnaugh Maps

Chemical bonds between atoms involve electrons.

Student Exploration: Chemical Equations - PAP Login to to complete this assignment.

Computational Chemistry in Drug Design. Xavier Fradera Barcelona, 17/4/2007

molecules ISSN

a) DIATOMIC ELEMENTS e.g. . MEMORIZE THEM!!

Standardized Representations of ELN Reactions for Categorization and Duplicate/Variation Identification

Using Web Technologies for Integrative Drug Discovery

Linking Student Achievement, Teacher Professional Development, and the Use of Inquiry-based Computer Models in Science

Statistics Boot Camp. Dr. Stephanie Lane Institute for Defense Analyses DATAWorks 2018

Cheminformatics Role in Pharmaceutical Industry. Randal Chen Ph.D. Abbott Laboratories Aug. 23, 2004 ACS

Information Extraction from Chemical Images. Discovery Knowledge & Informatics April 24 th, Dr. Marc Zimmermann

Lewis Structures. Lewis Structures. Lewis Structures. Lewis Structures. What pattern do you see? What pattern do you see?

ADME SCREENING OF NOVEL 1,4-DIHYDROPYRIDINE DERIVATIVES

Reaxys The Highlights

Introduction to Chemoinformatics and Drug Discovery

Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs

Applying Bioisosteric Transformations to Predict Novel, High Quality Compounds

General Chemistry Lab Molecular Modeling

CHEM 115 Ionic & Molecular Bonding

Molecular Visualization. Introduction

The Molecule Cloud - compact visualization of large collections of molecules

REVIEW 8/2/2017 陈芳华东师大英语系

Creating a Pharmacophore Query from a Reference Molecule & Scaffold Hopping in CSD-CrossMiner

LESSON 2.2 WORKBOOK. How is a cell born? Workbook Lesson 2.2

Chapter 17. Equilibrium

Entropy-based data organization tricks for browsing logs and packet captures

CONTENTS OF DAY 2. II. Why Random Sampling is Important 10 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

Dr. Dennis Gervin Sections: 4252, 4253, 7734,

BACK TO SCHO O L W ITH

Fast Sorting and Selection. A Lower Bound for Worst Case

SOL Study Book Fifth Grade Scientific Investigation, Reasoning, and Logic

40h + 15c = c = h

JCICS Major Research Areas

Structural interpretation of QSAR models a universal approach

Control of Gene Expression in Prokaryotes

Rapid Application Development using InforSense Open Workflow and Daylight Technologies Deliver Discovery Value

De Novo molecular design with Deep Reinforcement Learning

So I have an SD File What do I do next? Rajarshi Guha & Noel O Boyle NCATS & NextMove So<ware

Chapter 8. Chapter 8. Preview. Bellringer. Chapter 8. Particles of Matter. Objectives. Chapter 8. Particles of Matter, continued

Student Exploration: Diffusion

The Fruit Fly Brain Observatory

Lab: Model Building with Covalent Compounds - Introduction

Information Retrieval

Karnaugh Maps Objectives

Study Guide Exam 1 BIO 301L Chinnery Spring 2013

Exploring the black box: structural and functional interpretation of QSAR models.

POGIL 6 Key Periodic Table Trends (Part 2)

Similarity methods for ligandbased virtual screening

Fragment based drug discovery in teams of medicinal and computational chemists. Carsten Detering

CHEM 181: Chemical Biology

Lecture 2. Introduction to ESRI s ArcGIS Desktop and ArcMap

Marvin 5.4 A new generation of structure indexing at Elsevier. Dr. Michael Maier, Dr. Heike Nau, Elsevier

Contrast Coding in R: An Exploration of a Dataset

Building innovative drug discovery alliances. Just in KNIME: Successful Process Driven Drug Discovery

Reaction Mechanisms Dependence of rate on temperature Activation Energy E a Activated Complex Arrhenius Equation

Vocabulary: covalent bond, diatomic molecule, Lewis diagram, molecule, noble gases, nonmetal, octet rule, shell, valence, valence electron

Tautomerism in chemical information management systems

Directed Reading B. Section: Tools and Models in Science TOOLS IN SCIENCE MAKING MEASUREMENTS. is also know as the metric system.

The Doubly-Modifiable Structural Model (DMSM) The Modifiable Structural Model (MSM)

Create Satellite Image, Draw Maps

Mole Ratios. How can the coefficients in a chemical equation be interpreted? (g) 2NH 3. (g) + 3H 2

The Equilibrium State. Chapter 13 - Chemical Equilibrium. The Equilibrium State. Equilibrium is Dynamic! 5/29/2012

Chemical Data Retrieval and Management

MATH 10 INTRODUCTORY STATISTICS

Shifting Reactions B

Learning Algorithms with Unified and Interactive Web-Based Visualization

The Concept of Equilibrium

Using Phase for Pharmacophore Modelling. 5th European Life Science Bootcamp March, 2017

Notes: Covalent Compounds

Section 2.7 Solving Linear Inequalities

Chapter 12. Molecular Structure

Emerging patterns mining and automated detection of contrasting chemical features

Summary of Derivative Tests

Transcription:

250 th ACS National Meeting, Boston 16 th Aug 2015 Visualization and manipulation of Matched Molecular Series for decision support Noel O Boyle and Roger Sayle NextMove Software

Matched (Molecular) Pairs 1.6 [Cl, F] 3.5 Coined by Kenny and Sadowski in 2005* Easier to predict differences in the values of a property than it is to predict the value itself * Chemoinformatics in drug discovery, Wiley, 271 285.

Successfully used for: Matched Pair usage Rationalising and predicting physicochemical property changes Finding bioisosteres Not very successful in improving activity Activity changes dependent on binding environment Need to look beyond matched pairs

Matched Series of length 2 = Matched Pair [Cl, F] Matching molecular series introduced by Wawer and Bajorath, J. Med. Chem. 2011, 54, 2944

Matched Series of length 3 [Cl, F, NH 2 ]

Ordered Matched Series of length 3 pic 50 3.5 [Cl > F > NH 2 ] 2.1 1.6

The Matched Pair mentality There can only be two Like inhabitants of Flatland ignorant of a third dimension What is the equivalent of pair for three? Triad, trio, triple? A matched pair represents a transformation from A->B How would that work if it there were three?

Matsy: Prediction using Matched Series

Matched Series have preferred orders Series Enrichment Observations Br > Cl > F > H 5.36* 256 Cl > Br > F > H 3.14* 150 H > F > Cl > Br 1.53* 73 Br > Cl > H > F 1.40 67 F > Cl > Br > H 1.36 65 Cl > F > Br > H 0.96 46 H > F > Br > Cl 0.77 37 H > Br > F > Cl 0.48* 23 Cl > H > F > Br 0.48* 23 Cl > F > H > Br 0.48* 23 H > Cl > F > Br 0.42* 20 Br > F > H > Cl 0.40* 19 F > H > Br > Cl 0.40* 19 H > Cl > Br > F 0.38* 18 F > Br > H > Cl 0.36* 17 Br > H > F > Cl 0.17* 8 The fact that certain orders are preferred may be used as the basis of a predictive method

Find R Groups that increase activity In-house Query matched series A > B A > B > C C > A > B D > A > B > C D > A > C > B E > D > A > B R Group Observations Obs that increase activity % that increase activity D 3 3 100 E 1 1 100 C 4 1 25 O Boyle, Boström, Sayle, Gill. J. Med. Chem. 2014, 57, 2704.

The Dataset-Centric approach Here is my dataset of molecules with activities now tell me what to make next Pro: Easy for users to get up-and-running Fits with their existing way of thinking Con: Don t need to think too much about matched series User is one step removed from the matched series data on which the predictions are actually based Dataset is fixed: cannot play with around with the prediction input

Goals for the interface Visual interface based around R-Groups as firstclass objects arranged in ordered series Promote new paradigm Make it clear that the scaffold is not involved Should help break the matched pair mentality Just a particular case of matched series Should be easy to play with Easy to manipulate and quick to respond

Drag-and-drop R Groups into slots to represent observed activity order The query matched series

Pros Easy to play around with Swap around order of R groups See what happens if you follow the predictions May suggest hypotheses Useful for searching (not just for predictions) Tablet-friendly ConS The user needs to be able to provide an ordered matched series as a query You can t just provide a dataset of molecules

No Chemistry required Predictions are solely based on the order of R groups in a matched series Not using any calculated properties Images of all R groups in ChEMBL can be generated in advance (~65K) A cheminformatics toolkit is not required for the interface or even for making predictions In practice, we do use a toolkit to allow the user to enter R groups as SMILES

Use Case #1 Are matched Series Predictions symmetric?

Are matched series predictions symmetric? If A>B>C>D is a highly preferred order Then D>C>B>A also tends to be preferred Hypothesis: if A reduces the activity given D>C>B it will also improve the activity given B>C>D If true, then we have twice as much data to use for predictions Let s find out.

? > Cl > F > H H > F > Cl >?

Use case #2 Topliss Decision Tree

Topliss Decision Tree Topliss, J. G. Utilization of Operational Schemes for Analog Synthesis in Drug Design. J. Med. Chem. 1972, 15, 1006 1011.

Topliss Decision Tree

Topliss Decision Tree (17 th )

ChEMBL-BASED Decision Tree (One of Many)

Visualization and manipulation of Matched Molecular Series for decision support http://nextmovesoftware.com noel@nextmovesoftware.com @nmsoftware