Navigation in Chemical Space Towards Biological Activity. Peter Ertl Novartis Institutes for BioMedical Research Basel, Switzerland

Similar documents
Similarity Search. Uwe Koch

The Molecule Cloud - compact visualization of large collections of molecules

Overview. Descriptors. Definition. Descriptors. Overview 2D-QSAR. Number Vector Function. Physicochemical property (log P) Atom

In silico generation of novel, drug-like chemical matter using the LSTM deep neural network

Drug Informatics for Chemical Genomics...

Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME

Introduction to Chemoinformatics and Drug Discovery

Receptor Based Drug Design (1)

Structural biology and drug design: An overview

Chemical Databases: Encoding, Storage and Search of Chemical Structures

Molecular descriptors and chemometrics: a powerful combined tool for pharmaceutical, toxicological and environmental problems.

Large scale classification of chemical reactions from patent data

Data Mining in the Chemical Industry. Overview of presentation

Quest for the Rings. In Silico Exploration of Ring Universe To Identify Novel Bioactive Heteroaromatic Scaffolds

In Silico Investigation of Off-Target Effects

Introduction. OntoChem

Chemoinformatics and information management. Peter Willett, University of Sheffield, UK

EMPIRICAL VS. RATIONAL METHODS OF DISCOVERING NEW DRUGS

QSAR Modeling of ErbB1 Inhibitors Using Genetic Algorithm-Based Regression

Chemogenomic: Approaches to Rational Drug Design. Jonas Skjødt Møller

Solved and Unsolved Problems in Chemoinformatics

An Integrated Approach to in-silico

Structure-Activity Modeling - QSAR. Uwe Koch

DivCalc: A Utility for Diversity Analysis and Compound Sampling

Computational Chemistry in Drug Design. Xavier Fradera Barcelona, 17/4/2007

Biologically Relevant Molecular Comparisons. Mark Mackey

Mining Molecular Fragments: Finding Relevant Substructures of Molecules

Using Phase for Pharmacophore Modelling. 5th European Life Science Bootcamp March, 2017

Plan. Lecture: What is Chemoinformatics and Drug Design? Description of Support Vector Machine (SVM) and its used in Chemoinformatics.

Increasing the Coverage of Medicinal Chemistry-Relevant Space in Commercial Fragments Screening

Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics

Universities of Leeds, Sheffield and York

Analyzing Small Molecule Data in R

Chemical Space. Space, Diversity, and Synthesis. Jeremy Henle, 4/23/2013

De Novo molecular design with Deep Reinforcement Learning

QSAR Modeling of Human Liver Microsomal Stability Alexey Zakharov

Machine Learning Concepts in Chemoinformatics

Machine learning for ligand-based virtual screening and chemogenomics!

Chemoinformatics and Drug Discovery

Plan. Day 2: Exercise on MHC molecules.

Bioinformatics Workshop - NM-AIST

Ignasi Belda, PhD CEO. HPC Advisory Council Spain Conference 2015

The Changing Requirements for Informatics Systems During the Growth of a Collaborative Drug Discovery Service Company. Sally Rose BioFocus plc

Open PHACTS Explorer: Compound by Name

Computational Methods and Drug-Likeness. Benjamin Georgi und Philip Groth Pharmakokinetik WS 2003/2004

Early Stages of Drug Discovery in the Pharmaceutical Industry

Bioisosteres in Medicinal Chemistry

Using Self-Organizing maps to accelerate similarity search

Chapter 2 The Chemical Space of Flavours

Drug Design 2. Oliver Kohlbacher. Winter 2009/ QSAR Part 4: Selected Chapters

Introduction to Spark

LigandScout. Automated Structure-Based Pharmacophore Model Generation. Gerhard Wolber* and Thierry Langer

Chemical Reaction Databases Computer-Aided Synthesis Design Reaction Prediction Synthetic Feasibility

CDK & Mass Spectrometry

October 6 University Faculty of pharmacy Computer Aided Drug Design Unit

Alkane/water partition coefficients and hydrogen bonding. Peter Kenny

Introduction to Chemoinformatics

Developing CAS Products for Substructure Searching by Chemists. Linda Toler

In silico pharmacology for drug discovery

AMRI COMPOUND LIBRARY CONSORTIUM: A NOVEL WAY TO FILL YOUR DRUG PIPELINE

Condensed Graph of Reaction: considering a chemical reaction as one single pseudo molecule

Rapid Application Development using InforSense Open Workflow and Daylight Technologies Deliver Discovery Value

The Schrödinger KNIME extensions

Analyzing Building Blocks Diversity for DNA Encoded Library Design. Cresset User Group Meeting Nik Stiefl & Finton Sirockin, Novartis

ChemBioNet: Chemical Biology supported by Networks of Chemists and Biologists. Affinity Proteomics Meeting Alpbach. Michael Lisurek 15.3.

COMPARISON OF SIMILARITY METHOD TO IMPROVE RETRIEVAL PERFORMANCE FOR CHEMICAL DATA

JCICS Major Research Areas

Chemical Ontologies. Chemical Ontologies. ChemAxon UGM May 23, 2012

Using AutoDock for Virtual Screening

est Drive K20 GPUs! Experience The Acceleration Run Computational Chemistry Codes on Tesla K20 GPU today

Relative Drug Likelihood: Going beyond Drug-Likeness

Medicinal Chemistry/ CHEM 458/658 Chapter 4- Computer-Aided Drug Design

The Schrödinger KNIME extensions

Data Quality Issues That Can Impact Drug Discovery

How Diverse Are Diversity Assessment Methods? A Comparative Analysis and Benchmarking of Molecular Descriptor Space

Analysis of a Large Structure/Biological Activity. Data Set Using Recursive Partitioning and. Simulated Annealing

Creating a Pharmacophore Query from a Reference Molecule & Scaffold Hopping in CSD-CrossMiner

Design and Synthesis of the Comprehensive Fragment Library

Molecular Descriptors Theory and tips for real-world applications

Pipeline Pilot Integration

Targeting protein-protein interactions: A hot topic in drug discovery

Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs

Development of a Structure Generator to Explore Target Areas on Chemical Space

KNIME-based scoring functions in Muse 3.0. KNIME User Group Meeting 2013 Fabian Bös

Chemical Data Retrieval and Management

Catching the Drift Indexing Implicit Knowledge in Chemical Digital Libraries

Applying Bioisosteric Transformations to Predict Novel, High Quality Compounds

Application of Computer in Chemistry SSC Prof. Mohamed Noor Hasan Dr. Hasmerya Maarof Department of Chemistry

Performing a Pharmacophore Search using CSD-CrossMiner

CSD. Unlock value from crystal structure information in the CSD

Practical QSAR and Library Design: Advanced tools for research teams

Research Article. Chemical compound classification based on improved Max-Min kernel

PIOTR GOLKIEWICZ LIFE SCIENCES SOLUTIONS CONSULTANT CENTRAL-EASTERN EUROPE

Kernels for small molecules

Driving Serendipity: the Era of Active Ingredients Discovery. Valentina Sedini Scientific Marketing

Case study: Category consistency assessment in Toolbox for a list of Cyclic unsaturated hydrocarbons with respect to repeated dose toxicity.

Chemical library design

Development of Pharmacophore Model for Indeno[1,2-b]indoles as Human Protein Kinase CK2 Inhibitors and Database Mining

COMBINATORIAL CHEMISTRY: CURRENT APPROACH

Structural Diversity of Organic Chemistry. A Scaffold Analysis of the CAS Registry

Transcription:

Navigation in Chemical Space Towards Biological Activity Peter Ertl Novartis Institutes for BioMedical Research Basel, Switzerland

Data Explosion in Chemistry CAS 65 million molecules CCDC 600 000 structures PDB 78 000 proteins GenBank 145 million sequences

Chemical Space is Huge 67 million substances registered in the CAS 33 million compounds in the PubChem database many millions in archives of pharma / agro companies many millions available as commercial samples ~1 million molecules with (published) biological activity And VERY large number of possible (virtual) molecules. How to analyze the Chemical Universe? 1. characterization of molecules by proper descriptors 2. dimensionality reduction 3. user friendly visualization

Characterization of Chemical Space Over 8000 molecular descriptors available: R. Todeschini, V. Consonni, Molecular Descriptors for Chemoinformatics, Wiley-VCH, 2009 Descriptors suitable for large-scale analysis: global physicochemical properties = calculated descriptors (logp, PSA, HB donors and acceptors...) substructure features fragment counts, structural keys, pharmacophores, fingerprints... larger structural features rings, scaffolds, substituents

Handling Complex Data Matrices table with properties or fragments dimensionality reduction visualization Highly dimensional data matrices need to be reduced to few dimensions (optimally 2 to enable visual analysis) and the results provided in a visually appealing form as diagrams, graphs or maps to help us see and understand complex relationships within the data.

Molecular Property Space Dimensionality reduction: Bioavailability area PCA Multidimensional scaling ChemGPS, ChemMAP Clustering similarity problem Display of scaffold hierarchies 200,000 organic molecules, 10,000 drugs and development candidates

Structural Diversity Space organic molecules, drugs

Use of Molecule Lighthouses to Navigate in the Chemistry Space Chem-GPS system developed by Astra-Zeneca chemistry space is characterized by set of exotic molecules with extreme properties Use of marketed drugs, or other bioactive molecules many virtual screening techniques are based on drug-likeness identification of molecules similar to know bioactive structures Natural Products as a starting points NPs have been optimized in very long natural selection process for optimal interactions with biological macromolecules

Natural Products - Source of Bioactivity alkaloids fungi plants (marine) animals bacteria antibiotics toxins

Natural Products as a Source of New Drugs Natural products (NPs) have been optimized in a very long natural selection process for optimal interaction with biological macro-molecules. NPs are therefore an excellent source of validated substructures which may be used in the design of new bioactive compounds. Large part of current pharmacopeia consists of NPs and many other NPs are under development as new drugs. But what makes NP so successful in interacting with protein targets?

Global Molecular Properties Natural Products (130k) Bioactive molecules (120k) Synthetic molecules (150k)

Natural Products Bioactive molecules Synthetic molecules Mixed Chemistry Structural Space P. Ertl, A. Schuffenhauer, Cheminformatics Analysis of Natural Products, in: Natural Compounds as Drugs, Vol. II, Springer, 2008

Molecule Scaffold as Classifying Element removal of non-ring substituents molecule scaffold scaffold is the most important part of the molecule, giving it its shape and keeping substituents in their proper positions scaffold influence global molecular properties scaffolds often determine biological activity of the parent molecule provide easy understandable, common natural language between synthetic and computational chemists scaffolds play important role in combinatorial chemistry and scaffold hopping applications

The Scaffold Tree A method to classify molecules based on their scaffolds. Molecules are converted to their frameworks, then rings are removed one-by-one based on a set of predefined rules, creating such a scaffold hierarchy molecule molecule framework scaffold hierarchy The Scaffold Tree Visualization of the Scaffold Universe by Hierarchical Scaffold Classification A. Schuffenhauer, P. Ertl, S. Roggo, S. Wetzel, M. Koch, H. Waldmann, J. Chem. Inf. Model. 47, 47, 2007

Natural Product Scaffold Tree C rings N rings O rings S rings N,O rings N,S rings 150,000 synthetic molecules 110,000 natural products

Natural Product Scaffold Tree synthetic scaffolds NP scaffolds

Natural Product Scaffold Tree Cyclohexane branch containing cyclic terpenes, terpenoids and steroids

NP Scaffold Tree Dihydropyran and benzopyran branch

Natural Product Scaffold Tree Aromatic N and S heterocycles typical for synthetic molecules

Molecule Cloud P. Ertl and B. Rohde, J. Cheminformatics 4:accepted (2012)

Chemistry Space - Summary The chemistry space is huge and is growing nearly exponentially, we need reliable cheminformatics and data mining methods to navigate in this maze Chemical space may be characterized by numerous parameters, select those which are relevant to your problem Classification of chemistry space based on scaffolds is intuitive and provides good results We can use known areas of chemistry space as reference; natural products provide useful, evolutionary validated starting point for design of new bioactive molecules Visualization techniques are indispensable when analyzing large molecule collections - visualization in 2D (maps, trees, clouds) of complex data can help to understand the data