ALOGPS is a free on-line program to predict lipophilicity and aqueous solubility of chemical compounds

Size: px
Start display at page:

Download "ALOGPS is a free on-line program to predict lipophilicity and aqueous solubility of chemical compounds"

Transcription

1 ALOGPS is a free on-line program to predict lipophilicity and aqueous solubility of chemical compounds Igor V. Tetko & Vsevolod Yu. Tanchuk IBPC, Ukrainian Academy of Sciences, Kyiv, Ukraine and Institute for Bioinformatics, Munich, Germany March 5th, ACS

2 ALOGPS 2. LogP: 75 input variables corresponding to electronic and topological properties of atoms (E-state indices), 298 molecules in the database, 64 neural networks in the ensemble. Calculated results RMSE=.35, MAE=.26, n=76 outliers (>.5 log units) LogS: 33 input E-state indices, 29 molecules in the database, 64 neural networks in the ensemble. Calculated results RMSE=.49, MAE=.35, n=8 outliers (>.5 log units) Tetko, Tanchuk & Villa, JCICS, 2, 4, Tetko, Tanchuk, Kasheva & Villa, JCICS, 2, 4, Tetko & Tanchuk, JCICS, 22, 42,

3 Representation of molecules SMILES (no stereoisomers) H, A -- number of hydrogen and non-hydrogen atoms E-state indexes developed by Kier & Hall 7 Cl Basic atom-type E-state indexes Extended atom-type Bond-type 2 O HO H O 4 O 3 Advantages to use atom-type E-state indices: o missed fragments! ==> non-linear dependencies indices/property Atom type E-state: SdO, SsOH,SsCl, Extended: SdO(nitro), SdO(acid), Bond-type: e2o2, eac3c3aa,ec3cla,...

4 MLRA vs eural etwork method indices R 2 RMSE MLRA 75 E-state indices, including A, H.89.6 AS the same logp=2.65 logp=5. logp=2.28 Cl logp=.2 Ssss logp=5.8

5 Prediction of AstraZeneca logp set n=2569 ACDlogP (v. 7.): MAE =.86, RMSE=.2 CLOGP (v. 4.7): MAE =.7, RMSE=.7 ALOGPS: MAE =.6, RMSE=.84 Tetko & Bruneau, J. Pharm. Sci., 24, 94, 33-3.

6 ALOGPS: Extrapolation vs Interpolation BASF, n=6 AstraZeneca, n=2569 Tetko, JCICS, 22, 42, Tetko & Bruneau, J. Pharm. Sci., 24, 94, 33-3.

7 Prediction Space of the model does not cover the in house compounds Training set data In house data

8 Possible strategies Generate new indices and build a new model --method is used by fragment-based approaches (ACDLabs, PharmaAlgorithms) provides an improvement but may have danger of overfitting that can lower prediction ability Do not generate new indices/model but to extend the model into the uncovered space and correct the model using k-- no danger of overfitting (Associative eural etwork)

9 Training set model Interpolation

10 Prediction of the test compounds extrapolation

11 ew data prediction

12 k- correction ! We can make an adjustment of the predicted value by identifying the nearest neighbors () of the analyzed data case How to detect the nearest neighbors? Why we need for this ensemble of models?

13 earest neighbors for Gauss function A y B y x x Detection of nearest neighbors in space of models make us of invariants structureproperty and it is much more accurate than in the initial space of parameters C x x Y=f(x +x 2 ) x

14 Early Stopping Over Ensemble (ESE) Learning/Validation Sets Artificial eural etwork Training Set no network no Artificial eural etwork Ensemble Analysis Set no 2 network no 2 Learning Partition InitialTraining Set Set no 99 network no 99 Validation Set no 2 network no 2

15 AS: an example correction logp=3. HO logp=3.48 HO " 2.3% $ $ 4.6 $ M $ $ 3.2 # $.& " 3.7% $ $ 4.8 $ M $ $ 5.8 # $ 2.& " net % $ $ net 2 $ M $ $ net 63 # $ net 64& " net % $ $ net 2 $ M $ $ net 63 # $ net 64& -k correction Morphinan-3-ol, 7-methyl- Calculated logp=3.65, δ= > =2.89 (δ=+.22) Levallorphan Calculated logp=4.24, δ= > =3.7 (δ=+.22) -- both molecules are the nearest neighbors, r 2 =.47, in space of residuals!

16 Associative eural etwork (AS) A prediction of case i: [ x i ] [ AE] M = [ z i ] = " $ $ $ $ $ # $ z i M z k i M z M i % & Ensemble approach: z =! i i z k M k =, M Pearson s (Spearman) correlation coefficient r ij =R(z i,z j )> in space of residuals z$ i = z i + k j"!( y # ) j z j k ( x i ) <<= AS bias correction The correction of neural network ensemble value is performed using errors (biases) calculated for the neighbor cases of analyzed case x i detected in space of neural network models

17 earest neighbors for Gauss function A y B y x x C x 2 4 Detection of nearest neighbors in space of models make us of invariants structureproperty and it is much more accurate than in the initial space of parameters x x

18 Gauss function extrapolation A y B y.8.8 Advantages: fast, no neural network retraining; correction is not limited by the range of values in the training set. C y x D y x otice: y=f(x=x +x 2 ) x

19 Property-based clustering A logp Pearsons linear correlation coefficient, R 3.3 (.53) aphthalene B -.26 (.94) Pyrazine A: lipophilicity prediction Cl 2.84 (.58) Phenyl Cloride -.4 (.94) Pyrimidine B: molecular weight prediction O 2. (.59) Anisole 2.73 (.6) Toluene 2.73 (.94) Toluene.65 (.97) Pyridine 2.3 Benzene C D Tetko, JCICS, 22, 42,

20 ALOGPS: Extrapolation vs Interpolation ALOGPS logp (blind) :MAE =.27, RMSE=.63 ALOGPS logp (LIBRARY):MAE =.49, RMSE=.7 Tetko, JCICS, 22, 42, Tetko & Bruneau, J. Pharm. Sci., 24, 94, 33-3.

21 Function training and real

22 new points are measured "LogD".75 training "LogP"

23 Library mode (no retraining) Library "LogD" "LogD".75 training "LogP"

24 Library mode vs training using cases

25 Analysis of Pfizer data Pallas PrologD : ACDlogD (v. 7.9): ALOGPS: ALOGPS LIBRARY: MAE =.6, RMSE=.4 MAE =.97, RMSE=.32 MAE =.92, RMSE=.7 MAE =.43, RMSE=.64 Tetko & Poda, J. Med. Chem., 24, 94,

26 PHYSPROP data set Total: 298 nova set star set CLOGP training nova --> prediction star set XLOGP 873 PHYSPROP 2 98

27 Prediction performance as function of similarity in space of star set models Blind prediction LIBRARY mode max correlation coefficient of a test compound to training set compounds MAE=.43 max correlation coefficient of a test compound to LIBRARY compounds MAE=.28 (.26)

28 Reliability of new compound predictions r error >.7 ~.6 ~.5 ~.4 <.3 CI, 25, 2, 65, PHYSPROP

29 Reliability of new compound predictions r error >.7 ~.6 ~.5 ~.4 <.3 CI, 25, 2, 65, Aurora data PHYSPROP

30 Aqueous solubility / logp prediction for Pfizer data Towards Predictive ADME Profiling of Drug Candidates: Lipophilicity and Solubility Gennadiy Poda, Igor Tetko and Douglas C. Rohrer MEDI Wednesday, March 6th

31 Conclusion The LIBRARY mode significantly improves prediction for in house logp/s and logd data sets The LIBRARY mode can be used with very small number of compounds, i.e one. This number will not be adequate to create new model from scratch The improvement in this mode due to presence of invariants conserved both for training and test sets The LIBRARY mode can be used for non-stationary and contradiction data An apparent success of logd prediction suggest that similar indices dominates in logp and pka properties

32 Acknowledgement Part of this presentation was done thanks to Virtual Computational Chemistry Laboratory ITAS-IFO project ( I thank Prof Hugo Kubinyyi, Drs Pierre Bruneau and Gennadiy Poda for collaboration and Prof. Tudor Oprea for inviting me to participate in this conference. Thank you for your attention!

33 Free on-line/batch analysis on

Encoding molecular structures as ranks of models: A new, secure way for sharing chemical data and development of ADME/T models. Igor V.

Encoding molecular structures as ranks of models: A new, secure way for sharing chemical data and development of ADME/T models. Igor V. Encoding molecular structures as ranks of models: A new, secure way for sharing chemical data and development of ADME/T models Igor V. Tetko IBPC, Ukrainian Academy of Sciences, Kyiv, Ukraine and Institute

More information

What is a property-based similarity?

What is a property-based similarity? What is a property-based similarity? Igor V. Tetko (1) GSF - ational Centre for Environment and Health, Institute for Bioinformatics, Ingolstaedter Landstrasse 1, euherberg, 85764, Germany, (2) Institute

More information

Can we estimate the accuracy of ADMET predictions?

Can we estimate the accuracy of ADMET predictions? Can we estimate the accuracy of ADMET predictions? Igor V. Tetko 1, Pierre Bruneau 2, Hans-Werner Mewes 1, Douglas Rohrer 3, and Gennadiy Poda 3 (1) GSF - National Centre for Environment and Health, Institute

More information

In Silico Prediction of ADMET properties with confidence: potential to speed-up drug discovery

In Silico Prediction of ADMET properties with confidence: potential to speed-up drug discovery In Silico Prediction of ADMET properties with confidence: potential to speed-up drug discovery Igor V. Tetko Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH) Institute

More information

Surrogate data a secure way to share corporate data

Surrogate data a secure way to share corporate data Journal of Computer-Aided Molecular Design (2005) 19: 749 764 DI 10.1007/s10822-005-9013-3 urrogate data a secure way to share corporate data Igor V. Tetko a,d, *, Ruben Abagyan b & Tudor I. prea c a Institute

More information

Computing chemistry on the web. Igor V. Tetko

Computing chemistry on the web. Igor V. Tetko Computing chemistry on the web Igor V. Tetko Institute of Bioorganic & Petrochemistry, Murmanskaya 1, 02094 Kiev, Ukraine and Institute for Bioinformatics, GSF, Neuherberg, Germany, http://www.vcclab.org

More information

Assessing and Improving the Reliability of In Silico Property Predictions by Incorporating Inhouse. Pranas Japertas

Assessing and Improving the Reliability of In Silico Property Predictions by Incorporating Inhouse. Pranas Japertas Assessing and Improving the Reliability of In Silico Property Predictions by Incorporating Inhouse Data Pranas Japertas Outline Why do in silico models fail? Knowing when in silico models fail assessing

More information

Easier and Better Exploitation of PhysChem Properties in Medicinal Chemistry

Easier and Better Exploitation of PhysChem Properties in Medicinal Chemistry www.acdlabs.com Easier and Better Exploitation of PhysChem Properties in Medicinal Chemistry Dr. Sanjivanjit K. Bhal Advanced Chemistry Development, Inc. (ACD/Labs) Evolution of MedChem

More information

QSPR MODELLING FOR PREDICTING TOXICITY OF NANOMATERIALS

QSPR MODELLING FOR PREDICTING TOXICITY OF NANOMATERIALS QSPR MODELLING FOR PREDICTING TOXICITY OF NANOMATERIALS KOVALISHYN Vasyl 1, PEIJNENBURG Willie 2, KOPERNYK Iryna 1, ABRAMENKO Natalia 3, METELYTSIA Larysa 1 1 Institute of Bioorganic Chemistry & Petroleum

More information

György M. Keserű H2020 FRAGNET Network Hungarian Academy of Sciences

György M. Keserű H2020 FRAGNET Network Hungarian Academy of Sciences Fragment based lead discovery - introduction György M. Keserű H2020 FRAGET etwork Hungarian Academy of Sciences www.fragnet.eu Hit discovery from screening Druglike library Fragment library Large molecules

More information

SAMPL6 pka Challenge

SAMPL6 pka Challenge SAMPL6 pka Challenge Mehtap Isik Chodera Lab, MSKCC D3R 2018 Workshop February 22nd, 2018 0 Why did we decide to organize a blind pka prediction challenge? SAMPL5 logd challenge indicated the impact of

More information

Chemical Space: Modeling Exploration & Understanding

Chemical Space: Modeling Exploration & Understanding verview Chemical Space: Modeling Exploration & Understanding Rajarshi Guha School of Informatics Indiana University 16 th August, 2006 utline verview 1 verview 2 3 CDK R utline verview 1 verview 2 3 CDK

More information

Chromatographic Hydrophobicity Index (CHI) Application to Agrochemical Research. Eric Clarke & Beth Shirley, Chemistry Design Group, Jealott s Hill

Chromatographic Hydrophobicity Index (CHI) Application to Agrochemical Research. Eric Clarke & Beth Shirley, Chemistry Design Group, Jealott s Hill Chromatographic Hydrophobicity Index (CHI) Application to Agrochemical Research Eric Clarke & Beth Shirley, Chemistry Design Group, Jealott s Hill Properties affecting agrochemical availability and movement

More information

Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME

Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME Iván Solt Solutions for Cheminformatics Drug Discovery Strategies for known targets High-Throughput Screening (HTS) Cells

More information

International Journal of PharmTech Research CODEN (USA): IJPRIF, ISSN: Vol.8, No.3, pp , 2015

International Journal of PharmTech Research CODEN (USA): IJPRIF, ISSN: Vol.8, No.3, pp , 2015 International Journal of PharmTech Research CDE (USA): IJPRIF, ISS: 0974-4304 Vol.8, o.3, pp 408-414, 2015 AlogP calculation of octanol/water partition coefficient of ferrocene derivatives Ridha Ahmedi,

More information

A Picture Paints a Thousand Words Visualisation of SAR. John Cumming, AstraZeneca, Alderley Park, UK

A Picture Paints a Thousand Words Visualisation of SAR. John Cumming, AstraZeneca, Alderley Park, UK A Picture Paints a Thousand Words Visualisation of AR John Cumming, AstraZeneca, Alderley Park, UK Outline Why visualisation? Multi-parameter visualisation tructure fragmentation Matched pair analysis

More information

Relative Drug Likelihood: Going beyond Drug-Likeness

Relative Drug Likelihood: Going beyond Drug-Likeness Relative Drug Likelihood: Going beyond Drug-Likeness ACS Fall National Meeting, August 23rd 2012 Matthew Segall, Iskander Yusof Optibrium, StarDrop, Auto-Modeller and Glowing Molecule are trademarks of

More information

Bridging the Dimensions:

Bridging the Dimensions: Bridging the Dimensions: Seamless Integration of 3D Structure-based Design and 2D Structure-activity Relationships to Guide Medicinal Chemistry ACS Spring National Meeting. COMP, March 13 th 2016 Marcus

More information

Overview. Descriptors. Definition. Descriptors. Overview 2D-QSAR. Number Vector Function. Physicochemical property (log P) Atom

Overview. Descriptors. Definition. Descriptors. Overview 2D-QSAR. Number Vector Function. Physicochemical property (log P) Atom verview D-QSAR Definition Examples Features counts Topological indices D fingerprints and fragment counts R-group descriptors ow good are D descriptors in practice? Summary Peter Gedeck ovartis Institutes

More information

Alkane/water partition coefficients and hydrogen bonding. Peter Kenny

Alkane/water partition coefficients and hydrogen bonding. Peter Kenny Alkane/water partition coefficients and hydrogen bonding Peter Kenny (pwk.pub.2008@gmail.com) Neglect of hydrogen bond strength: A recurring theme in medicinal chemistry Rule of 5 Rule of 3 Scoring functions

More information

Computational Methods and Drug-Likeness. Benjamin Georgi und Philip Groth Pharmakokinetik WS 2003/2004

Computational Methods and Drug-Likeness. Benjamin Georgi und Philip Groth Pharmakokinetik WS 2003/2004 Computational Methods and Drug-Likeness Benjamin Georgi und Philip Groth Pharmakokinetik WS 2003/2004 The Problem Drug development in pharmaceutical industry: >8-12 years time ~$800m costs >90% failure

More information

QSAR/QSPR modeling. Quantitative Structure-Activity Relationships Quantitative Structure-Property-Relationships

QSAR/QSPR modeling. Quantitative Structure-Activity Relationships Quantitative Structure-Property-Relationships Quantitative Structure-Activity Relationships Quantitative Structure-Property-Relationships QSAR/QSPR modeling Alexandre Varnek Faculté de Chimie, ULP, Strasbourg, FRANCE QSAR/QSPR models Development Validation

More information

Supplementary Figure 1: Chemical compound space. Errors depending on the size of the training set for models with T = 1, 2, 3 interaction passes

Supplementary Figure 1: Chemical compound space. Errors depending on the size of the training set for models with T = 1, 2, 3 interaction passes 9 8 7 6 5 4 3 2 1 0 10 3 10 4 10 5 6 5 4 3 2 1 0 10000 25000 50000 100000 Supplementary Figure 1: Chemical compound space. Errors depending on the size of the training set for models with T = 1, 2, 3 interaction

More information

PHARMA SCIENCE MONITOR AN INTERNATIONAL JOURNAL OF PHARMACEUTICAL SCIENCES

PHARMA SCIENCE MONITOR AN INTERNATIONAL JOURNAL OF PHARMACEUTICAL SCIENCES PHARMA SCIENCE MONITOR AN INTERNATIONAL JOURNAL OF PHARMACEUTICAL SCIENCES LIPOPHILICITY OF FEW CHROMEN-2-ONES AND CHROMENE-2-THIONES BY HYPERCHEM 7.0 AND ALOGPS 2.1 P. Jayanthi* and P. Lalitha Department

More information

Comparison of log P/D with bio-mimetic properties

Comparison of log P/D with bio-mimetic properties Comparison of log P/D with bio-mimetic properties Klara Valko Bio-Mimetic Chromatography Consultancy for Successful Drug Discovery Solvation process First we need to create a cavity Solute molecule Requires

More information

Introduction to Chemoinformatics and Drug Discovery

Introduction to Chemoinformatics and Drug Discovery Introduction to Chemoinformatics and Drug Discovery Irene Kouskoumvekaki Associate Professor February 15 th, 2013 The Chemical Space There are atoms and space. Everything else is opinion. Democritus (ca.

More information

9/26/17. Ridge regression. What our model needs to do. Ridge Regression: L2 penalty. Ridge coefficients. Ridge coefficients

9/26/17. Ridge regression. What our model needs to do. Ridge Regression: L2 penalty. Ridge coefficients. Ridge coefficients What our model needs to do regression Usually, we are not just trying to explain observed data We want to uncover meaningful trends And predict future observations Our questions then are Is β" a good estimate

More information

Using AutoDock for Virtual Screening

Using AutoDock for Virtual Screening Using AutoDock for Virtual Screening CUHK Croucher ASI Workshop 2011 Stefano Forli, PhD Prof. Arthur J. Olson, Ph.D Molecular Graphics Lab Screening and Virtual Screening The ultimate tool for identifying

More information

CHAPTER 6 NEURAL NETWORK BASED QUANTITATIVE STRUCTURE-PROPERTY RELATIONS(QSPRS) FOR PREDICTING REFRACTIVE INDEX OF HYDROCARBONS

CHAPTER 6 NEURAL NETWORK BASED QUANTITATIVE STRUCTURE-PROPERTY RELATIONS(QSPRS) FOR PREDICTING REFRACTIVE INDEX OF HYDROCARBONS CHAPTER 6 NEURAL NETWORK BASED QUANTITATIVE STRUCTURE-PROPERTY RELATIONS(QSPRS) FOR PREDICTING REFRACTIVE INDEX OF HYDROCARBONS 6.1 Introduction This chapter describes Quantitative Structure Property Relationship

More information

Machine learning for ligand-based virtual screening and chemogenomics!

Machine learning for ligand-based virtual screening and chemogenomics! Machine learning for ligand-based virtual screening and chemogenomics! Jean-Philippe Vert Institut Curie - INSERM U900 - Mines ParisTech In silico discovery of molecular probes and drug-like compounds:

More information

Structure-Activity Modeling - QSAR. Uwe Koch

Structure-Activity Modeling - QSAR. Uwe Koch Structure-Activity Modeling - QSAR Uwe Koch QSAR Assumption: QSAR attempts to quantify the relationship between activity and molecular strcucture by correlating descriptors with properties Biological activity

More information

Supporting Information

Supporting Information Supporting Information Convolutional Embedding of Attributed Molecular Graphs for Physical Property Prediction Connor W. Coley a, Regina Barzilay b, William H. Green a, Tommi S. Jaakkola b, Klavs F. Jensen

More information

Introduction. OntoChem

Introduction. OntoChem Introduction ntochem Providing drug discovery knowledge & small molecules... Supporting the task of medicinal chemistry Allows selecting best possible small molecule starting point From target to leads

More information

De Novo molecular design with Deep Reinforcement Learning

De Novo molecular design with Deep Reinforcement Learning De Novo molecular design with Deep Reinforcement Learning @olexandr Olexandr Isayev, Ph.D. University of North Carolina at Chapel Hill olexandr@unc.edu http://olexandrisayev.com About me Ph.D. in Chemistry

More information

Hydrogen Bonding & Molecular Design Peter

Hydrogen Bonding & Molecular Design Peter Hydrogen Bonding & Molecular Design Peter Kenny(pwk.pub.2008@gmail.com) Hydrogen Bonding in Drug Discovery & Development Interactions between drug and water molecules (Solubility, distribution, permeability,

More information

Introduction to Chemoinformatics

Introduction to Chemoinformatics Introduction to Chemoinformatics Dr. Igor V. Tetko Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH) Institute of Bioinformatics & Systems Biology (HMGU) Kyiv, 10 August

More information

Gaussian Processes: We demand rigorously defined areas of uncertainty and doubt

Gaussian Processes: We demand rigorously defined areas of uncertainty and doubt Gaussian Processes: We demand rigorously defined areas of uncertainty and doubt ACS Spring National Meeting. COMP, March 16 th 2016 Matthew Segall, Peter Hunt, Ed Champness matt.segall@optibrium.com Optibrium,

More information

LogP and logd calculations

LogP and logd calculations LogP and logd calculations This background material explains the theory behind the logp and log D calculation: Introduction Symbols used Definition of logp and logd Example Partition and distribution coefficients

More information

OPERA (OPEn sar App) models. 1.QSAR identifier 1.1.QSAR identifier (title): 1.2.Other related models: 1.3.Software coding the model:

OPERA (OPEn sar App) models. 1.QSAR identifier 1.1.QSAR identifier (title): 1.2.Other related models: 1.3.Software coding the model: QMRF identifier (JRC Inventory):To be entered by JRC QMRF Title:WS: Water solubility prediction from OPERA (OPEn sar App) models. Printing Date:Dec 5, 2016 1.QSAR identifier 1.1.QSAR identifier (title):

More information

Medicinal Chemistry/ CHEM 458/658 Chapter 3- SAR and QSAR

Medicinal Chemistry/ CHEM 458/658 Chapter 3- SAR and QSAR Medicinal Chemistry/ CHEM 458/658 Chapter 3- SAR and QSAR Bela Torok Department of Chemistry University of Massachusetts Boston Boston, MA 1 Introduction Structure-Activity Relationship (SAR) - similar

More information

Advanced Medicinal Chemistry SLIDES B

Advanced Medicinal Chemistry SLIDES B Advanced Medicinal Chemistry Filippo Minutolo CFU 3 (21 hours) SLIDES B Drug likeness - ADME two contradictory physico-chemical parameters to balance: 1) aqueous solubility 2) lipid membrane permeability

More information

Assignment 1: Molecular Mechanics (PART 1 25 points)

Assignment 1: Molecular Mechanics (PART 1 25 points) Chemistry 380.37 Fall 2015 Dr. Jean M. Standard August 19, 2015 Assignment 1: Molecular Mechanics (PART 1 25 points) In this assignment, you will perform some molecular mechanics calculations using the

More information

Fragment-based de novo Design

Fragment-based de novo Design ragment-based de novo Design Gisbert Schneider & Uli echner gisbert.schneider@modlab.de www.modlab.de Beilstein Endowed Chair for Cheminformatics Institute of rganic Chemistry & Chemical Biology Johann

More information

Exploring the black box: structural and functional interpretation of QSAR models.

Exploring the black box: structural and functional interpretation of QSAR models. EMBL-EBI Industry workshop: In Silico ADMET prediction 4-5 December 2014, Hinxton, UK Exploring the black box: structural and functional interpretation of QSAR models. (Automatic exploration of datasets

More information

A Tiered Screen Protocol for the Discovery of Structurally Diverse HIV Integrase Inhibitors

A Tiered Screen Protocol for the Discovery of Structurally Diverse HIV Integrase Inhibitors A Tiered Screen Protocol for the Discovery of Structurally Diverse HIV Integrase Inhibitors Rajarshi Guha, Debojyoti Dutta, Ting Chen and David J. Wild School of Informatics Indiana University and Dept.

More information

Plan. Lecture: What is Chemoinformatics and Drug Design? Description of Support Vector Machine (SVM) and its used in Chemoinformatics.

Plan. Lecture: What is Chemoinformatics and Drug Design? Description of Support Vector Machine (SVM) and its used in Chemoinformatics. Plan Lecture: What is Chemoinformatics and Drug Design? Description of Support Vector Machine (SVM) and its used in Chemoinformatics. Exercise: Example and exercise with herg potassium channel: Use of

More information

ADMET property estimation, oral bioavailability predictions, SAR elucidation, & QSAR model building software www.simulations-plus.com +1-661-723-7723 What is? is an advanced computer program that enables

More information

Scoring functions for of protein-ligand docking: New routes towards old goals

Scoring functions for of protein-ligand docking: New routes towards old goals 3nd Strasbourg Summer School on Chemoinformatics Strasbourg, June 25-29, 2012 Scoring functions for of protein-ligand docking: New routes towards old goals Christoph Sotriffer Institute of Pharmacy and

More information

Translating Methods from Pharma to Flavours & Fragrances

Translating Methods from Pharma to Flavours & Fragrances Translating Methods from Pharma to Flavours & Fragrances CINF 27: ACS National Meeting, New Orleans, LA - 18 th March 2018 Peter Hunt, Edmund Champness, Nicholas Foster, Tamsin Mansley & Matthew Segall

More information

Structural interpretation of QSAR models a universal approach

Structural interpretation of QSAR models a universal approach Methods and Applications of Computational Chemistry - 5 Kharkiv, Ukraine, 1 5 July 2013 Structural interpretation of QSAR models a universal approach Victor Kuz min, Pavel Polishchuk, Anatoly Artemenko,

More information

ASSESSING THE DRUG ABILITY OF CHALCONES USING IN- SILICO TOOLS

ASSESSING THE DRUG ABILITY OF CHALCONES USING IN- SILICO TOOLS Page109 IJPBS Volume 5 Issue 1 JAN-MAR 2015 109-114 Research Article Pharmaceutical Sciences ASSESSING THE DRUG ABILITY OF CHALCONES USING IN- SILICO TOOLS Department of Pharmaceutical Chemistry, Institute

More information

Acids and Bases. Moore, T. (2016). Acids and Bases. Lecture presented at PHAR 422 Lecture in UIC College of Pharmacy, Chicago.

Acids and Bases. Moore, T. (2016). Acids and Bases. Lecture presented at PHAR 422 Lecture in UIC College of Pharmacy, Chicago. Acids and Bases Moore, T. (2016). Acids and Bases. Lecture presented at PHAR 422 Lecture in UIC College of Pharmacy, Chicago. Drug dissolution can impact buffering capacity of the body Most enzymes require

More information

Holdout and Cross-Validation Methods Overfitting Avoidance

Holdout and Cross-Validation Methods Overfitting Avoidance Holdout and Cross-Validation Methods Overfitting Avoidance Decision Trees Reduce error pruning Cost-complexity pruning Neural Networks Early stopping Adjusting Regularizers via Cross-Validation Nearest

More information

Estimating the Domain of Applicability for Machine Learning QSAR Models: A Study on Aqueous Solubility of Drug Discovery Molecules

Estimating the Domain of Applicability for Machine Learning QSAR Models: A Study on Aqueous Solubility of Drug Discovery Molecules Estimating the Domain of Applicability for Machine Learning QSAR Models: A Study on Aqueous Solubility of Drug Discovery Molecules The original publication is available at www.springerlink.com http://dx.doi.org/.7/s822-7-925-z

More information

bcl::cheminfo Suite Enables Machine Learning-Based Drug Discovery Using GPUs Edward W. Lowe, Jr. Nils Woetzel May 17, 2012

bcl::cheminfo Suite Enables Machine Learning-Based Drug Discovery Using GPUs Edward W. Lowe, Jr. Nils Woetzel May 17, 2012 bcl::cheminfo Suite Enables Machine Learning-Based Drug Discovery Using GPUs Edward W. Lowe, Jr. Nils Woetzel May 17, 2012 Outline Machine Learning Cheminformatics Framework QSPR logp QSAR mglur 5 CYP

More information

(b) How many hydrogen atoms are in the molecular formula of compound A? [Consider the 1 H NMR]

(b) How many hydrogen atoms are in the molecular formula of compound A? [Consider the 1 H NMR] CHEM 6371/4511 Name: The exam consists of interpretation of spectral data for compounds A-C. The analysis of each structure is worth 33.33 points. Compound A (a) How many carbon atoms are in the molecular

More information

Amorphous Blobs of Hope and Other Flights of Fancy. Steve Muchmore Chemaxon UGM April 18, 2011 Budapest

Amorphous Blobs of Hope and Other Flights of Fancy. Steve Muchmore Chemaxon UGM April 18, 2011 Budapest Amorphous Blobs of Hope and Other Flights of Fancy Steve Muchmore Chemaxon UGM April 18, 2011 Budapest Rules of Thumb help folks make objective and informed decisions in the face of incomplete or inaccurate

More information

Properties of Solutions. Review

Properties of Solutions. Review Properties of Solutions Review Matter Pure substance Mixture of substances compound element homogeneous heterogeneous Solution Definitions A solution is a homogeneous mixture of two or more substances.

More information

Data Mining in the Chemical Industry. Overview of presentation

Data Mining in the Chemical Industry. Overview of presentation Data Mining in the Chemical Industry Glenn J. Myatt, Ph.D. Partner, Myatt & Johnson, Inc. glenn.myatt@gmail.com verview of presentation verview of the chemical industry Example of the pharmaceutical industry

More information

Accurate Solubility Prediction with Error Bars for Electrolytes: A Machine Learning Approach

Accurate Solubility Prediction with Error Bars for Electrolytes: A Machine Learning Approach Accurate Solubility Prediction with Error Bars for Electrolytes: A Machine Learning Approach J. Chem. Inf. Model., 47,, 47-44, 7 DRAFT paper, find the original at http://dx.doi.org/1.11/ci65g Anton Schwaighofer

More information

C. Correct! The abbreviation Ar stands for an aromatic ring, sometimes called an aryl ring.

C. Correct! The abbreviation Ar stands for an aromatic ring, sometimes called an aryl ring. Organic Chemistry - Problem Drill 05: Drawing Organic Structures No. 1 of 10 1. What does the abbreviation Ar stand for? (A) Acetyl group (B) Benzyl group (C) Aromatic or Aryl group (D) Benzoyl group (E)

More information

Interpreting Computational Neural Network QSAR Models: A Detailed Interpretation of the Weights and Biases

Interpreting Computational Neural Network QSAR Models: A Detailed Interpretation of the Weights and Biases 241 Chapter 9 Interpreting Computational Neural Network QSAR Models: A Detailed Interpretation of the Weights and Biases 9.1 Introduction As we have seen in the preceding chapters, interpretability plays

More information

Introduction to the Renormalization Group

Introduction to the Renormalization Group Introduction to the Renormalization Group Gregory Petropoulos University of Colorado Boulder March 4, 2015 1 / 17 Summary Flavor of Statistical Physics Universality / Critical Exponents Ising Model Renormalization

More information

Quantum Monte Carlo Study of the Equation of State and Vibrational Frequency Shifts of Solid para-hydrogen

Quantum Monte Carlo Study of the Equation of State and Vibrational Frequency Shifts of Solid para-hydrogen Quantum Monte Carlo Study of the Equation of State and Vibrational Frequency Shifts of Solid para-hydrogen Lecheng Wang, Robert J. Le Roy and Pierre-Nicholas Roy Chemistry Department, University of Waterloo

More information

molecules ISSN

molecules ISSN Molecules 2004, 9, 1004-1009 molecules ISSN 1420-3049 http://www.mdpi.org Performance of Kier-Hall E-state Descriptors in Quantitative Structure Activity Relationship (QSAR) Studies of Multifunctional

More information

Computation of Octanol-Water Partition Coefficients by Guiding an Additive Model with Knowledge

Computation of Octanol-Water Partition Coefficients by Guiding an Additive Model with Knowledge SUPPORTING INFORMATION Computation of Octanol-Water Partition Coefficients by Guiding an Additive Model with Knowledge Tiejun Cheng, Yuan Zhao, Xun Li, Fu Lin, Yong Xu, Xinglong Zhang, Yan Li and Renxiao

More information

A QSPR STUDY OF NORMAL BOILING POINT OF ORGANIC COMPOUNDS (ALIPHATIC ALKANES) USING MOLECULAR DESCRIPTORS. B. Souyei * and M.

A QSPR STUDY OF NORMAL BOILING POINT OF ORGANIC COMPOUNDS (ALIPHATIC ALKANES) USING MOLECULAR DESCRIPTORS. B. Souyei * and M. Journal of Fundamental and Applied Sciences ISSN 1112-987 Available online at http://www.jfas.info A QSPR STUDY OF NORMAL BOILING POINT OF ORGANIC COMPOUNDS (ALIPHATIC ALKANES) USING MOLECULAR DESCRIPTORS

More information

Drug Design 2. Oliver Kohlbacher. Winter 2009/ QSAR Part I: Motivation, Basics, Descriptors

Drug Design 2. Oliver Kohlbacher. Winter 2009/ QSAR Part I: Motivation, Basics, Descriptors Drug Design 2 Oliver Kohlbacher Winter 2009/2010 8. QSAR Part I: Motivation, Basics, Descriptors Abt. Simulation biologischer Systeme WSI/ZBIT, Eberhard Karls Universität Tübingen Overview Attrition rate

More information

Classifier Selection. Nicholas Ver Hoeve Craig Martek Ben Gardner

Classifier Selection. Nicholas Ver Hoeve Craig Martek Ben Gardner Classifier Selection Nicholas Ver Hoeve Craig Martek Ben Gardner Classifier Ensembles Assume we have an ensemble of classifiers with a well-chosen feature set. We want to optimize the competence of this

More information

Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs

Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs William (Bill) Welsh welshwj@umdnj.edu Prospective Funding by DTRA/JSTO-CBD CBIS Conference 1 A State-wide, Regional and National

More information

Simultaneous polymer property modeling using Grid technology for structured products

Simultaneous polymer property modeling using Grid technology for structured products 17 th European Symposium on Computer Aided Process Engineering ESCAPE17 V. Plesu and P.S. Agachi (Editors) 2007 Elsevier B.V. All rights reserved. 1 Simultaneous polymer property modeling using Grid technology

More information

* ChemQuest. Neural Network Studies. 1. Comparison of Overfitting and Overtraining. x(yi - o,)*

* ChemQuest. Neural Network Studies. 1. Comparison of Overfitting and Overtraining. x(yi - o,)* 826 J. Chem. In$ Comput. Sci. 1995, 35, 826-833 Neural Network Studies. 1. Comparison of Overfitting and Overtraining Igor V. Tetko,? David J. Livingstone, @ and Alexander I. Luik*,+ Biomedical Department,

More information

Comparative Analysis of Machine Learning Techniques for the Prediction of LogP

Comparative Analysis of Machine Learning Techniques for the Prediction of LogP Comparative Analysis of Machine Learning Techniques for the Prediction of LogP Edward W. Lowe, Jr., Mariusz Butkiewicz, Matthew Spellings, Albert Omlor, Jens Meiler Abstract Several machine learning techniques

More information

Learning Molecular Fingerprints from the Graph Up

Learning Molecular Fingerprints from the Graph Up Learning Molecular Fingerprints from the Graph Up David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, Ryan P. Adams Motivation Want

More information

Lead Optimization Studies on Novel Quinolones Derivatives as CYP-450 Inhibitor by using In-Silico Modulation

Lead Optimization Studies on Novel Quinolones Derivatives as CYP-450 Inhibitor by using In-Silico Modulation World Journal of Pharmaceutical Sciences SS (Print): 2321-3310; SS (Online): 2321-3086 Available online at: http://www.wjpsonline.org/ Original Article Lead Optimization Studies on ovel Quinolones Derivatives

More information

Development of K-step Yard sampling method and Apply to the ADME-T In Silico Screening. In Silico Data, Ltd.

Development of K-step Yard sampling method and Apply to the ADME-T In Silico Screening. In Silico Data, Ltd. Development of K-step Yard sampling method and Apply to the ADME-T In ilico creening Kohtaro Yuta In ilico Data, Ltd. 1. Toxicity prediction and Pattern recognition (PR) 2. General features of data analysis

More information

ALCOHOLS AND PHENOLS

ALCOHOLS AND PHENOLS ALCOHOLS AND PHENOLS ALCOHOLS AND PHENOLS Alcohols contain an OH group connected to a a saturated C (sp3) They are important solvents and synthesis intermediates Phenols contain an OH group connected to

More information

Chapter 15 Amines Amines 15.2 Naming Amines 15.3 Physical Properties of Amines

Chapter 15 Amines Amines 15.2 Naming Amines 15.3 Physical Properties of Amines Chapter 15 Amines 15.1 Amines 15.2 Naming Amines 15.3 Physical Properties of Amines 1 Amines Amines are organic compounds in which one or more H in ammonia, NH 3, is replaced with alkyl or aromatic groups.

More information

descriptors in QSAR of multi-functional molecules

descriptors in QSAR of multi-functional molecules Performance of Kier-Hall E-states E descriptors in QSAR of multi-functional molecules Darko Butina ChemoMine Consultancy ChemoMine 1 Kier-Hall E-state E descriptors Pharm.Res. 1990, 7, 801-807 JCICS 1991,

More information

An Integrated Approach to in-silico

An Integrated Approach to in-silico An Integrated Approach to in-silico Screening Joseph L. Durant Jr., Douglas. R. Henry, Maurizio Bronzetti, and David. A. Evans MDL Information Systems, Inc. 14600 Catalina St., San Leandro, CA 94577 Goals

More information

Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics

Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics... 1 1.1 Chemoinformatics... 2 1.1.1 Open-Source Tools... 2 1.1.2 Introduction to Programming Languages... 3 1.2 Chemical Structure

More information

Machine Learning Lecture 3

Machine Learning Lecture 3 Announcements Machine Learning Lecture 3 Eam dates We re in the process of fiing the first eam date Probability Density Estimation II 9.0.207 Eercises The first eercise sheet is available on L2P now First

More information

In silico ADME/Tox: Why models fail. Why models work

In silico ADME/Tox: Why models fail. Why models work In silico ADME/Tox: Why models fail: Why models work Terry Richard Stouch, PhD Consulting in Drug Discovery and Design Practice, Technologies, Process Princeton, NJ and Duquesne University June 2010 National

More information

The use of Design of Experiments to develop Efficient Arrays for SAR and Property Exploration

The use of Design of Experiments to develop Efficient Arrays for SAR and Property Exploration The use of Design of Experiments to develop Efficient Arrays for SAR and Property Exploration Chris Luscombe, Computational Chemistry GlaxoSmithKline Summary of Talk Traditional approaches SAR Free-Wilson

More information

Dr. Sander B. Nabuurs. Computational Drug Discovery group Center for Molecular and Biomolecular Informatics Radboud University Medical Centre

Dr. Sander B. Nabuurs. Computational Drug Discovery group Center for Molecular and Biomolecular Informatics Radboud University Medical Centre Dr. Sander B. Nabuurs Computational Drug Discovery group Center for Molecular and Biomolecular Informatics Radboud University Medical Centre The road to new drugs. How to find new hits? High Throughput

More information

Lone Pairs: An Electrostatic Viewpoint

Lone Pairs: An Electrostatic Viewpoint S1 Supporting Information Lone Pairs: An Electrostatic Viewpoint Anmol Kumar, Shridhar R. Gadre, * Neetha Mohan and Cherumuttathu H. Suresh * Department of Chemistry, Indian Institute of Technology Kanpur,

More information

Convolutional Embedding of Attributed Molecular Graphs for Physical Property Prediction

Convolutional Embedding of Attributed Molecular Graphs for Physical Property Prediction Convolutional Embedding of Attributed Molecular Graphs for Physical Property Prediction The MIT Faculty has made this article openly available. Please share how this access benefits you. Your story matters.

More information

AUTOMATIC GENERATION OF TAUTOMERS

AUTOMATIC GENERATION OF TAUTOMERS ПЛОВДИВСКИ УНИВЕРСИТЕТ ПАИСИЙ ХИЛЕНДАРСКИ БЪЛГАРИЯ НАУЧНИ ТРУДОВЕ, ТОМ 38, КН. 5, 2011 ХИМИЯ UNIVERSITY OF PLOVDIV PAISII HILENDARSKI BULGARIA SCIENTIFIC PAPERS, VOL. 38, BOOK 5, 2011 CHEMISTRY AUTOMATIC

More information

Organic Chemistry II (CHE ) Examination I March 1, Name (Print legibly): _KEY _. Student ID#: _

Organic Chemistry II (CHE ) Examination I March 1, Name (Print legibly): _KEY _. Student ID#: _ Organic Chemistry II (CHE 232-002) Examination I March 1, 2007 Name (Print legibly): _KEY _ (last) (first) Student ID#: _ PLEASE observe the following: You are allowed to have scratch paper (provided by

More information

Organic Chemistry II* (CHE ) Examination I March 1, Name (Print legibly): KEY. Student ID#:

Organic Chemistry II* (CHE ) Examination I March 1, Name (Print legibly): KEY. Student ID#: Organic Chemistry II* (CHE 232-002) Examination I March 1, 2007 Name (Print legibly): KEY (last) (first) Student ID#: PLEASE observe the following: You are allowed to have scratch paper (provided by me),

More information

Structural biology and drug design: An overview

Structural biology and drug design: An overview Structural biology and drug design: An overview livier Taboureau Assitant professor Chemoinformatics group-cbs-dtu otab@cbs.dtu.dk Drug discovery Drug and drug design A drug is a key molecule involved

More information

A Practitioner s Guide to Cluster-Robust Inference

A Practitioner s Guide to Cluster-Robust Inference A Practitioner s Guide to Cluster-Robust Inference A. C. Cameron and D. L. Miller presented by Federico Curci March 4, 2015 Cameron Miller Cluster Clinic II March 4, 2015 1 / 20 In the previous episode

More information

Estimating Predictive Uncertainty for Ensemble Regression Models by Gamma Error Analysis

Estimating Predictive Uncertainty for Ensemble Regression Models by Gamma Error Analysis Estimating Predictive Uncertainty for Ensemble Regression Models by Gamma Error Analysis Bob Clark & Marvin Waldman Simulations Plus, Inc. Lancaster CA USA bob@simulations-plus.com EuroQSAR 2018 The Standard

More information

CHEMISTRY. Section II (Total time 95 minutes) Part A Time 55 minutes YOU MAY USE YOUR CALCULATOR FOR PART A.

CHEMISTRY. Section II (Total time 95 minutes) Part A Time 55 minutes YOU MAY USE YOUR CALCULATOR FOR PART A. CHEMISTRY Section II (Total time 95 minutes) Part A Time 55 minutes YOU MAY USE YOUR CALCULATOR FOR PART A. CLEARLY SHOW THE METHOD USED AND THE STEPS INVOLVED IN ARRIVING AT YOUR ANSWERS. It is to your

More information

Open PHACTS Explorer: Compound by Name

Open PHACTS Explorer: Compound by Name Open PHACTS Explorer: Compound by Name This document is a tutorial for obtaining compound information in Open PHACTS Explorer (explorer.openphacts.org). Features: One-click access to integrated compound

More information

Chemistry 1110 Exam 4 Study Guide

Chemistry 1110 Exam 4 Study Guide Chapter 10 Chemistry 1110 Exam 4 Study Guide 10.1 Know that unstable nuclei can undergo radioactive decay. Identify alpha particles, beta particles, and/or gamma rays based on physical properties such

More information

Building innovative drug discovery alliances. Just in KNIME: Successful Process Driven Drug Discovery

Building innovative drug discovery alliances. Just in KNIME: Successful Process Driven Drug Discovery Building innovative drug discovery alliances Just in KIME: Successful Process Driven Drug Discovery Berlin KIME Spring Summit, Feb 2016 Research Informatics @ Evotec Evotec s worldwide operations 2 Pharmaceuticals

More information

(Big) Data analysis using On-line Chemical database and Modelling platform. Dr. Igor V. Tetko

(Big) Data analysis using On-line Chemical database and Modelling platform. Dr. Igor V. Tetko (Big) Data analysis using On-line Chemical database and Modelling platform Dr. Igor V. Tetko Institute of Structural Biology, Helmholtz Zentrum München & BIGCHEM GmbH September 14, 2018, EPFL, Lausanne

More information

The Calculation of Physical Properties of Amino Acids Using Molecular Modeling Techniques (II)

The Calculation of Physical Properties of Amino Acids Using Molecular Modeling Techniques (II) 1046 Bull. Korean Chem. Soc. 2004, Vol. 25, No. 7 Myung-Jae Lee and Ui-Rak Kim The Calculation of Physical Properties of Amino Acids Using Molecular Modeling Techniques (II) Myung-Jae Lee and Ui-Rak Kim

More information

Benzedrine is a pharmaceutical which stimulates the central nervous system in a similar manner to adrenalin. Benzedrine Adrenalin

Benzedrine is a pharmaceutical which stimulates the central nervous system in a similar manner to adrenalin. Benzedrine Adrenalin 1. Adrenalin is a hormone which raises blood pressure, increases the depth of breathing and delays fatigue in muscles, thus allowing people to show great strength under stress. Benzedrine is a pharmaceutical

More information