Command-line tools of ChemAxon: tips and tricks

György Pirok
Solutions for Cheminformatics

Command-line interface

A command-line interface (CLI) is a mechanism for interacting with a computer operating system or software by typing commands to perform specific tasks. (Wikipedia)

Command line tools

msketch, mview, molconvert, convert2image, cxcalc, evaluate, jcsearch, standardize, react

msketch

The msketch tool starts the MarvinSketch application and opens a selected molecule of the file specified as the input parameter.

    msketch caffeine.mol

mview

The mview tool starts the MarvinView application and opens the file specified as the input parameter.

    mview caffeine.mol
    mview dsstox.sdf

The layout of molecules in mview can be customized with command-line parameters. The example below opens an SDfile in a matrix view.

    mview dsstox.sdf --gridbag

It is possible to display only part of a large file.

    mview nci.smiles -s 15000 -n 400

The number of displayed columns and rows can be set as parameters as well.

    mview nci.smiles -c 10 -r 10

These settings are particularly useful when molecules are piped into mview for display, as shown later.

molconvert

The molconvert script converts a structure file to another format.

    molconvert mrv caffeine.mol -o caffeine.mrv

Merge the molecules of the Molfiles in the current directory into a single SMILES file. The structures are aromatized and explicit hydrogens are removed.

    molconvert smiles:a-h *.mol -o output.smiles

Merge the SMILES files in the current directory into a single SDfile. The structures are dearomatized, and 2D atom coordinates are calculated.

    molconvert -2 sdf:-a *.smiles -o output.sdf

molconvert, convert2image

Generate a JPEG image at 100x150 resolution with a yellow background. Many options are available to customize the generated molecule image.

    molconvert jpeg caffeine.mol -o caffeine.jpg

Batch image generation is possible with the convert2image script, which creates a series of numbered images. The script can be downloaded from the ChemAxon forum.

    convert2image "jpeg:w300,h300,mono" molecules.sdf
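When molconvert is driven from a script rather than typed into a shell, the wildcard in examples such as `*.mol` is no longer expanded automatically. The following is a minimal Python sketch of one way to handle that. The helper names `molconvert_cmd` and `merge_files` are hypothetical, and the sketch assumes the output option is spelled `-o` as in the react example of this document.

```python
import glob
import shutil
import subprocess


def molconvert_cmd(fmt, inputs, output):
    """Build the argument list for one molconvert call (nothing is executed)."""
    # Assumption: the output file is given with "-o", as elsewhere in this document.
    return ["molconvert", fmt, *inputs, "-o", output]


def merge_files(fmt, pattern, output):
    """Expand a glob pattern in Python and merge the matching files into one."""
    # subprocess does not start a shell, so "*.mol" has to be expanded here.
    inputs = sorted(glob.glob(pattern))
    cmd = molconvert_cmd(fmt, inputs, output)
    # Run only when there is input and molconvert is actually on the PATH.
    if inputs and shutil.which("molconvert"):
        subprocess.run(cmd, check=True)
    return cmd
```

Calling `merge_files("smiles:a-h", "*.mol", "output.smiles")` mirrors the merge example above; the function returns the argument list so it can be inspected even on a machine without the ChemAxon tools.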

cxcalc

The cxcalc command-line application provides access to plugin functions. The first example below shows general help and lists all available calculations; the second displays calculation-specific help.

    cxcalc -h
    cxcalc logd -h

Calculated values for a molecule file are usually printed as an indexed table, but the index and table headers can be turned off.

    cxcalc pka in.mrv
    cxcalc -N ih logp in.sdf

Enumerate random members of a Markush library.

    cxcalc -N ih randommarkushenumerations -m 50 markush.mrv

Calculate the lowest energy conformer of each molecule of a large file in batch mode and display the results in MarvinView during the calculation.

    cxcalc lowestenergyconformer in.smiles | mview --gridbag -

Determine the IUPAC names of the molecules and store them as SDfile fields.

    cxcalc -S -t NAME -o out.sdf name in.smiles
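The indexed table printed by cxcalc is convenient to post-process in a script. The sketch below shows one possible way to read such a table in Python. It assumes the output is tab-separated with a header row (an `id` column followed by one column per calculation); the function name `parse_cxcalc_table` is hypothetical, and the sample text is a placeholder in the assumed layout, not real calculation output.

```python
import csv
import io


def parse_cxcalc_table(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


# Placeholder text in the assumed layout; the numbers are not real results.
sample = "id\tlogP\n1\t0.0\n2\t1.5\n"
rows = parse_cxcalc_table(sample)

# Values arrive as strings, so convert them before doing arithmetic.
values = [float(row["logP"]) for row in rows]
```

If the index and header are switched off with `-N ih`, there is no header row to key on, and a plain `csv.reader` over the same text would be the closer fit.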

evaluate

The evaluate script provides a command-line interface for complex calculations via the Chemical Terms language of ChemAxon. More than a hundred functions can be combined to examine chemical compounds.

Determine the number of non-heteroaromatic rings.

    evaluate -e 'ringcount() - heteroaromaticringcount()' in.mrv

Calculate an indicator for scaffold hopping.

    evaluate -e 'dissimilarity("chemicalfingerprint", actives) - dissimilarity("pharmacophorefingerprint", actives) > 0.5' in.mrv

Calculate Lipinski's oral drug-likeness flag for each molecule.

    evaluate -e '(mass() <= 500) and (logp() <= 5) and (donorcount() <= 5) and (acceptorcount() <= 10)' in.mrv

jcsearch

The jcsearch program is a versatile command-line interface for structure search functions; it works with both files and databases. The query is specified with the -q option.

    jcsearch -q "c1ccccc1Cl" target.sdf

In addition to substructure search, the full, full fragment, duplicate, similarity, and superstructure search types are supported as well.

Find chlorobenzenes in a file with duplicate search.

    jcsearch -t:d -q "c1ccccc1Cl" target.smiles

Search a database table for a molecule specified as a query file and for its tautomers.

    jcsearch -t:p -q input.mol DB:reagents | mview -
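Long Chemical Terms expressions such as the Lipinski flag in the evaluate section are easy to mistype. One option is to assemble the expression from its parts in a script, as in this minimal Python sketch. The helper names and their parameters are hypothetical; the function names inside the expression are copied from the example in this document, and the `-e` option spelling is assumed from the jcsearch example that uses `-e`.

```python
def lipinski_expression(max_mass=500, max_logp=5, max_donors=5, max_acceptors=10):
    """Join the four Lipinski conditions into one Chemical Terms expression."""
    rules = [
        f"(mass() <= {max_mass})",
        f"(logp() <= {max_logp})",
        f"(donorcount() <= {max_donors})",
        f"(acceptorcount() <= {max_acceptors})",
    ]
    return " and ".join(rules)


def evaluate_cmd(expression, infile):
    """Build the argument list for one evaluate call (nothing is executed)."""
    # Assumption: the expression is passed with "-e".
    return ["evaluate", "-e", expression, infile]
```

Passing the result as a list element to `subprocess.run` also avoids shell quoting problems with the parentheses and comparison operators.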

Find molecules similar to the one given as the query and display the results in MarvinView.

    jcsearch -t:i:0.3 -q "c1ccccc1Cl" nci.smiles | mview -

Perform reaction search or similarity search with reaction queries.

    jcsearch -q decarboxylation.rxn reactions.rdf

Count molecules containing nitro groups.

    jcsearch -t:c -q 'O=N[O-]' nci.smiles

Retrieve acetylenes containing more than two amino groups.

    jcsearch -q "[CX2:1]#[CX2:1]" -e 'matchcount(amine) > 2' nci.smiles

Find molecules containing a carboxyl group with a given pKa value.

    jcsearch -e "pka('acidic',hm(1)) > 4" -q "[H][O:1]C=[O:2]" nci25k.smiles

standardize

Standardizer converts molecules by a list of actions and is usually used as a molecule canonicalization engine integrated with databases. However, the standardize command-line tool also provides an easy-to-use batch conversion interface for files.

Merge aromatized molecules of multiple files into a single SMILES file.

    standardize *.sdf *.mrv -c 'aromatize' -o output.smiles

Remove small components (counterions) and neutralize the remaining molecules. Multiple actions are separated by two periods.

    standardize in.mrv -c 'keepone..neutralize' -f mrv -o out.mrv

Convert all nitro groups to the ionic form.

    standardize in.sdf -c '[O:3]=[N:1]=[O:2]>>[O-:3][N+:1]=[O:2]'
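Since standardize actions are chained with two periods, an action list kept in a script can be joined into the configuration string mechanically. The following Python sketch illustrates this under stated assumptions: `standardize_cmd` is a hypothetical helper, and the `-f` and `-o` option spellings are assumed from the examples in this document rather than checked against a particular release.

```python
def standardize_cmd(inputs, actions, output=None, fmt=None):
    """Build the argument list for one standardize call (nothing is executed)."""
    # Actions are chained with two periods, e.g. "keepone..neutralize".
    cmd = ["standardize", *inputs, "-c", "..".join(actions)]
    if fmt:
        cmd += ["-f", fmt]      # assumed output format option
    if output:
        cmd += ["-o", output]   # assumed output file option
    return cmd


# Mirrors the counterion removal example above.
cmd = standardize_cmd(["in.mrv"], ["keepone", "neutralize"], output="out.mrv", fmt="mrv")
```

A transform written as a reaction, such as the nitro group conversion, can be passed as a single element of `actions` in the same way.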

Map reactions with an MCS-based mapping algorithm.

    standardize in.smiles -c 'mapreaction' -o out.smiles

Convert alias atoms to abbreviated groups by their alias labels and then ungroup them.

    standardize in.sdf -c 'aliastogroup..sgroups:expand' -f sdf -o out.sdf

Canonicalize molecules according to a list of actions specified in an XML configuration file.

    standardize in.mrv -c config.xml -f sdf -o out.sdf

Retrieve core ring systems using two transforms and pipe the results to MarvinView for display.

    standardize in.smiles -c '[*R0:1]>>..[*:1]!@[*:2]>>[*:2].[*:1]' | mview -

react

The react program generates reaction products from reactants using virtual reactions. The reaction is specified with the -r option.

    $ react -r '[H:2][C:1]=[O:3]>>[H:2][C:1][O:3][H:4]' "O=Cc1ccccc1"
    OCc1ccccc1

IUPAC and traditional names are handled by ChemAxon tools as a native format, which makes the tools convenient for chemists.

    $ react -r esterification.smarts '2-hydroxybenzoic acid' 'acetic acid'
    aspirin

Combinatorial libraries can be produced by multimolecular reactions using reactants from files.

    react -m comb -r acyl.rxn alcohols.sdf acidhalides.smiles -o esters.smiles
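To get a feel for how large a combinatorial library becomes, the reactant pairing can be sketched in a few lines of Python. This is an illustration only: it assumes that combinatorial mode pairs every reactant in the first file with every reactant in the second, it performs no chemistry, and the reactant names are hypothetical stand-ins for the contents of the input files.

```python
from itertools import product


def reactant_combinations(*reactant_lists):
    """Return every combination that takes one reactant from each list."""
    return list(product(*reactant_lists))


# Hypothetical stand-ins for the contents of the two reactant files.
alcohols = ["methanol", "ethanol", "propan-2-ol"]
acid_halides = ["acetyl chloride", "benzoyl chloride"]

pairs = reactant_combinations(alcohols, acid_halides)
# Three alcohols and two acid halides give 3 * 2 = 6 reactant pairs.
```

The number of pairs is the product of the list lengths, so two files of a thousand reactants each already imply a million candidate reactions.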

Summary

Command-line interfaces are high-performance applications and are available for all ChemAxon products. Only a few could be mentioned here. They serve as easy-to-use tools for those who work with structure files, and they are most useful in batch mode.

Find out more

Product descriptions & links: www.chemaxon.com/products.html
Forum: www.chemaxon.com/forum
Presentations and posters: www.chemaxon.com/conf
Download: www.chemaxon.com/download.html