Command-line tools of ChemAxon: tips and tricks

György Pirok
Solutions for Cheminformatics

Command-line interface

A command-line interface (CLI) is a mechanism for interacting with a computer operating system or software by typing commands to perform specific tasks. (Wikipedia)

Command line tools

msketch, mview, molconvert, convert2image, cxcalc, evaluate, jcsearch, standardize, react

msketch

The msketch tool starts the MarvinSketch application and opens the selected molecule of the file given as the input parameter.

    msketch caffeine.mol

mview

The mview tool starts the MarvinView application and opens the file given as the input parameter.

    mview caffeine.mol
    mview dsstox.sdf

The layout of molecules in mview can be customized with command-line parameters. The example below opens an SDfile in a matrix view.

    mview dsstox.sdf --gridbag

It is possible to display only a part of a large file.

    mview nci.smiles -s 15000 -n 400

The number of displayed columns and rows can be set as parameters as well.

    mview nci.smiles -c 10 -r 10

These settings are particularly useful when molecules are piped into mview for display, as shown later.

molconvert

The molconvert script converts a structure file to another format.

    molconvert mrv caffeine.mol -o caffeine.mrv

Merge the molecules of the Molfiles in the current directory into a single SMILES file. The structures are aromatized and explicit hydrogens are removed.

    molconvert smiles:a-h *.mol -o output.smiles

Merge the SMILES files in the current directory into a single SDfile. The structures are dearomatized, and 2D atom coordinates are calculated.

    molconvert -2 sdf:-a *.smiles -o output.sdf

molconvert, convert2image

Generate a JPEG image in 100x150 resolution with a yellow background. Many options are available to customize the generated molecule image.

    molconvert jpeg caffeine.mol -o caffeine.jpg

Batch image generation is possible with the convert2image script, which creates a series of numbered images. The script can be downloaded from the ChemAxon forum.

    convert2image "jpeg:w300,h300,mono" molecules.sdf
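
Where convert2image is not at hand, a plain shell loop around molconvert can serve a similar purpose for a directory of single-molecule files. The sketch below is only an illustration: the function name render_images and the DRY_RUN switch are hypothetical, and it assumes that molconvert accepts the same image option string shown for convert2image above.

```shell
# Sketch: render every Molfile in a directory to a JPEG with the same base name.
# Assumptions: molconvert is on PATH and accepts an option string such as
# "jpeg:w300,h300". Set DRY_RUN=1 to print the commands instead of running them.
render_images() {
    dir=$1
    fmt=${2:-jpeg:w300,h300}
    for f in "$dir"/*.mol; do
        [ -e "$f" ] || continue      # skip the literal pattern when nothing matches
        out="${f%.mol}.jpg"          # caffeine.mol -> caffeine.jpg
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "molconvert $fmt $f -o $out"
        else
            molconvert "$fmt" "$f" -o "$out"
        fi
    done
}
```

Running it first with DRY_RUN=1 shows which files would be written before any image is generated.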

cxcalc

The cxcalc command-line application provides access to plugin functions. The first example below shows general help and lists all available calculations; the second displays calculation-specific help.

    cxcalc -h
    cxcalc logd -h

Calculated values for a molecule file are usually printed as an indexed table, but the index and table headers can be turned off.

    cxcalc pka in.mrv
    cxcalc -N ih logp in.sdf

Enumerate random members of a Markush library.

    cxcalc -N ih randommarkushenumerations -m 50 markush.mrv

Calculate the lowest energy conformer of each molecule of a large file in batch mode and display the results in MarvinView during the calculation.

    cxcalc lowestenergyconformer in.smiles | mview --gridbag -

Determine the IUPAC names of the molecules and store them as SDfile fields.

    cxcalc -S -t NAME -o out.sdf name in.smiles
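
Because cxcalc prints its results as a table, the output can be post-processed with standard text tools. The following sketch filters such a table by a maximum value. It assumes the table is tab-separated with a header on the first line and the calculated value in the second column; the function name filter_max is hypothetical.

```shell
# Sketch: keep the header and every row whose second column is <= a threshold.
# Assumption: input is a tab-separated table with a header on line 1,
# as expected from a command like `cxcalc logp in.sdf`.
filter_max() {
    awk -F '\t' -v max="$1" '
        NR == 1 { print; next }          # pass the header through unchanged
        $2 + 0 <= max + 0 { print }      # numeric comparison on the value column
    '
}

# Usage (requires cxcalc on PATH):
#   cxcalc logp in.sdf | filter_max 5
```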

evaluate

The evaluate script provides a command-line interface for complex calculations via the Chemical Terms language of ChemAxon. More than a hundred functions can be combined to examine chemical compounds.

Determine the number of non-heteroaromatic rings.

    evaluate -e 'ringcount() - heteroaromaticringcount()' in.mrv

Calculate an indicator for scaffold hopping.

    evaluate -e 'dissimilarity("chemicalfingerprint", actives) - dissimilarity("pharmacophorefingerprint", actives) > 0.5' in.mrv

Calculate Lipinski's oral drug-likeness flag for each molecule.

    evaluate -e '(mass() <= 500) and (logp() <= 5) and (donorcount() <= 5) and (acceptorcount() <= 10)' in.mrv

jcsearch

The jcsearch program is a versatile command-line interface for structure search functions, and it works with both files and databases. The query is specified with the -q option.

    jcsearch -q "c1ccccc1Cl" target.sdf

In addition to substructure search, the full, full fragment, duplicate, similarity, and superstructure search types are supported.

Find chlorobenzenes in a file with duplicate search.

    jcsearch -t:d -q "c1ccccc1Cl" target.smiles

Search for a molecule specified as a query file and its tautomers in a database table.

    jcsearch -t:p -q input.mol DB:reagents | mview -
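
Long Chemical Terms expressions such as the Lipinski filter in the evaluate examples are easier to reuse when the thresholds are parameters. The sketch below only assembles the expression string; the function name lipinski_expr and its argument order are hypothetical, and the Chemical Terms function names are taken unchanged from the example above.

```shell
# Sketch: build the Lipinski expression with adjustable thresholds.
# Arguments (all optional): max mass, max logP, max donors, max acceptors.
lipinski_expr() {
    mass=${1:-500}
    logp=${2:-5}
    donors=${3:-5}
    acceptors=${4:-10}
    printf '(mass() <= %s) and (logp() <= %s) and (donorcount() <= %s) and (acceptorcount() <= %s)\n' \
        "$mass" "$logp" "$donors" "$acceptors"
}

# Usage (requires evaluate on PATH):
#   evaluate -e "$(lipinski_expr)" in.mrv
#   evaluate -e "$(lipinski_expr 450 4)" in.mrv
```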

Find molecules similar to the one given as query and display the results in MarvinView.

    jcsearch -t:i:0.3 -q "c1ccccc1Cl" nci.smiles | mview -

Perform reaction search or similarity search with reaction queries.

    jcsearch -q decarboxylation.rxn reactions.rdf

Count molecules containing nitro groups.

    jcsearch -t:c -q 'O=N[O-]' nci.smiles

Retrieve acetylenes containing more than two amino groups.

    jcsearch -q "[CX2:1]#[CX2:1]" -e 'matchcount(amine) > 2' nci.smiles

Find molecules containing a carboxyl group with a given pKa value.

    jcsearch -e "pka('acidic',hm(1)) > 4" -q "[H][O:1]C=[O:2]" nci25k.smiles

standardize

Standardizer converts molecules by a list of actions and is usually used as a molecule canonicalization engine integrated with databases. However, the standardize command-line tool provides an easy-to-use batch conversion interface for files.

Merge aromatized molecules of multiple files into a single SMILES file.

    standardize *.sdf *.mrv -c 'aromatize' -o output.smiles

Remove small components (counterions) and neutralize the remaining molecules. Multiple actions are listed with two periods as the separator.

    standardize in.mrv -c 'keepone..neutralize' -f mrv -o out.mrv

Convert all nitro groups to the ionic form.

    standardize in.sdf -c '[O:3]=[N:1]=[O:2]>>[O-:3][N+:1]=[O:2]'
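
When the list of standardize actions is assembled in a script, the two-period separator described above can be added by a small helper. This is a minimal sketch; the function name join_actions is hypothetical and the action names are the ones used in the examples.

```shell
# Sketch: join action names with ".." to form a standardize configuration string.
join_actions() {
    result=
    for action in "$@"; do
        # Append ".." only when something has already been collected.
        result="${result:+$result..}$action"
    done
    printf '%s\n' "$result"
}

# Usage (requires standardize on PATH):
#   standardize in.mrv -c "$(join_actions keepone neutralize)" -f mrv -o out.mrv
```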

Map reactions by an MCS-based mapping algorithm.

    standardize in.smiles -c 'mapreaction' -o out.smiles

Convert alias atoms to abbreviated groups by the alias labels and ungroup them afterwards.

    standardize in.sdf -c 'aliastogroup..sgroups:expand' -f sdf -o out.sdf

Canonicalize molecules according to a list of actions specified in an XML configuration file.

    standardize in.mrv -c config.xml -f sdf -o out.sdf

Retrieve core ring systems using two transforms and pipe the results to MarvinView for display.

    standardize in.smiles -c '[*R0:1]>>..[*:1]!@[*:2]>>[*:2].[*:1]' | mview -

react

The react program generates reaction products from reactants using virtual reactions. The reaction is specified with the -r option.

    $ react -r '[H:2][C:1]=[O:3]>>[H:2][C:1][O:3][H:4]' "O=Cc1ccccc1"
    OCc1ccccc1

IUPAC and traditional names are handled by ChemAxon tools as a native format, which is convenient for chemists.

    $ react -r esterification.smarts '2-hydroxybenzoic acid' 'acetic acid'
    aspirin

Combinatorial libraries can be produced by multimolecular reactions using the reactants from files.

    react -m comb -r acyl.rxn alcohols.sdf acidhalides.smiles -o esters.smiles

Summary

Command-line interfaces are high-performance applications and are available for all ChemAxon products. Only a few could be covered here. They are easy-to-use tools for those who work with structure files, and they are most useful in batch mode.

Find out more

Product descriptions & links: www.chemaxon.com/products.html
Forum: www.chemaxon.com/forum
Presentations and posters: www.chemaxon.com/conf
Download: www.chemaxon.com/download.html