The Fragment Network: A Chemistry Recommendation Engine Built Using a Graph Database
Supporting Information

Richard J. Hall, Christopher W. Murray and Marcel L. Verdonk

Contents

This supporting information contains a detailed description of the algorithm used to generate the nodes and edges of the Fragment Network. The document also contains details on inserting nodes and edges into a Neo4j graph database, and some example network queries. The additional supporting information contains data that can be inserted into a Neo4j graph database to create an illustrative Fragment Network based on the 4-hydroxy-biphenyl example presented in Figure 2 of the manuscript.

Generating Nodes and Edges for the Fragment Network

To generate a set of nodes and edges for a molecule, we first generate a node that represents the molecule itself. The molecule is represented as a nonisomeric SMILES string. As well as the SMILES string, the heavy atom count (HAC) and ring atom count (RAC) are stored as node attributes, as is a simplified representation of the molecular graph. This simplified graph representation uses the Daylight function dt_molgraph to set all bond orders to one, remove aromaticity, set the hydrogen count, remove charges and set masses to zero. Additionally, we set the element type of every ring atom to carbon. For 4-hydroxy-biphenyl, this first node has the following data:

    {SMILES: "Oc1ccc(cc1)c2ccccc2", HAC: 13, RAC: 12, RINGSMILES: "OC1CCC(CC1)C2CCCCC2"}    (1)

Next, a set of molecular components is generated. The SMARTS pattern "[*;R]-;!@[*]" is used to find single acyclic bonds to a ring atom. For each matching bond b, the start and end atoms are determined. The bond b is deleted and isotopically labelled xenon atoms are bonded to the start and end atoms. The isotopic label is incremented for each matching bond, to provide unique labelling. The components for 4-hydroxy-biphenyl are

    ['O[100Xe]', '[101Xe]c1ccccc1', '[100Xe]c1ccc([101Xe])cc1']    (2)

The bond between the hydroxyl and the phenyl ring is broken, and the oxygen and carbon atoms are attached to xenon atoms with the isotopic label 100. The bond between the phenyl rings is broken, and the carbon atoms are attached to xenon atoms with the isotopic label 101.

Next, this set of molecular components is combined to generate child nodes. Each child node is created by leaving out one of the components. For example, leaving out the O[100Xe] component from (2), we create a child node by considering the two phenyl ring components. These components are combined by locating pairs of matching xenon atom labels.
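
The leave-one-out step and the search for matching xenon labels can be illustrated with a minimal Python sketch. It works on the component strings from (2) using a regular expression only; it does not rebuild molecules, which would require a cheminformatics toolkit. The function names `leave_one_out` and `classify_labels` are hypothetical and are not taken from the authors' implementation.

```python
import re
from collections import Counter

# Matches an isotopically labelled xenon atom such as [100Xe] and captures the label.
XENON_LABEL = re.compile(r"\[(\d+)Xe\]")


def leave_one_out(components):
    """Yield (excluded, remaining) with each component left out in turn."""
    for i, excluded in enumerate(components):
        yield excluded, components[:i] + components[i + 1:]


def classify_labels(remaining):
    """Return (paired, unmatched) xenon labels found in the remaining components.

    A label seen twice marks a bond to re-form; a label seen once marks the
    attachment point of the excluded component.
    """
    counts = Counter(
        int(label) for smiles in remaining for label in XENON_LABEL.findall(smiles)
    )
    paired = sorted(label for label, n in counts.items() if n == 2)
    unmatched = sorted(label for label, n in counts.items() if n == 1)
    return paired, unmatched


components = ["O[100Xe]", "[101Xe]c1ccccc1", "[100Xe]c1ccc([101Xe])cc1"]
for excluded, remaining in leave_one_out(components):
    paired, unmatched = classify_labels(remaining)
    print(f"exclude {excluded}: paired={paired} unmatched={unmatched}")
```

For the hydroxyl exclusion this reports label 101 as paired and label 100 as unmatched, which corresponds to the single xenon atom marking where the hydroxyl was attached.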

The two phenyl ring components contain one pair of 101Xe labels, as well as an unmatched 100Xe label. A bond is added between the atoms attached to the paired xenon atoms. The bonds to the paired xenon atoms and the xenon atoms themselves are then deleted. The rebuilt molecule will contain the unmatched xenon atoms that indicate the attachment point(s) of the excluded component. In the example there is a single xenon atom that shows the location of the excluded hydroxyl component. If the excluded component is a ring or linker, the rebuilt molecule will contain multiple xenon atoms and will be disconnected. We generate a SMILES string for the child node by replacing the remaining xenon atoms with hydrogen. For the components in (2), the combinations are given in Table S1.

Exclude                   Leaving                                          Rebuilt As              Child Node
[100Xe]c1ccc([101Xe])cc1  ['O[100Xe]', '[101Xe]c1ccccc1']                  O[Xe].[Xe]c1ccccc1      O.c1ccccc1
[101Xe]c1ccccc1           ['O[100Xe]', '[100Xe]c1ccc([101Xe])cc1']         Oc1ccc([Xe])cc1         Oc1ccccc1
O[100Xe]                  ['[101Xe]c1ccccc1', '[100Xe]c1ccc([101Xe])cc1']  [Xe]c1ccc(cc1)c2ccccc2  c1ccc(cc1)c2ccccc2

Table S1. The set of child nodes created from components of 4-hydroxy-biphenyl

An edge is created to join each child node to the parent. Each edge is labelled with a set of attributes relating to the parent and child nodes. For example, the edge that links 4-hydroxy-biphenyl to biphenyl has the following attributes:

- the type of the excluded component: 'FG' (we used to refer to substituents as functional groups)
- the type of the rebuilt combination: 'RING' (unused)
- the nonisomeric SMILES of the excluded component: O[Xe]
- the nonisomeric SMILES of the rebuilt molecule: [Xe]c1ccc(cc1)c2ccccc2
- the simplified graph of the excluded component: O[Xe]
- the simplified graph of the rebuilt molecule: [Xe]C1CCC(CC1)C2CCCCC2

As with the node attribute, the simplified graph representation for the edge uses the Daylight function dt_molgraph to set all bond orders to one, remove aromaticity, set the hydrogen count, remove charges and set masses to zero. Additionally, we set every ring atom element to carbon. This means that our simplified graph will be the same for all rings of the same size, irrespective of the aromaticity or the heteroatomic composition of the original component.

We can also define equivalencies between rings of different types using our simplified graph representation. This allows us to treat, for example, a 1,4-substituted 6-membered ring as equivalent to a 1,3-substituted 5-membered ring. These equivalencies are derived from a set of common ring scaffolds in the Astex registry and ChEMBL. When we search the graph, if the excluded component is marked as type RING, we can use our ring equivalence dictionary to match rings of different sizes. For compounds with more than two substituents, the order of the substituents in the ring equivalence dictionary is relevant. For this reason, we add canonicalized isotopic labels to the simplified graph. The canonicalization uses the lexicographical ordering of the SMILES pattern for the excluded components.

Figure S1 illustrates this. All compounds have common substituents: methyl, fluoro and chloro. Removing the ring system from compounds a-d means that all compounds will have an edge that joins to the node C.Cl.F. Compounds a and b share the same simplified graph for the excluded ring component and can be grouped as 'replace ring, equivalent vectors'. Compound c has a different arrangement of substituents and will have a different simplified graph. This means compound c will be placed in the 'replace ring, different vectors' group.
Compound d has a different ring system but the same arrangement of substituents. We can therefore add an equivalency rule between the simplified graph of ring d and the simplified graph of rings a and b that will allow compound d to also be classified as 'replace ring, equivalent vectors'.

Figure S1 (panels a-d). Related compounds with equivalent and non-equivalent simplified graphs

Once the edge data has been recorded, each new node is itself used as input for the algorithm to recursively create additional nodes and edges. Note that the algorithm keeps track of the set of nodes that have been generated. If a node is already in the set, the algorithm returns immediately without recomputing child nodes and edges (since these will have been generated in a prior iteration). This provides a significant reduction in the amount of work required to build nodes and edges for a large database.

We apply additional rules to deal with the special case of ring-ring bonds and spiro systems. For each ring-ring bond in the compound, we add xenon to the bond atoms and delete the bond. For 4-hydroxy-biphenyl, we generate

    Oc1ccc([Xe])cc1.[Xe]c1ccccc1

The xenon atoms are replaced with hydrogen to give the child node

    Oc1ccccc1.c1ccccc1

An edge between this child node and the parent is added with the following attributes:

- the type of the excluded component: 'FG'
- the type of the rebuilt combination: 'RING' (unused)
- the nonisomeric SMILES of the excluded component: [Xe] (indicates a zero length linker)
- the nonisomeric SMILES of the rebuilt molecule: Oc1ccccc1.c1ccccc1
- the simplified graph of the excluded component: [Xe]
- the simplified graph of the rebuilt molecule: OC1CCCCC1.C1CCCCC1

The child node is then passed into the node and edge generating algorithm. A similar approach is used to break spiro ring systems. One can envisage how the algorithm might be extended to deconstruct fused ring systems, although we have not chosen to do so in this implementation. Another suggested enhancement is to treat cyclopropyl rings as both a ring and a substituent. By modifying the SMARTS pattern that finds ring bonds, we could exclude cuts at a cyclopropyl ring.

An attribute label is added to the Oc1ccc(cc1)c2ccccc2 node to indicate that this compound has an identifier in the EM (eMolecules) database. Additional attributes can be added to this node, for example a label to mark this node as a ChEMBL record:

    ATTR Oc1ccc(cc1)c2ccccc2 EM
    ATTR Oc1ccc(cc1)c2ccccc2 CHEMBL

A complete set of nodes, edges and attributes for 4-hydroxy-biphenyl is listed below.

    NODE Oc1ccc(cc1)c2ccccc2 13 12 OC1CCC(CC1)C2CCCCC2 0
    NODE O.c1ccccc1 7 6 O.C1CCCCC1 1
    NODE O 1 0 O 2
    EDGE O.c1ccccc1 O RING|c1ccccc1|C1CCCCC1|FG|O|O
    NODE c1ccccc1 6 6 C1CCCCC1 2
    EDGE O.c1ccccc1 c1ccccc1 FG|O|O|RING|c1ccccc1|C1CCCCC1
    EDGE Oc1ccc(cc1)c2ccccc2 O.c1ccccc1 RING|[Xe]c1ccc([Xe])cc1|[100Xe]C1CCC([101Xe])CC1|RING|O[Xe].[Xe]c1ccccc1|O[100Xe].[Xe]C1CCCCC1
    NODE Oc1ccccc1 7 6 OC1CCCCC1 1
    EDGE Oc1ccccc1 O RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1|FG|O[Xe]|O[Xe]
    EDGE Oc1ccccc1 c1ccccc1 FG|O[Xe]|O[Xe]|RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1
    EDGE Oc1ccc(cc1)c2ccccc2 Oc1ccccc1 RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1|RING|Oc1ccc([Xe])cc1|OC1CCC([Xe])CC1
    NODE c1ccc(cc1)c2ccccc2 12 12 C1CCC(CC1)C2CCCCC2 1
    EDGE c1ccc(cc1)c2ccccc2 c1ccccc1 FG|[Xe]|[Xe]|RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1
    EDGE c1ccc(cc1)c2ccccc2 c1ccccc1 FG|[Xe]|[Xe]|RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1
    NODE c1ccccc1.c1ccccc1 12 12 C1CCCCC1.C1CCCCC1 2
    EDGE c1ccccc1.c1ccccc1 c1ccccc1 FG|[Xe]|[Xe]|RING|c1ccccc1|C1CCCCC1
    EDGE c1ccccc1.c1ccccc1 c1ccccc1 FG|[Xe]|[Xe]|RING|c1ccccc1|C1CCCCC1
    EDGE c1ccc(cc1)c2ccccc2 c1ccccc1.c1ccccc1 FG|[Xe]|[Xe]|RING|c1ccccc1.c1ccccc1|C1CCCCC1.C1CCCCC1
    EDGE Oc1ccc(cc1)c2ccccc2 c1ccc(cc1)c2ccccc2 FG|O[Xe]|O[Xe]|RING|[Xe]c1ccc(cc1)c2ccccc2|[100Xe]C1CCC(CC1)C2CCCCC2
    NODE Oc1ccccc1.c1ccccc1 13 12 OC1CCCCC1.C1CCCCC1 1
    EDGE Oc1ccccc1.c1ccccc1 Oc1ccccc1 RING|c1ccccc1|C1CCCCC1|RING|Oc1ccccc1|OC1CCCCC1
    EDGE Oc1ccccc1.c1ccccc1 O.c1ccccc1 RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1|RING|O[Xe].c1ccccc1|O[100Xe].C1CCCCC1
    EDGE Oc1ccccc1.c1ccccc1 c1ccccc1.c1ccccc1 FG|O[Xe]|O[Xe]|RING|[Xe]c1ccccc1.c1ccccc1|[100Xe]C1CCCCC1.C1CCCCC1
    EDGE Oc1ccc(cc1)c2ccccc2 Oc1ccccc1.c1ccccc1 FG|[Xe]|[Xe]|RING|Oc1ccccc1.c1ccccc1|OC1CCCCC1.C1CCCCC1
    ATTR Oc1ccc(cc1)c2ccccc2 EM

Loading Data into Neo4j

A set of nodes, edges and attributes for the compounds that are listed as results in Figure 2 can be found in the supporting information files jm7b00809_si_002.txt (nodes), jm7b00809_si_003.txt (edges) and jm7b00809_si_004.txt (attributes). These data files can be loaded into a Neo4j graph database (our implementation uses Neo4j version 2.1.6). Neo4j has a SQL-like query language called Cypher. The following Cypher commands can be used to load the nodes, edges and attributes into the graph database.

    USING PERIODIC COMMIT
    LOAD CSV FROM 'file:///jm7b00809_si_002.txt' AS line FIELDTERMINATOR ' '
    MERGE (:F2 {smiles: line[1], hac: toInt(line[2]), chac: toInt(line[3]), osmiles: line[4]});

    USING PERIODIC COMMIT
    LOAD CSV FROM 'file:///jm7b00809_si_003.txt' AS line FIELDTERMINATOR ' '
    MATCH (n1:F2 {smiles: line[1]}), (n2:F2 {smiles: line[2]})
    MERGE (n1)-[:f2edge {label: line[3]}]->(n2);

    USING PERIODIC COMMIT
    LOAD CSV FROM 'file:///jm7b00809_si_004.txt' AS line FIELDTERMINATOR ' '
    MATCH (n:F2 {smiles: line[1]})
    SET n:mol, n:em, n.em = toInt(line[3]);

Querying Neo4j

The Cypher query used to find and categorise 'medium' paths of length one from 4-hydroxy-biphenyl to a commercially available compound is

    MATCH p = (n:F2 {smiles: 'Oc1ccc(cc1)c2ccccc2'})-[nm]-(m:em)
    WHERE abs(n.hac - m.hac) <= 3 AND abs(n.chac - m.chac) <= 1
    RETURN split(nm.label, '|')[4], split(nm.label, '|')[1], nm.label, m.hac - n.hac, m.smiles, m.em
    ORDER BY split(nm.label, '|')[4];

This query is somewhat complicated by the decision to store edge metadata as a single attribute. The metadata is pipe separated and needs to be split to construct the query. Anyone wishing to implement a similar network might choose to store each piece of edge metadata as a separate attribute.
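
As an illustration of splitting the edge metadata on the client side, the following Python sketch unpacks a pipe-separated label into named fields. The field order (type, SMILES and simplified graph of the excluded component, followed by the same three values for the rebuilt molecule) is an assumption inferred from the edge listing above, and the names `EdgeLabel` and `parse_edge_label` are hypothetical.

```python
from typing import NamedTuple


class EdgeLabel(NamedTuple):
    """Named view of one edge label; field order is an assumption."""
    excluded_type: str    # 'FG' or 'RING'
    excluded_smiles: str  # nonisomeric SMILES of the excluded component
    excluded_graph: str   # simplified graph of the excluded component
    rebuilt_type: str     # type of the rebuilt combination
    rebuilt_smiles: str   # nonisomeric SMILES of the rebuilt molecule
    rebuilt_graph: str    # simplified graph of the rebuilt molecule


def parse_edge_label(label: str) -> EdgeLabel:
    """Split a pipe-separated edge label into its six fields."""
    fields = label.split("|")
    if len(fields) != 6:
        raise ValueError(f"expected 6 pipe-separated fields, got {len(fields)}")
    return EdgeLabel(*fields)


edge = parse_edge_label(
    "FG|O[Xe]|O[Xe]|RING|[Xe]c1ccc(cc1)c2ccccc2|[100Xe]C1CCC(CC1)C2CCCCC2"
)
print(edge.excluded_smiles, edge.rebuilt_smiles)
```

With this layout, index 4 of the split label (used for grouping in the query) is the rebuilt SMILES, and index 1 is the excluded substituent.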

Our current implementation splits and groups the paths using Python code after the paths are returned from Neo4j; the Cypher query is for information only. By grouping the matches on the change in atom count and selected attributes of the edge label, we can classify the results as deletions or additions at a specific position. To sort these groups, we use the values from a lookup dictionary that is keyed on the substituent attribute of the edge label. Any substituent that is not in the dictionary is given a weight of zero. Any ties are broken using the heavy atom count and the lexicographical sort order of the SMILES string.

A similar Cypher query can be used to find and categorise medium paths of length two between 4-hydroxy-biphenyl and a commercially available compound:

    MATCH (sta:F2 {smiles: "Oc1ccc(cc1)c2ccccc2"})-[n4:f2edge]-(n3:F2)-[n2:f2edge]-(end:em)
    WHERE abs(sta.hac - end.hac) <= 3 AND abs(sta.chac - end.chac) <= 1 AND sta.smiles <> end.smiles
    RETURN split(n4.label, '|')[4], split(n4.label, '|')[2], split(n2.label, '|')[2], split(n2.label, '|')[1], end.em, end.smiles
    ORDER BY split(n4.label, '|')[4], split(n2.label, '|')[2];

The first column in the output can be used to group similar transformations. The second and third columns can be used to compare equivalencies between simplified graphs. The fourth column lists the replacement. The fifth and sixth columns are the identifier and the SMILES string of the related compound. Again, the query is for illustration purposes; our implementation splits the edge metadata after the query results are returned.
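
The client-side grouping and sorting described above might be sketched as follows. This is only an illustration under stated assumptions: the weights in `SUBSTITUENT_WEIGHT` are invented, higher weights are assumed to sort first, the rows are made-up examples rather than query results, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical weights; the actual lookup dictionary is not given in the text.
SUBSTITUENT_WEIGHT = {"F[Xe]": 2, "C[Xe]": 1}


def classify(delta_hac):
    """Label a change in heavy atom count as an addition, deletion or replacement."""
    if delta_hac > 0:
        return "addition"
    if delta_hac < 0:
        return "deletion"
    return "replacement"


def group_and_sort(rows):
    """Group rows of (edge_label, delta_hac, smiles, hac) and sort each group.

    Groups are keyed on the kind of change and on the rebuilt SMILES (index 4
    of the pipe-separated label), which marks the position of the change.
    """
    groups = defaultdict(list)
    for label, delta_hac, smiles, hac in rows:
        fields = label.split("|")
        substituent, position = fields[1], fields[4]
        groups[(classify(delta_hac), position)].append((substituent, hac, smiles))
    for members in groups.values():
        # Unknown substituents get weight zero; ties fall back to HAC, then SMILES.
        members.sort(key=lambda m: (-SUBSTITUENT_WEIGHT.get(m[0], 0), m[1], m[2]))
    return dict(groups)


# Made-up rows for illustration only.
rows = [
    ("FG|Cl[Xe]|Cl[Xe]|RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1", 1, "Clc1ccccc1", 7),
    ("FG|F[Xe]|F[Xe]|RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1", 1, "Fc1ccccc1", 7),
    ("FG|C[Xe]|C[Xe]|RING|[Xe]c1ccccc1|[100Xe]C1CCCCC1", 1, "Cc1ccccc1", 7),
]
for key, members in group_and_sort(rows).items():
    print(key, [smiles for _, _, smiles in members])
```

Storing the sort key as a tuple keeps the three tie-breaking rules in one place, so a different weighting scheme only requires changing the dictionary.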
More information2. Atomic Structure and Periodic Table Details of the three Sub-atomic (fundamental) Particles
2. Atomic Structure and Periodic Table Details of the three Sub-atomic (fundamental) Particles Particle Position Relative Mass Relative Charge Proton Nucleus 1 +1 Neutron Nucleus 1 Electron Orbitals 1/184-1
More informationA powerful site for all chemists CHOICE CRC Handbook of Chemistry and Physics
Chemical Databases Online A powerful site for all chemists CHOICE CRC Handbook of Chemistry and Physics Combined Chemical Dictionary Dictionary of Natural Products Dictionary of Organic Dictionary of Drugs
More informationCHEMDRAW ULTRA ITEC107 - Introduction to Computing for Pharmacy. ITEC107 - Introduction to Computing for Pharmacy 1
CHEMDRAW ULTRA 12.0 ITEC107 - Introduction to Computing for Pharmacy 1 Objectives Basic drawing skills with ChemDraw Bonds, captions, hotkeys, chains, arrows Checking and cleaning up structures Chemical
More informationExercises for Windows
Exercises for Windows CAChe User Interface for Windows Select tool Application window Document window (workspace) Style bar Tool palette Select entire molecule Select Similar Group Select Atom tool Rotate
More informationImago: open-source toolkit for 2D chemical structure image recognition
Imago: open-source toolkit for 2D chemical structure image recognition Viktor Smolov *, Fedor Zentsev and Mikhail Rybalkin GGA Software Services LLC Abstract Different chemical databases contain molecule
More informationENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October
ENS Lyon Camp. Day 2. Basic group. Cartesian Tree. 26 October Contents 1 Cartesian Tree. Definition. 1 2 Cartesian Tree. Construction 1 3 Cartesian Tree. Operations. 2 3.1 Split............................................
More informationNames. Chiral: A chiral object is not superimposable upon its mirror image. A chiral object contains the property of "handedness.
CEM 241 IN-CLASS #3 MOLECULAR MODELS EXERCISE Names Stereoisomerism Construct a model containing a tetrahedral carbon (black ball) that is attached to four different atoms (use the green, orange, purple
More informationantidisestablishmenttarianism an-ti-dis-es-tab-lish-ment-ta-ri-an-ism
What do you do when you encounter a very long, difficult word? 1 antidisestablishmenttarianism break it up into syllables: an-ti-dis-es-tab-lish-ment-ta-ri-an-ism meaning: antidisestablishmenttarianism
More informationChapter 15 Molecular Luminescence Spectrometry
Chapter 15 Molecular Luminescence Spectrometry Two types of Luminescence methods are: 1) Photoluminescence, Light is directed onto a sample, where it is absorbed and imparts excess energy into the material
More informationAlkanes. Introduction
Introduction Alkanes Recall that alkanes are aliphatic hydrocarbons having C C and C H bonds. They can be categorized as acyclic or cyclic. Acyclic alkanes have the molecular formula C n H 2n+2 (where
More informationMarvin. Sketching, viewing and predicting properties with Marvin - features, tips and tricks. Gyorgy Pirok. Solutions for Cheminformatics
Marvin Sketching, viewing and predicting properties with Marvin - features, tips and tricks Gyorgy Pirok Solutions for Cheminformatics The Marvin family The Marvin toolkit provides web-enabled components
More informationMarch 08 Dr. Abdullah Saleh
March 08 Dr. Abdullah Saleh 1 Effects of Substituents on Reactivity and Orientation The nature of groups already on an aromatic ring affect both the reactivity and orientation of future substitution Activating
More information6.1.1 Aromatic Compounds
6.1.1 Aromatic ompounds There are two major classes of organic chemicals aliphatic : straight or branched chain organic substances aromatic or arene: includes one or more ring of six carbon ams with delocalised
More informationOpen PHACTS Explorer: Compound by Name
Open PHACTS Explorer: Compound by Name This document is a tutorial for obtaining compound information in Open PHACTS Explorer (explorer.openphacts.org). Features: One-click access to integrated compound
More informationPolymerization Modeling
www.optience.com Polymerization Modeling Objective: Modeling Condensation Polymerization by using Functional Groups In this example, we develop a kinetic model for condensation polymerization that tracks
More informationAlkanes and Cycloalkanes
Chapter 3 Alkanes and Cycloalkanes Two types Saturated hydrocarbons Unsaturated hydrocarbons 3.1 Alkanes Also referred as aliphatic hydrocarbons General formula: CnH2n+2 (straight chain) and CnH2n (cyclic)
More informationIdentification of functional groups in the unknown Will take in lab today
Qualitative Analysis of Unknown Compounds 1. Infrared Spectroscopy Identification of functional groups in the unknown Will take in lab today 2. Elemental Analysis Determination of the Empirical Formula
More informationAlkanes and Cycloalkanes
Alkanes and Cycloalkanes Alkanes molecules consisting of carbons and hydrogens in the following ratio: C n H 2n+2 Therefore, an alkane having 4 carbons would have 2(4) + 2 hydrogens, which equals 10 hydrogens.
More informationData Structures and Algorithms " Search Trees!!
Data Structures and Algorithms " Search Trees!! Outline" Binary Search Trees! AVL Trees! (2,4) Trees! 2 Binary Search Trees! "! < 6 2 > 1 4 = 8 9 Ordered Dictionaries" Keys are assumed to come from a total
More information12.1 The Nature of Organic molecules
12.1 The Nature of Organic molecules Organic chemistry: : The chemistry of carbon compounds. Carbon is tetravalent; it always form four bonds. Prentice Hall 2003 Chapter One 2 Organic molecules have covalent
More informationAtomic weight = Number of protons + neutrons
1 BIOLOGY Elements and Compounds Element is a substance that cannot be broken down to other substances by chemical reactions. Essential elements are chemical elements required for an organism to survive,
More informationThe now-banned diet drug fen-phen is a mixture of two synthetic substituted benzene: fenfluramine and phentermine.
The now-banned diet drug fen-phen is a mixture of two synthetic substituted benzene: fenfluramine and phentermine. Chemists have synthesized compounds with structures similar to adrenaline, producing amphetamine.
More information4.1.1 Organic: Basic Concepts
.. rganic: Basic oncepts ydrocarbon is a compound consisting of hydrogen and carbon only Basic definitions to know Saturated: ontain single carbon-carbon bonds only Unsaturated : ontains a = double bond
More information4. Constraints and Hydrogen Atoms
4. Constraints and ydrogen Atoms 4.1 Constraints versus restraints In crystal structure refinement, there is an important distinction between a constraint and a restraint. A constraint is an exact mathematical
More informationTautomerism in chemical information management systems
Tautomerism in chemical information management systems Dr. Wendy A. Warr http://www.warr.com Tautomerism in chemical information management systems Author: Wendy A. Warr DOI: 10.1007/s10822-010-9338-4
More informationOrganic Chemistry. FAMILIES of ORGANIC COMPOUNDS
1 SCH4U September 2017 Organic Chemistry Is the chemistry of compounds that contain carbon (except: CO, CO 2, HCN, CO 3 2- ) Carbon is covalently bonded to another carbon, hydrogen and possibly to oxygen,
More informationRelations. We have seen several types of abstract, mathematical objects, including propositions, predicates, sets, and ordered pairs and tuples.
Relations We have seen several types of abstract, mathematical objects, including propositions, predicates, sets, and ordered pairs and tuples. Relations use ordered tuples to represent relationships among
More information