FROM MOLECULAR FORMULAS TO MARKUSH STRUCTURES

Similar documents
Ákos Tarcsay CHEMAXON SOLUTIONS

Aurora Costache, PhD. CHEMAXON PORTFOLIO WALK THROUGH From toolkits to end-user applications to deliver solutions

Pipeline Pilot Integration

ChemAxon Partner Session: Arxspan Overview

The Case for Use Cases

Pipeline Pilot Integration

Analytical data, the web, and standards for unified laboratory informatics databases

Farewell, PipelinePilot Migrating the Exquiron cheminformatics platform to KNIME and the ChemAxon technology

Cheminformatics Role in Pharmaceutical Industry. Randal Chen Ph.D. Abbott Laboratories Aug. 23, 2004 ACS

Marvin. Sketching, viewing and predicting properties with Marvin - features, tips and tricks. Gyorgy Pirok. Solutions for Cheminformatics

Expanding the scope of literature data with document to structure tools PatentInformatics applications at Aptuit

Integrated Cheminformatics to Guide Drug Discovery

Large Scale Evaluation of Chemical Structure Recognition 4 th Text Mining Symposium in Life Sciences October 10, Dr.

Virtual Libraries and Virtual Screening in Drug Discovery Processes using KNIME

Bentley Map Advancing GIS for the World s Infrastructure

Comprehensive Chemoinformatics since Web-based, client/server, and toolkit approaches. Native Oracle (cartridge) and Microsoft technology.

Rapid Application Development using InforSense Open Workflow and Daylight Technologies Deliver Discovery Value

Information Extraction from Chemical Images. Discovery Knowledge & Informatics April 24 th, Dr. Marc Zimmermann

Introducing a Bioinformatics Similarity Search Solution

Finding the Needle - Reaxys Structure Searching

Data Mining in the Chemical Industry. Overview of presentation

Bentley Map Advancing GIS for the World s Infrastructure

CSD. Unlock value from crystal structure information in the CSD

Large scale classification of chemical reactions from patent data

Chemical Data Retrieval and Management

Demonstration: Searching patents based on chemical structure using SciFinder

DRUG DISCOVERY TODAY ELN ELN. Chemistry. Biology. Known ligands. DBs. Generate chemistry ideas. Check chemical feasibility In-house.

Reaxys Managing Complexity

Reaxys Pipeline Pilot Components Installation and User Guide

Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics

Bentley Geospatial update

A powerful site for all chemists CHOICE CRC Handbook of Chemistry and Physics

ArcGIS Deployment Pattern. Azlina Mahad

Chemically Intelligent Experiment Data Management

October 6 University Faculty of pharmacy Computer Aided Drug Design Unit

ArcGIS. for Server. Understanding our World

OFWIM 2017 Annual Conference What Does Web GIS Really Mean for Fish and Wildlife Agencies?

The MDL Discovery Framework: Data and Application Integration in the Life Sciences

Structure-based approaches to the indexing and retrieval of patent chemistry. Tim Miller Head of Research May 2010

Leveraging the GIS Capability within FlexiCadastre

Are You Maximizing The Value Of All Your Data?

Reaxys The Highlights

The shortest path to chemistry data and literature

Chemical Journal Publishing in an Online World. Jason Wilde, Publisher Physical Sciences Nature Publishing Group ACS Spring Meeting 2009

Valdosta State University Strategic Research & Analysis

DATA SCIENCE SIMPLIFIED USING ARCGIS API FOR PYTHON

Chemists are from Mars, Biologists from Venus. Originally published 7th November 2006

IUCLID Substance Data

Representation of molecular structures. Coutersy of Prof. João Aires-de-Sousa, University of Lisbon, Portugal

The Schrödinger KNIME extensions

CHAPTER 22 GEOGRAPHIC INFORMATION SYSTEMS

Portals: Standards in Action

Chemical Ontologies. Chemical Ontologies. ChemAxon UGM May 23, 2012

GIS-based Smart Campus System using 3D Modeling

Chemoinformatics and information management. Peter Willett, University of Sheffield, UK

Experiences and Directions in National Portals"

Building innovative drug discovery alliances. Just in KNIME: Successful Process Driven Drug Discovery

Innovation. The Push and Pull at ESRI. September Kevin Daugherty Cadastral/Land Records Industry Solutions Manager

These modules are covered with a brief information and practical in ArcGIS Software and open source software also like QGIS, ILWIS.

Handling Human Interpreted Analytical Data. Workflows for Pharmaceutical R&D. Presented by Peter Russell

Regione Umbria. ESRI EMEA User Conference 2010 Rome, October 27th 2010

Application integration: Providing coherent drug discovery solutions

CLRG Biocreative V

Member Level Forecasting Using SAS Enterprise Guide and SAS Forecast Studio

GIS and the Other Enterprise Database

Introduction to ArcGIS Maps for Office. Greg Ponto Scott Ball

The Schrödinger KNIME extensions

Tautomerism in chemical information management systems

Ignasi Belda, PhD CEO. HPC Advisory Council Spain Conference 2015

ChemDataExtractor: A toolkit for automated extraction of chemical information from the scientific literature

Overview. Everywhere. Over everything.

One platform for desktop, web and mobile

You are Building Your Organization s Geographic Knowledge

ESRI Survey Summit August Clint Brown Director of ESRI Software Products

Introduction to Chemoinformatics

Reaxys Medicinal Chemistry Fact Sheet

EEOS 381 -Spatial Databases and GIS Applications

Capturing Chemistry. What you see is what you get In the world of mechanism and chemical transformations

The Schrödinger KNIME extensions

Washington Master Address Services: Project Overview Ben Vaught, OCIO David Wright, DOR Craig Erickson, DOH Tom Kimpel, OFM

Marvin 5.4 A new generation of structure indexing at Elsevier. Dr. Michael Maier, Dr. Heike Nau, Elsevier

Geog 469 GIS Workshop. Managing Enterprise GIS Geodatabases

Portal for ArcGIS: An Introduction. Catherine Hynes and Derek Law

Development of a Web-Based GIS Management System for Agricultural Authorities in Iraq

Geodatabase An Introduction

Solved and Unsolved Problems in Chemoinformatics

The Electronic Representation of Chemical Structures: beyond the low hanging fruit

Хемоінформатика. Докінг. Дизайн ліків. Біоінформатика (3 курс) Лекція 4 (частина 1)

Integrative Metabolic Fate Prediction and Retrosynthetic Analysis the Semantic Way

Spatial Data Management of Bio Regional Assessments Phase 1 for Coal Seam Gas Challenges and Opportunities

Merck Virtual Library (MVL): Deployment, Application, and Future Enhancement

The CUAHSI Hydrologic Information System

European Lead Factory Web Portal Successful operation 3 years on

Leveraging Web GIS: An Introduction to the ArcGIS portal

The Changing Requirements for Informatics Systems During the Growth of a Collaborative Drug Discovery Service Company. Sally Rose BioFocus plc

Esri Overview for Mentor Protégé Program:

Geospatial Information for Disease Prevention and Control. INSPIRE Conference 2013

Patent Searching using Bayesian Statistics

2010 Oracle Corporation 1

Searching Substances in Reaxys

Transcription:

FROM MOLECULAR FORMULAS TO MARKUSH STRUCTURES DIFFERENT LEVELS OF KNOWLEDGE REPRESENTATION IN CHEMISTRY Michael Braden, PhD ACS / San Diego/ 2016

Overview ChemAxon Who are we? Examples/use cases: Create and Report: Marvin Live Store and Search: Biomolecule Toolkit Analyze and Report: Markush and ChemCurator The Big Picture M. Braden / ACS / San Diego / 2016

CHEMAXON Solution-Oriented Cheminformatics

General Chemistry Polymer Chemistry Petrochemistry Agrochemistry Flavors and Fragrances Pharmaceutical and Biotech Industry Drug Discovery Small molecules Biological entities Create, search, store, analyze, and share structural and related data Analytical Chemistry M. Braden / ACS / San Diego / 2016

Vision Reporting Data Integration ChemAxon Platform Collaboration Analysis Integrated Entity and data agnostic Collaborative Fast Accurate Ease of use Contextual M. Braden / ACS / San Diego / 2016

Levels of Chemical Information Representation Constitutional 0/1D bulk properties Topological 2D structures/graphs and fingerprints Geometric 3D structures Energetic response to probing 3D structures M. Braden / ACS / San Diego / 2016

Create Store and Search Analyze Report and Share Design Connect Mining Analysis Marvin Plexus Suite Compound Registration Biomolecule Toolkit JChem for Office Instant JChem Discovery Tools Naming JChem for Sharepoint Marvin Live Marvin JS JChem Base Markush Technology ChemCurator JChem Oracle Cartridge JChem Postgres Cartridge Compliance Checker Oracle PostgreSQL M. Braden / ACS / San Diego / 2016 ELN LIMS Data Marts and Warehouses KNIME

CREATE AND REPORT Marvin Live Molecule Design and Real Time Collaboration

Use Case 1 - Industry Company A - bringing in consultants every week Company B - multiple US East Coast sites, frequent travelling between Company C - global enterprise, facilities on 3 continents Lack of co-location Knowledge silos Duplicate effort M. Braden / ACS / San Diego / 2016

M. Braden / ACS / San Diego / 2016 Data access plugins

M. Braden / ACS / San Diego / 2016 Mining available data

M. Braden / ACS / San Diego / 2016 Explore property space

Use Case 2 - Academia University professor s office hours Tutoring, working on publications M. Braden / ACS / San Diego / 2016

everybody Severity

M. Braden / ACS / San Diego / 2016 Reporting

Reporting - Plugin automatic capture (who, what, when) save important structures for presentations to desktop tools (SD file, SMILES) to ELN, compound registration M. Braden / ACS / San Diego / 2016

STORE AND SEARCH Biomolecule Toolkit Draw, Store, and Search Biomolecules

What is the informatics problem? Cheminformatics Sequence based tools MYRMQLLSCIALSLALVTNSAPTSSSTKKT QLQLEHLLLDLQMILNGINNYKNPKLTRML TFKFYMPKKATELKHLQCLEEELKPLEEVL C[C@@H]1[C@@H]2C[C@]([C@@H]( /C=C/C=C(Cc3cc(c(c(c3)OC)Cl)N (C(=O)C[C@@H]([C@]4([C@H]1O4)C )OC(=O)[C@H](C)N(C)C(=O)CCS)C) \C)OC)(NC(=O)O2)O Representation? Tools? Chemical structure-based tools Bioinformatics M. Braden / ACS / San Diego / 2016

Unmet needs Store the description of a diverse set of entities relevant for R&D in a single data management environment access stored entities with tools common in cheminformatics and bioinformatics interconvert between domain-specific file format standards pass data between collaborators in different domains M. Braden / ACS / San Diego / 2016

Solution: Biomolecule Toolkit Complete, consistent, unambiguous, machine-readable descriptions for any molecular and non-molecular asset in a research and development setting Biomolecules described as sets of attribute value pairs Molecular entities described as combination of chemical building blocks Rules for connecting the building blocks are defined at the atomic level for providing full chemical detail on the resulting Biologic Ambiguity is supported at various levels Ability to complement description with metadata M. Braden / ACS / San Diego / 2016

Use case: Search Sequence-based search Chemical substructure search Keyword search Query Substructure: AND ProjectID = PID2 M. Braden / ACS / San Diego / 2016

Search example: lysine derivatives, 6-Aha, SMCC linker M. Braden / ACS / San Diego / 2016

Search example: lysine derivatives, 6-Aha, SMCC linker M. Braden / ACS / San Diego / 2016

Search example: lysine derivatives, 6-Aha, SMCC linker + ProjectID= PID2 M. Braden / ACS / San Diego / 2016

Architecture Registration Client Plexus Connect JChem for Office Web Service Layer JChem Base / Biomolecule Toolkit API Data Access Layer Instant JChem Registration DB M. Braden / ACS / San Diego / 2016

ANALYZE AND REPORT Markush and ChemCurator Analyze Chemistry in Papers and Patents

Markush Representation R-groups Atom lists Bond lists Position variations Link nodes Repeating units Homology groups UGM / San Diego / 2015

Markush technologies Search Enumeration Hit visualization Non-hit visualization Overlap Composer UGM / San Diego / 2015

Non-hit visualization Differentiating structure parts visualization in target Markush and Query structure. Markush Editor ChemCurator UGM / San Diego / 2015

Markush Overlap Overlapping chemical space calculation API Percentage of overlap Overlapping Markush UGM / San Diego / 2015

Markush Composer API Markush Editor UGM / San Diego / 2015

Computer-assisted chemical data extraction English, Chinese and Japanese N2S Structure Checker Markush Editor Search and representation UGM / San Diego / 2015

General document curation Files (XML, PDF, HTML) Google Patents IFI CLAIMS Images (CLiDE & OSRA) UGM / San Diego / 2015

Compounds extraction view Project explorer Compound list Annotated document Selected structures UGM / San Diego / 2015

Markush extraction view Annotated document Markush editor Structure checker Selected structures Example structures UGM / San Diego / 2015

THE BIG PICTURE

The Big Picture What should we be doing now? How should we be doing it? Where are things going in the future? M. Braden / ACS / San Diego / 2016

THANK YOU Michael Braden, PhD ACS, San Diego, CA March 2016