DERWENT MARKUSH RESOURCE NOW AVAILABLE ON STN. BRIAN LARNER IP & SCIENCE 10 Dec 2015


AGENDA
Introduction to DWPI chemical structure indexing
Coverage of Markush Indexing
  Indexing of organics
  Inorganics & organometallics
  Coverage of Polymers
Substance descriptors
Roles

WHAT IS CHEMICAL INDEXING?
Markush structural indexing for generic structures
  Created for all Markush structures that meet the criteria for being indexed
  Also created to cover generic disclosures described only in words (e.g. a cleaning solution containing a 2-8C alcohol)
DCR indexing for specific compounds
  Created for any specific compounds mentioned in the patent
  Some compounds may be covered in a Markush structure if system limits are exceeded
Fragmentation coding is auto-generated from the above

DWPI STRUCTURE DATABASES ON STN
DWPIM : > 1.9 M structures
DCR : > 2.5 M structures
DWPI : > 3.2 M patents
The SUBX and REFX cross-file operators connect the structure files (DWPIM, DCR) with DWPI.
Each structure has a unique Markush Compound Number (MCN) or DCR number (DCR) which is used as the basis of the cross-file search.

DERWENT MARKUSH RESOURCE ON STN
Approximately 1.9 million structures from around 780,000 patents
Covers 33 patent issuing authorities (as basic patent country)
Can be searched in conjunction with DCR, MARPAT and CAS REGISTRY on STN
  In most cases using the same structure query
  Gives the most comprehensive chemical structure search possible

DWPI COUNTRY COVERAGE FOR CHEMICAL INDEXING
[Figure not reproduced]

CHEMICAL STRUCTURE INDEXING IN DWPI - CRITERIA
To receive structural indexing a DWPI record must meet the following criteria:
  Classified in Sections B, C and / or E
  From a major country*

WHAT IS INDEXED
Compounds claimed to be new
Compounds produced by a new process
Compounds having a new use
Components of compositions
Novel catalysts and known specific catalysts
Specific reagents and starting materials in production processes (DCR only)
Materials detected, detecting agents, detection media
Materials recovered or purified in new ways
Materials removed and removing agents

TYPES OF STRUCTURES INDEXED
Non-polymeric organic molecules
Organometallic compounds
Inorganic structures
  Simple inorganic molecules
  Extended structures such as clays, zeolites and heteropolyacids
Partially defined structures
Polymeric structures
  Only for pharmaceutical and agrochemical patents
  Includes peptides as well as synthetic polymers

CHEMICAL INDEXING GUIDELINES
Markush structures are indexed from:
  the patent claims
  the embodiment if a 'wider disclosure' is indicated
Specific compounds are indexed from:
  the claims
  the main (best) example
  further examples at the analyst's discretion
  if the patent claims do not contain specific compounds, the analyst selects representative compounds from the examples and embodiment

MARKUSH COVERAGE IN OLDER PATENTS
Prior to the introduction of DCR in 1999/2000 the policy was different
Both specific and generic structures were covered by Markush structures, often as part of the same structure
Some commonly occurring compounds were indexed using Derwent Compound Numbers
  It was the analyst's choice whether to use these or combine them into a Markush
  These have now been converted to DCR records and can be found by a DCR search, but are still included in the Derwent Markush Resource

ORGANIC MOLECULES IN THE DERWENT MARKUSH RESOURCE
Generally speaking they are indexed as shown in the patent
Counter ions are sometimes ignored (but not in the example below)
[Figure: the structure as drawn in the patent alongside the Derwent Markush Resource version]

WHY THE CORE STRUCTURE CAN DIFFER FROM THE ONE DRAWN IN THE PATENT
Indexing conventions
  Keto-enol tautomerism (keto form is the preferred one)
  Amidine normalisation (amidine/guanidine groups have normalised bonds, not single and double bonds)
Use of DWPI Markush terminology and shortcuts
  Use of Superatom terms (CHK, ARY etc.) & shortcuts (CO2, SO3 etc.)
Allowing for variable attachments
  Replace all the parts of the structure where the attachment can be made by a variable group
Allowing for exceptions mentioned in the patent
  For example where at least one of R1 & R2 is not H
Allowing for system limits
  Means sometimes one structure is split into 2 or more

SUPERATOMS AND THEIR MEANING (ORGANIC)
Superatom : Definition : STN query node
CHK : Fully saturated alkyl chain : Ak
CHE : Carbon chain containing at least one double bond (no triple bonds) : Ak
CHY : Carbon chain containing at least one triple bond (optionally with double bonds) : Ak
CYC : Non-aromatic carbocyclic ring : Cb
ARY : Carbocyclic ring system containing at least one benzene ring or quinoid variant : Cb
HEA : 5 membered ring with 2 double bonds or 6 membered ring with 3 double bonds : Hy
HET : Any mononuclear heterocyclic ring other than HEA : Hy
HEF : Fused heterocyclic ring system : Hy
See also: DWPIM Reference Manual, Table 3, Page 18.

SUPERATOMS AND THEIR MEANING (INORGANIC OR NON-SPECIFIC)
Superatom : Definition : STN query node
HAL : Halogen excluding At : X
AMX : Alkali(ne earth) metal : M
A35 : Group 3 to 5 metal : M
TRM : Transition metal : M
LAN : Lanthanide (excluding Lanthanum) : M
ACT : Actinide or other trans-uranic metal : M
MX : Unspecified metal : M
XX : Unspecified group but not hydrogen, mostly used for unspecified substituent groups
UNK : Unspecified group (no longer used but may be present in some older structures)
See also: DWPIM Reference Manual, Tables 4-6, Pages 19-20.

SUPERATOMS USED FOR DISPLAY ONLY
Superatom : Definition
ACY : Acyl group (derived from any organic acid, not just carboxylic)
DYE : Undefined dye chromophore
PRT : Protecting group
PEG : Polymer end group
POL : Polymer group
Please note: Derwent Superatoms will become directly searchable in a subsequent release of the Derwent Markush Resource on New STN
See also: DWPIM Reference Manual, Table 6, Page 20.

ATTRIBUTES
Attributes can be applied to Superatoms to restrict the scope of the group they describe.
For carbon chain Superatoms (CHK, CHE, CHY) we have the following
  Describing chain length: LOW (1-6C), MID (7-10C) & HI (>10C)
  Describing chain structure: STR (Straight) & BRA (Branched)
For ring Superatoms we have the following
  Type of ring system: MON (Monocyclic) & FU (Fused)
  Degree of saturation: SAT (Saturated) & UNS (Unsaturated)
  MON & FU are not applied to HEA
  SAT & UNS are not applied to HEA and ARY
See also: DWPIM Reference Manual, Table 18, Page 62.
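As a rough illustration of the chain-length attributes listed above, the following Python sketch maps a carbon count to LOW, MID or HI. The function name is hypothetical and is not part of STN or DWPIM; only the ranges (1-6C, 7-10C, >10C) come from the slide.

```python
def chain_length_attribute(carbon_count: int) -> str:
    """Return the chain-length attribute for a carbon chain Superatom.

    Ranges follow the slide: LOW = 1-6 C, MID = 7-10 C, HI = more than 10 C.
    This helper is illustrative only; it is not an STN or DWPIM function.
    """
    if carbon_count < 1:
        raise ValueError("a carbon chain needs at least one carbon atom")
    if carbon_count <= 6:
        return "LOW"
    if carbon_count <= 10:
        return "MID"
    return "HI"


if __name__ == "__main__":
    # n-propyl (3 C), n-octyl (8 C) and n-dodecyl (12 C) as sample chains
    for name, carbons in [("n-propyl", 3), ("n-octyl", 8), ("n-dodecyl", 12)]:
        print(name, chain_length_attribute(carbons))
```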

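EXAMPLE OF STRUCTURE THAT MUST BE SPLIT
[Figure not reproduced]

WOULD BECOME 2 STRUCTURES
Markush 1 carries the groups G1, G2 and G3 on a core containing O
Markush 2 carries the groups G1, G2 and G3 on a core containing S
G1 is the group R, G2 is the group Q and G3 represents NR1R2 (which can be a ring)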

SUBSTANCE DESCRIPTORS (FILE SEGMENTS IN MMS TERMINOLOGY)
These are assigned to all Markush structures
You can use them to filter your results
There are three types of substance descriptor
  Technology related: define the technology area the structure relates to
  Structure related: define the type of structure the Markush describes
  Miscellaneous: identifies a Markush which contains structures that are components of a composition
At least one technology related Substance Descriptor and at least one structure related Substance Descriptor are applied to each Markush

SUBSTANCE DESCRIPTORS RELATING TO STRUCTURE
C : Structure is a co-ordination complex (includes metallocenes)
F : Any polymer not covered by P or N
L : Oligomer
M : Alloy (only indexed in Sections B and C)
N : Natural Polymer (other than polypeptide)
P : Polypeptide (3 amino acids or above)
V : Ordinary organic compound (non-polymeric)
W : Extended structure (e.g. zeolite)
Z : Salt with at least one organic component
7 : Inorganic compound
See also: DWPIM Reference Manual, Table 17, Page 56.

OTHER SUBSTANCE DESCRIPTORS
Relating to technology area
A : Patent is classified in Section A (polymers); it must also be classified in at least one section chosen from B, C or E
B : Patent is classified in Section B and / or C
E : Patent is classified in Section E (Chemistry)
Miscellaneous
Y : Substance indexed is a component of a mixture
  It is possible for one Markush to cover two different components of the same composition
See also: DWPIM Reference Manual, Table 17, Page 56.
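The rule that every Markush carries at least one technology related and at least one structure related descriptor can be sketched as a small check. This is a minimal Python illustration that assumes descriptors arrive as a string of single-character codes; the function name and that input format are hypothetical, not an STN feature.

```python
# Descriptor codes as listed on the slides.
STRUCTURE_CODES = set("CFLMNPVWZ7")   # structure related
TECHNOLOGY_CODES = set("ABE")         # technology related
MISC_CODES = {"Y"}                    # miscellaneous


def has_required_descriptors(codes: str) -> bool:
    """Check the 'at least one technology and one structure descriptor' rule.

    `codes` is a string of single-character descriptor codes, e.g. "BVZ".
    Raises ValueError for a code that is not on the slides.
    """
    found = set(codes)
    unknown = found - STRUCTURE_CODES - TECHNOLOGY_CODES - MISC_CODES
    if unknown:
        raise ValueError(f"unknown descriptor code(s): {sorted(unknown)}")
    return bool(found & STRUCTURE_CODES) and bool(found & TECHNOLOGY_CODES)


if __name__ == "__main__":
    print(has_required_descriptors("BVZ"))  # technology B plus structure V and Z
    print(has_required_descriptors("V"))    # no technology descriptor
```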

SUBSTANCE DESCRIPTOR COMBINATIONS
VZ : Compound and its salts
VC : Compound and its complexes
CZ : Used when it is unclear if a compound is a salt or a complex
VW : Fullerenes and other nanostructures
VP : For tripeptides
VL : For oligomers

INORGANIC STRUCTURES
Salts are drawn as discrete ions with charges added whenever they are shown or can be easily deduced
More complex structures are indexed by listing each element present as a separate entity with zero valency
Compounds formed entirely of non-metallic elements are mostly shown with covalent bonds in much the same way as for organics

EXAMPLE: SIMPLE INORGANICS
Markush of alkali metal hypochlorites, especially sodium or potassium

METALLOCENES
Are indexed with the cyclopentadienyl or other π bonded ligands shown disconnected from the metal atom
The valency on the metal is reduced by 1 for each bond to a cyclopentadienyl ring
For example Ti in titanocene dichloride is shown as 2 valent (a +2 charge would be placed on the Ti atom)
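The valency convention above amounts to a simple subtraction. The following Python sketch works through the titanocene dichloride case, taking the formal valency of titanium there as 4; the function name is hypothetical and the sketch only restates the slide's rule.

```python
def indexed_metal_valency(formal_valency: int, cp_bonds: int) -> int:
    """Valency shown on the metal once pi-bonded ligands are disconnected.

    The slide's rule: subtract 1 for each bond to a cyclopentadienyl ring.
    Illustrative helper only; not part of any STN or DWPI tool.
    """
    if cp_bonds < 0 or cp_bonds > formal_valency:
        raise ValueError("cp_bonds must be between 0 and the formal valency")
    return formal_valency - cp_bonds


if __name__ == "__main__":
    # Titanocene dichloride: Ti(IV) with two cyclopentadienyl rings.
    print(indexed_metal_valency(formal_valency=4, cp_bonds=2))
```

With these inputs the result is 2, which matches the 2-valent Ti (+2 charge) described on the slide.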

PHTHALOCYANINES
These are drawn fully normalized
The central metal atom used to be bonded to all 4 N atoms but now (since 2000) it is disconnected

POLYMER OR OLIGOMER
Substance : Substance descriptors : BC definition : E definition
Oligopeptide : VP : 3 amino acids : 3 amino acids
Polypeptide : P : >=4 amino acids : >=4 amino acids
Oligosaccharide : L : 3-6 sugar units : 3-9 sugar units
Polysaccharide : N : >=7 sugar units : >=10 sugar units*
Other oligomer : L : 3-8 repeat units : 3-9 repeat units
Other polymer : F : >=9 repeat units : >=10 repeat units*
BC definition refers to the definition used when indexing pharmaceutical and agrochemical patents (Sections B and / or C)
E definition refers to the definition used when indexing general chemistry patents (Section E)
If a patent is classified in Section E as well as Section B and / or Section C the BC definitions are used
* Not indexed unless part of a dye molecule
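The thresholds in the table above, together with the rule that the BC definitions win when a patent is in Section E as well as B and / or C, can be sketched as a lookup. This is a minimal Python illustration; the function, the "peptide" / "saccharide" / "other" labels and the table layout are assumptions made for the example, and it ignores the footnote that Section E polymers are only indexed when part of a dye molecule.

```python
# (minimum units for oligomer, minimum units for polymer, oligomer code, polymer code)
THRESHOLDS = {
    "BC": {
        "peptide": (3, 4, "VP", "P"),
        "saccharide": (3, 7, "L", "N"),
        "other": (3, 9, "L", "F"),
    },
    "E": {
        "peptide": (3, 4, "VP", "P"),
        "saccharide": (3, 10, "L", "N"),
        "other": (3, 10, "L", "F"),
    },
}


def polymer_descriptor(kind, units, sections):
    """Return the substance descriptor for a repeat-unit count, or None.

    `sections` is the set of DWPI sections of the patent, e.g. {"B", "E"}.
    BC definitions apply whenever Section B or C is present.
    """
    sections = set(sections)
    if not sections & {"B", "C", "E"}:
        raise ValueError("patent must be classified in Section B, C or E")
    scheme = "BC" if sections & {"B", "C"} else "E"
    oligo_min, poly_min, oligo_code, poly_code = THRESHOLDS[scheme][kind]
    if units >= poly_min:
        return poly_code
    if units >= oligo_min:
        return oligo_code
    return None  # fewer than 3 units: neither oligomer nor polymer


if __name__ == "__main__":
    # Eight sugar units: polysaccharide under BC, oligosaccharide under E.
    print(polymer_descriptor("saccharide", 8, {"B"}))
    print(polymer_descriptor("saccharide", 8, {"E"}))
```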

SEARCHING FOR POLYMERS
Addition polymers are indexed as the monomer with role Q applied
For addition copolymers all the separate monomers are indexed with roles Q + M
  They can be indexed as part of one Markush structure
Condensation polymers are indexed using the structural repeat unit when this is given in the document
  This structure would receive the substance descriptor F or N
If the structure of the polymer is not given, condensation polymers are indexed in terms of the starting materials, with roles Q + M

POLYMER DRUG CONJUGATES
There are several ways this can be indexed depending on the information provided
If the point of attachment is not shown then the polymer and drug are normally indexed separately with roles Q + M
  If the polymer is not specific it may be indexed as a Markush with the POL superatom and the drug shown as disconnected entities, with appropriate polymer substance descriptors
If the point of attachment is shown then the drug is shown bonded to the polymer (represented by the POL superatom in most cases)

ROLES OF MARKUSH RECORDS
Role : Definition
A : Compound is analyzed or detected
C : Catalyst
D : Detecting agent
M : Component of a mixture (at least 2 components have been indexed)
N : New compound
P : Compound is produced or purified
Q : Compound defined in terms of starting materials
R : Removing or purifying agent
U : New use of compound
X : Compound is removed
See also: DWPIM Reference Manual, Table 15, Page 55.

THANK YOU!
Customer Service: for subscriptions, pricing and renewals. http://ip-science.thomsonreuters.com/support/
Technical Support: for access, content, searching, troubleshooting and technical issues. http://ip-science.thomsonreuters.com/techsupport
Training: for Thomson Innovation training options. http://ip.thomsonreuters.com/training/ti/
Contact Us
US, Canada & Latin America: Phone +1 800 336 4474, ts.info.us@thomsonreuters.com
Europe, Middle East and Africa: Tel +44 (0)20 7433 4000, ts.info.emea@thomsonreuters.com
Japan: Phone +81 3 5218 6500, ts.info.jp@thomsonreuters.com
Asia Pacific (Singapore office): Phone +65 6411 6888, ts.support.asia@thomsonreuters.com

Derwent Markush Resource (DWPIM) now available on STN Part 2: Searching on STN

Agenda
Introduction to new STN
DWPI structure databases on new STN
DWPIM precision search tools
Search examples

New STN workflow is oriented around projects
To create a project, click the icon.
Projects allow you to:
  Easily return to previous work
  Reuse common queries
  Update searches with the most current information

The new STN interface puts query, history and results at your fingertips
The main areas are the Structure Editor, the Query Builder panel, the History panel and the Results panel.

Select databases

DWPI structure databases on STN
DCR contains specific structures from patents
DWPIM contains generic structures from patents
A single structure query can be searched in both DCR and DWPIM
Use REFX Cross File Search to find DWPI references


DWPIM allows comprehensive and precise structure searches
Why does precision matter?
  All DWPIM answers are valid answers, regardless of precision settings
  When building a DWPIM query, anticipate acceptable candidate answers
DWPIM precision search tools
  Generic Definitions, e.g. saturation, high or low carbon count, etc.
  Match Level: ATOM, CLASS, ANY
  Element Count and Element Count Level: Limited versus Unlimited

Match Levels in DWPIM mirror the specificity typical of patent Markush structures
ATOM (most specific) : e.g. Methyl, Phenyl
CLASS : e.g. CHK, ARY
ANY (least specific) : XX ("R-node")
CHK = saturated alkyl
ARY = aromatic carbocycle
XX = concepts that cannot be drawn and are described in the patent using text, such as substituent, electron-withdrawing group, leaving group, linker, group to form a ring, etc.

Match Level functions the same for real atoms or generic variables in queries
ATOM matches at the specific atom level
CLASS matches at both the generic CLASS level and specific atoms
ANY matches at ATOM and CLASS, including indefinitely defined substitutions
Match Level : Retrieval
ATOM : Specific
CLASS : Specific or Generic
ANY : Specific or Generic or Indefinite
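The Match Level / Retrieval relationship above can be written down as a lookup. The following Python sketch assumes each indexed node is labelled "specific", "generic" or "indefinite"; those labels and the function name are hypothetical and only mirror the wording of the slide, not how STN implements matching.

```python
# Which kinds of indexed node each Match Level retrieves, per the slide.
RETRIEVAL = {
    "ATOM": {"specific"},
    "CLASS": {"specific", "generic"},
    "ANY": {"specific", "generic", "indefinite"},
}


def is_retrieved(match_level: str, indexed_kind: str) -> bool:
    """Return True if a query node at `match_level` retrieves `indexed_kind`.

    `indexed_kind` is one of "specific" (e.g. methyl), "generic" (e.g. CHK)
    or "indefinite" (e.g. XX). Illustrative helper only.
    """
    return indexed_kind in RETRIEVAL[match_level.upper()]


if __name__ == "__main__":
    for level in ("ATOM", "CLASS", "ANY"):
        hits = [k for k in ("specific", "generic", "indefinite") if is_retrieved(level, k)]
        print(level, hits)
```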

Structure queries automatically include a default Match Level assignment
This assignment is only taken into account when searching Markush databases
The default settings for Match Levels are:
Query nodes : Default Match Level
Ring atoms and the ring nodes Cb, Cy, Hy : ATOM
Chain atoms and the chain node Ak : CLASS (Limited)

Search example
Search question: R1 = ring system
Query fragments and their Match Levels (ML), default settings:
1 Thiophene : ML = Atom
2 Benzene : ML = Atom
3 Carbonyl : ML = Class
4 Heterocycle : ML = Atom

Prepare structure queries using the structure editor
By default ring atoms are set to Match Level ATOM and chain nodes are set to Match Level CLASS. This has no effect on DCR.
Right mouse click to change Node or Bond attributes.
Click OK to add the query to the structures tab of the history panel.

Option: turn Automatic Cross File Search ON
Automatic Cross File Search between DWPIM and DWPI is OFF by default.
Check this box to enable automatic Cross File search.

Search the structure query and review DCR structures
Click on any structure to enlarge (zoom).
Automatic Cross File Search is set ON.

Crossover using REFX to retrieve corresponding DWPI records with in-context DCR hit structures
Use the REFX operator to retrieve corresponding DWPI references (L2).
DCR hit structures are included in the DWPI detailed display full view.

Repeat the structure query and review DWPIM structures
Click on a Markush compound number, e.g. 1235-24201, for detailed display views (next).
Query relevant G-group hit fragments are represented as an assembled structure.
Click on any structure to enlarge (zoom).

DWPIM detailed display: Brief view
Unassembled DWPIM Markush base structure.
Query relevant G-groups (G3, etc.).
Hit fragments are combined to form the assembled structure.
Hit fragments are highlighted.

Detailed display allows you to choose a preferred view
Brief: unassembled hit Markush base structure with complete hit G-groups related to the query
  Hit fragments within hit G-groups are highlighted
Full: unassembled hit Markush base structure with all G-groups, including those not related to the query
  Hit fragments within hit G-groups are highlighted

Settings for DWPIM default detailed display view
Default displays can be toggled between either of the two views in the detailed display.
* This setting only relates to CAplus.

Crossover using REFX to retrieve corresponding DWPI records with in-context hit structures from DWPIM
Use the REFX operator to retrieve corresponding DWPI references (L4).
DWPIM hit structures in both Assembled and Full views are available.

Search history
Extending the search to include DWPIM retrieved 12 additional, potentially relevant results (L4).

Search example using variable query nodes
Search query: no further substitution on Ak (Locked)
Query nodes and their Match Levels (ML), default settings:
1 Thiophene : ML = Atom
2 Carbocycle (Cb) : ML = Atom
3 Alkyl (Ak) : ML = Class
4 Heterocycle (Hy) : ML = Atom

Prepare structure queries using the structure editor
By default ring variables Cb and Hy are set to Match Level ATOM and the chain variable Ak is set to Match Level CLASS. This has no effect on DCR.
Block substitution with the lock atoms tool.
Click OK to add the query to the structures tab of the history panel.

Search history
Using variable query nodes with ATOM match retrieved 15 additional, potentially relevant results (L6).

Example with specific node hits for Ak, Cb and Hy
[Figure not reproduced]

Search example revisiting MATCH level
Search query: no further substitution on Ak (Locked)
Query nodes and their Match Levels (ML), with Cb and Hy changed from the default Atom to Class:
1 Thiophene : ML = Atom
2 Carbocycle (Cb) : ML = Class
3 Alkyl (Ak) : ML = Class
4 Heterocycle (Hy) : ML = Class

Class Match Level relationship between Variable Query Nodes and the Generic Nodes (Superatoms) in DWPIM
Structure Editor variable query nodes: Ak, Cy, Cb, Hy
DWPIM generic nodes (Superatoms): CHK, CHE, CHY, ARY, CYC, HEF, HEA, HET

STN variable query nodes retrieve DWPIM generic nodes
STN variable query node (ML = Class) : DWPIM generic nodes retrieved
Ak : CHK, CHE, CHY
Cb : ARY, CYC
Hy : HEA, HET, HEF
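The node-to-Superatom relationship above lends itself to a small lookup table. The Python sketch below is illustrative only: the dictionary and function names are hypothetical, and the entry for Cy (taken here as the union of the Cb and Hy Superatoms) is an assumption based on the element count example later in this deck, where a Cy query retrieves CYC, ARY, HET, HEA and HEF.

```python
# Generic DWPIM nodes retrieved by each STN variable query node at ML = Class.
NODE_TO_SUPERATOMS = {
    "Ak": ("CHK", "CHE", "CHY"),
    "Cb": ("ARY", "CYC"),
    "Hy": ("HEA", "HET", "HEF"),
}
# Assumption: Cy (any ring) covers both the carbocyclic and heterocyclic Superatoms.
NODE_TO_SUPERATOMS["Cy"] = NODE_TO_SUPERATOMS["Cb"] + NODE_TO_SUPERATOMS["Hy"]


def generic_nodes_retrieved(query_node: str, match_level: str = "CLASS"):
    """Return the Superatoms a variable query node can retrieve.

    At Match Level ATOM only specific atoms match, so no Superatoms are
    returned; at CLASS (or ANY) the corresponding generic nodes are included.
    """
    if match_level.upper() == "ATOM":
        return ()
    return NODE_TO_SUPERATOMS[query_node]


if __name__ == "__main__":
    for node in ("Ak", "Cb", "Hy", "Cy"):
        print(node, generic_nodes_retrieved(node))
```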

STN node attributes retrieve DWPIM indexed attributes
STN node attributes, e.g. Ak with ML = Class : DWPIM retrieved attributes
alkyl (no limitation) : CHK, CHE, CHY
alkyl (low) : CHK, CHK LOW
alkyl (straight) : CHK, CHK STR

STN variable query nodes retrieve DWPIM generic nodes
[Figure: an STN search query and typical DWPIM assembled hits, with nodes numbered 1 to 4]
STN query nodes with Match Level Class retrieve corresponding generic and specific nodes in DWPIM.
DWPIM attributes are also accessible, e.g. MON = monocyclic, FUS = Fused.

Prepare structure queries using the structure editor
Cb and Hy nodes have been set to Class match. Changes from defaults are indicated with an asterisk. This has no effect on DCR.
Right click on a node to change Attributes, e.g. Match Level.
Click OK to add the query to the structures tab of the history panel.

Search history
Using variable query nodes with CLASS match retrieved 80 additional, potentially relevant results (L8).

Example with generic node hits for Cb and Hy
[Figure not reproduced]

Element Count and Element Count Level
Any specific atom, chain, or ring in a query has an implied element count, e.g. phenyl ring = 6 C, isopropyl = 3 C
The variable nodes Ak, Cb, Cy and Hy can have element counts added (example on next slide)
For Match Level CLASS, when an element count is implied or added, you can also set the Element Count Level to Limited (default) or Unlimited
Limited (default) only retrieves patents that have a specified element count in the indexed structures

Example: assigning an Element Count, and Match Level CLASS, Element Count Level LIMITED, to an Hy node
Right click on a node to change Attributes.

Match Level CLASS, Element Count Level LIMITED
Query : Example retrieval
-CH2CH2CH3 (Locked) : -CH2CH2CH3, npr, CHK LOW, CHK LOW,C=1-4
-Ak (no element count) : -CH3, Et, npr, CHK LOW,C=1-4, CHK, CHE, CHY
-Ak (element count 1-4 C) : -CH3, COOH, Et, CHK LOW, CHE LOW, CHK LOW,C=1-4
-Q : N, Cl, B, HAL, TRM
N (Locked) : N, HEA C=2-5,N=1-3, HEA C=4-5,N=1-2
-Hy (1 N, zero O, zero S) : N, HEA N=1-3, HEF N=0-5,S=0-5
-Cy (no element count) : N, CYC, ARY, HET, HEA N=1-3, HEF N=0-5,S=0-5

Match Level CLASS, Element Count Level UNLIMITED
Query : Example retrieval
-CH2CH2CH3 (Locked) : -CH2CH2CH3, npr, CHK LOW, CHK LOW,C=1-4, CHK
-Ak (no element count) : -CH3, Et, npr, CHK LOW,C=1-4, CHK, CHE, CHY
-Ak (element count 1-4 C) : -CH3, COOH, Et, CHK LOW, CHE LOW, CHK LOW,C=1-4, CHK, CHE
-Q : N, Cl, B, HAL, TRM
N (Locked) : N, HEA C=2-5,N=1-3, HEA C=4-5,N=1-2, HEA
-Hy (1 N, zero O, zero S) : N, HEA N=1-3, HEF N=0-5,S=0-5, HEA, HEF
-Cy (no element count) : N, CYC, ARY, HET, HEA N=1-3, HEF N=0-5,S=0-5
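The practical difference between Limited and Unlimited in the two listings above is whether generic nodes indexed without any element count (e.g. a bare CHK) are retrieved. The Python sketch below models just that difference. It is a simplification under stated assumptions: the function name is hypothetical, an indexed node is reduced to an optional (min, max) carbon range, and treating a node with a range as a hit only when the query count falls inside it is an assumption, not documented STN behaviour.

```python
from typing import Optional, Tuple


def is_retrieved_by_count(
    query_count: int,
    indexed_range: Optional[Tuple[int, int]],
    level: str = "LIMITED",
) -> bool:
    """Model Element Count Level for a query with an implied or added count.

    `indexed_range` is the (min, max) element count on the indexed generic
    node, or None when the node carries no count (e.g. bare CHK).
    LIMITED retrieves only nodes with a specified count; UNLIMITED also
    retrieves nodes with no count at all.
    """
    if indexed_range is None:
        return level.upper() == "UNLIMITED"
    low, high = indexed_range
    return low <= query_count <= high  # assumed range test, see lead-in


if __name__ == "__main__":
    # n-propyl query (3 C) against CHK LOW (1-6 C), CHK C=1-4 and bare CHK.
    for label, rng in [("CHK LOW", (1, 6)), ("CHK C=1-4", (1, 4)), ("CHK", None)]:
        print(label,
              is_retrieved_by_count(3, rng, "LIMITED"),
              is_retrieved_by_count(3, rng, "UNLIMITED"))
```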

New STN will allow simultaneous search with a single query structure
STN search conventions preserved:
  Bond types & values
  Generic nodes
  Attributes
  Match Level
  Element Count Level
A single query structure searches:
DCR : ~ 2.5 M structures
DWPIM : ~ 1.9 M structures
MARPAT : ~ 1.1 M structures
CAS Registry : ~ 100 M structures

Multi-database search example: Diazepam
Note: Since the focus of this example is multi-database searching, default settings are used with a simple Closed Substructure Search (CSS) query.
Click OK to add the query to the structures tab of the history panel.

Search the structure query and review structures
Closed Substructure Search (CSS).
Query relevant G-group hit fragments are represented as an assembled structure.
Click Submit to retrieve multidatabase structure hits (L1).
Click on any structure to enlarge (zoom).

Crossover using REFX to retrieve corresponding DWPI and CAplus bibliographic records
Use the REFX operator to retrieve corresponding DWPI and CAplus references (L2).

Use Create Term List to identify unique hits
Manage Term lists.
L2 = CAplus and DWPI combined search results.
Q38 = patent number/kind taken from DWPI (L2).
Patent records only found in CAplus (L3).
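The idea behind the term list step, finding patent records present in one database but absent from the other, is a set difference on patent numbers. The following Python sketch shows that logic outside STN; the function name is hypothetical and the patent numbers are made-up placeholders, not search results.

```python
def only_in_first(first, second):
    """Return patent numbers found in `first` but not in `second`, sorted.

    Mirrors the term list comparison: take the patent number/kind values
    from one answer set and keep only those absent from the other.
    """
    return sorted(set(first) - set(second))


if __name__ == "__main__":
    # Placeholder patent number/kind values, invented for illustration only.
    caplus_hits = ["XX0000001A1", "XX0000002A1", "XX0000003B2"]
    dwpi_hits = ["XX0000002A1", "XX0000003B2", "XX0000004A1"]
    print(only_in_first(caplus_hits, dwpi_hits))  # records only found in CAplus
```

In practice patent numbers may be formatted differently in each database, so some normalisation would be needed before comparing; that step is left out of the sketch.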

Resources
DWPIM Reference Manual
Recorded Events: http://www.stn-international.com/recorded_events.html
  Unified Markush Search on new STN
  MARPAT on new STN
  Substance and Chemical Structure Searching in CAS Registry and DCR on new STN

For more information
CAS: help@cas.org. Support and Training: www.cas.org
FIZ Karlsruhe: helpdesk@fiz-karlsruhe.de. Support and Training: www.stn-international.de