Chemical Information Retrieval
CAS & SciFinder
Searching for Substances

In the past two lessons you worked with SciFinder to search for articles from the chemistry literature. In this exercise we will look for information about substances rather than research topics.

CAS Registry Numbers

In the Chemical Abstracts databases, substances are uniquely identified by their CAS Registry Numbers. Registry Numbers are much better than names or formulas for identifying a substance: a compound can have many names, and many compounds can share one molecular formula, but each Registry Number belongs to exactly one substance. If you know the Registry Number of a specific compound, you can search for that number in the CA database and be certain of finding only information about that particular compound.

The first thing we need to learn is how to find the Registry Number of a substance. There are over 26,000,000 Registry Numbers for organic and inorganic compounds, so guessing one would be tough! Luckily, Registry Numbers are so commonly used to identify substances that you will find them in most chemistry databases. Although you can use SciFinder to search for Registry Numbers, it is worthwhile to look for them in other databases first. You may not always have SciFinder available when you need it.

Finding Registry Numbers: Wolfram Alpha

Let's find the Registry Number of the compound shown at the right. We don't know its name, but we can start with its molecular formula, C4H6N4O3S2. A convenient source of information about organic compounds is Wolfram Alpha:

http://www.wolframalpha.com/
Enter the molecular formula for the compound, C4H6N4O3S2, in the query space. This finds the compound, acetazolamide. Its CAS Registry Number is 59-66-5.

Finding Registry Numbers: Sigma-Aldrich Database

Sigma-Aldrich is a chemical company with a large database of compounds. Its web site is another good place to find Registry Numbers.

http://www.sigmaaldrich.com

Enter the molecular formula in the Search box.
In the information about the compound, click on Properties. The CAS Registry Number will be one of the properties displayed.

SciFinder: Structure Search

In this problem, you are a research scientist in a laboratory that is exploring the properties of chemicals that are potential sweeteners. By analogy with other compounds, you think that a chemical with the structure at the right is a potential sweetener. You want to search the chemical literature to see (1) if this compound has been prepared before; (2) if it is a potential sweetener; and (3) if there are English-language descriptions of nonpatented methods for its preparation.

For the sake of this exercise, suppose that you have looked in the Wolfram Alpha and Sigma-Aldrich databases and have not found the compound there. In this case, you will need to go to SciFinder.

In SciFinder, click on Explore Substances at the top. You should see the Chemical Structure Search option displayed. Click on Click to Edit.
Draw the structure in the editing window. Select Exact search in the lower right corner of the window. Then click on OK. In the next window that appears, click on Search.
We obtain a lot of results. Sorted by relevance, many of them turn out to be optical isomers of each other. The L,L-isomer with Registry Number 22839-47-0 seems to be important. It is the second substance in the list and a component of the first.

Click on Experimental Properties for this compound. We see that the compound is a sweetener.

Click on the check mark in the Preparation row and Nonpatents column to see a list of references involving the synthesis of this compound. You can refine the list of preparations to limit them to English-language publications.
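Whichever source a Registry Number comes from, it can be sanity-checked before you search on it, because the last digit of every Registry Number is a check digit. The sketch below assumes the usual published rule: drop the hyphens, multiply the digits before the check digit by 1, 2, 3, ... counting from the right, add the products, and take the remainder modulo 10. The function names are made up for this handout and are not part of SciFinder or any CAS tool.

```python
def cas_check_digit(body_digits: str) -> int:
    """Return the check digit for the digits that precede it (hyphens removed)."""
    # Weight the rightmost digit by 1, the next by 2, and so on.
    total = sum(int(digit) * weight
                for weight, digit in enumerate(reversed(body_digits), start=1))
    return total % 10


def is_valid_cas(rn: str) -> bool:
    """Check the format (2-7 digits, 2 digits, 1 digit) and the check digit."""
    parts = rn.strip().split("-")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        return False
    first, second, check = parts
    if not (2 <= len(first) <= 7 and len(second) == 2 and len(check) == 1):
        return False
    return cas_check_digit(first + second) == int(check)


if __name__ == "__main__":
    # Registry Numbers quoted in this lesson, plus one deliberate typo.
    for rn in ("59-66-5", "22839-47-0", "865-21-4", "59-66-4"):
        print(rn, is_valid_cas(rn))
```

A number that passes this test is only well formed. The test catches most typing mistakes, but it cannot tell you whether the number has actually been assigned to a substance; only a database search can do that.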
Multicomponent Substances

When we were searching for this compound, you may have noticed that some of the substances found had our compound as a component. About 10% of the CAS substances database consists of multicomponent substances.[1] The above substances (5910-52-1 and 106372-55-8) are two of them.

It is interesting that SciFinder lists salts as multicomponent systems. If, for example, you search for sodium sulfate, you will find that the compound is represented as the sodium salt of sulfuric acid.

[1] D. R. Ridley. Introduction to Structure Searching with SciFinder Scholar. American Chemical Society.
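To see what a multicomponent record looks like as a molecular formula, take the CAS formula for sodium sulfate, H2O4S.2Na: the components are separated by periods, and a leading number says how many of that component are present. The sketch below splits such a formula into its parts. It is a minimal illustration that assumes whole-number multipliers only; the function name is hypothetical and other multiplier styles are not handled.

```python
import re


def split_components(formula: str) -> list[tuple[int, str]]:
    """Split a dot-disconnected formula into (count, component) pairs."""
    components = []
    for part in formula.split("."):
        # An optional leading integer, then a formula that starts with an element symbol.
        match = re.fullmatch(r"(\d*)([A-Z].*)", part.strip())
        if match is None:
            raise ValueError(f"cannot read component: {part!r}")
        count = int(match.group(1)) if match.group(1) else 1
        components.append((count, match.group(2)))
    return components


if __name__ == "__main__":
    # Sodium sulfate as the sodium salt of sulfuric acid.
    print(split_components("H2O4S.2Na"))
```

Read this way, H2O4S.2Na is one sulfuric acid component and two sodium components, which is exactly the "sodium salt of sulfuric acid" representation described above.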
Starting With a Compound Name

Dr. Bass has pointed out an interesting compound named "vinblastin," which is obtained from a natural product.

From what plant is vinblastin extracted?
For what purpose(s) is vinblastin used in medicine?

In the Explore Substances section, select Substance Identifier. Enter vinblastin in the text box.

SciFinder returns only one compound, with Registry Number 865-21-4. Its name is vincaleukoblastine. Note the icons that send you to a list of references, reactions, etc.
We are interested in locating references about how the compound is obtained and used. Select the compound by checking the box by its name. Click on Get References. In the window that appears, select Preparation and then Get. Use the list of references to identify the plant from which vinblastin is isolated.

We also want to learn about how vinblastin is used in medicine. Click on Substances in the breadcrumb trail to return to the substance itself. Click on Get References again. This time, select Uses in the Limit results to list. Read the abstracts of the first dozen or so references. What is vinblastin used for in medicine?
Vinblastin appears to be biologically active. We can learn more in the Substance Detail section. Click on the Substance Detail link.

In the Substance Detail section, you will find a link for Bioactivity Indicators. Which one of the bioactivity indicators for vinblastin has a much larger number of references than the others? Does this agree with the application of vinblastin to medicine that you found earlier?

Starting With a Registry Number

In the first few pages of this lesson, we searched Wolfram Alpha and Sigma-Aldrich for the CAS Registry Number of the compound on the right. We found it to be 59-66-5. Let's suppose we need the C-13 NMR spectrum of this compound. We can use this Registry Number and SciFinder to find the spectrum.

Go to Explore Substances and enter the Registry Number as the Substance Identifier.
Click on the Spectra link under the structure and name of the compound. Three different sources of the C-13 NMR spectrum are available.

Refining by Structure

Suppose you are interested in the cyclohexylmethylium ion. One way of finding this ion is to search for it by molecular formula, C7H13.

This search returns over 120 structures, far too many to look through yourself. We need to refine the search.
Under the Refine tab, click on the Chemical Structure image. Draw the structure, select Exact search, and then click OK.

You will see the structure in the Refine section. Click on Refine at the bottom of the section. This narrows the list down to around 11 substances, which you can look through to find the ion in which you are interested.
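When you search by molecular formula, it helps to type the formula the way the CAS databases write it. CAS formulas follow the Hill system: if the substance contains carbon, C comes first, H second, and all other elements follow alphabetically; if there is no carbon, every element (including H) is listed alphabetically. The sketch below builds such a string from element counts. The function name and the dictionary input are assumptions made for illustration only.

```python
def hill_formula(counts: dict[str, int]) -> str:
    """Write element counts as a molecular formula in Hill order."""
    if "C" in counts:
        # Carbon first, hydrogen second (if present), then the rest alphabetically.
        rest = sorted(el for el in counts if el not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        # No carbon: strictly alphabetical, hydrogen included.
        order = sorted(counts)
    # A count of 1 is left unwritten, as in ordinary formulas.
    return "".join(el + (str(counts[el]) if counts[el] != 1 else "")
                   for el in order)


if __name__ == "__main__":
    print(hill_formula({"H": 13, "C": 7}))                          # the C7H13 search
    print(hill_formula({"S": 2, "O": 3, "N": 4, "H": 6, "C": 4}))   # acetazolamide
    print(hill_formula({"S": 1, "O": 4, "H": 2}))                   # sulfuric acid
```

The three calls reproduce the formulas used in this lesson: C7H13, C4H6N4O3S2, and H2O4S.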
Using Greek Letters

When you are exploring research topics or substances, you may have occasion to enter a Greek letter. For example:

β-alanine
α-alkylation
π-bond

You can express a Greek letter by spelling out the name of the letter and surrounding it with periods:

.beta.-alanine
.alpha.-alkylation
.pi.-bond

Searching for Salts

As I alluded to earlier under Multicomponent Substances, the CAS databases express salts rather archaically as multicomponent systems. For example, the formula of sodium sulfate in the CAS databases is H2O4S.2Na. The best way to search for a salt is to find its Registry Number in another database and search for the Registry Number.
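If you are copying names from a paper, the Greek-letter convention described under Using Greek Letters can be applied mechanically. The sketch below replaces each lowercase Greek letter with its spelled-out name between periods. The lookup table is deliberately partial and the function name is hypothetical; extend the table as you need it, and treat capital Greek letters as out of scope here.

```python
# Partial table of lowercase Greek letters; add more entries as needed.
GREEK_NAMES = {
    "α": "alpha",
    "β": "beta",
    "γ": "gamma",
    "δ": "delta",
    "ε": "epsilon",
    "π": "pi",
    "ω": "omega",
}


def spell_out_greek(text: str) -> str:
    """Replace Greek letters with their names surrounded by periods."""
    return "".join(f".{GREEK_NAMES[ch]}." if ch in GREEK_NAMES else ch
                   for ch in text)


if __name__ == "__main__":
    for term in ("β-alanine", "α-alkylation", "π-bond"):
        print(term, "->", spell_out_greek(term))
```

Text without Greek letters passes through unchanged, so the function is safe to run on every search term before you type it in.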