The Select Command and Boolean Operators

Part of the Jmol Training Guide from the MSOE Center for BioMolecular Modeling
Interactive version available at http://cbm.msoe.edu/teachingresources/jmol/jmoltraining/boolean.html

Introduction

When Jmol is first launched, every command you enter into the Command Line affects the entire molecular structure. However, by using the Select command, you can specify portions of a molecular structure and change the Display Formats or Display Colors for just that region.

The select command on its own will not change the way your molecular structure is displayed. Rather, it simply designates which atoms in the molecular structure all future commands will affect. In other words, once you select an area of your molecular structure, all future commands will be applied only to that area.

Each time the select command is used, Jmol responds by posting "# atoms selected" in the console, where # is the number of atoms in the selection. This is an excellent way to check whether you have selected what you intended.

Note that unless otherwise indicated, this section of the Jmol Training Guide uses the protein "Top 7" based on the .pdb file 1qys.pdb. Please see the Getting Started in Jmol section of the Jmol Training Guide for information on how to download and open .pdb files.

The Select Command

Below is a collection of selection types commonly used when designing a molecular visualization with Jmol. Each example includes the Jmol command needed to apply the selection as well as the command to help visualize what area of the molecular structure was selected. Note that for the following select sections, the Top 7 protein will be shown with a backbone of 1.5 and a wireframe of 0.5.
All

Select all is the default selection when Jmol is first launched. With select all, every atom contained in a .pdb file is selected and will be affected by future commands. Note that when we show two Jmol commands on two separate lines, as shown below, it implies that you will submit the first command by hitting "enter" or "execute" before entering the second command.

To select all atoms:

    select all

Note that you can practice these commands yourself using your own copy of Jmol running on your desktop. To reset the visualization to its default selection and representation, use the commands:

    select all
    color cpk

Backbone

Select backbone selects the atoms in a protein that are part of the backbone. Only backbone atoms will be affected by future commands.

To select backbone atoms:

    select backbone

Sidechain

Select sidechain selects the atoms in a protein that are part of the sidechains (sometimes called R-groups). Only sidechain atoms will be affected by future commands.

To select sidechain atoms:

    select sidechain
Hydrophobic

Select hydrophobic selects the amino acids in a protein that are hydrophobic, including Alanine, Leucine, Valine, Isoleucine, Proline, Phenylalanine, Methionine, and Tryptophan. Only the atoms in these amino acids will be affected by future commands.

To select hydrophobic amino acids:

    select hydrophobic

Polar

Select polar selects the amino acids in a protein that are not hydrophobic, including Cysteine, Glycine, Serine, Threonine, Lysine, Aspartic Acid, Asparagine, Glutamic Acid, Glutamine, Tyrosine, and Histidine. Only the atoms in these amino acids will be affected by future commands.

To select polar amino acids:

    select polar

Helix

Select helix selects the amino acids in a protein that are part of alpha helix secondary structures. Only the atoms in helices will be affected by future commands.

To select helices:

    select helix
Sheet

Select sheet selects the amino acids in a protein that are part of beta pleated sheet secondary structures. Only the atoms in sheets will be affected by future commands.

To select sheets:

    select sheet

Nucleic (DNA/RNA)

Select nucleic selects the atoms in a structure that are part of DNA or RNA nucleotides. Only the atoms that are part of DNA or RNA will be affected by future commands. Note that the Top 7 protein does not contain any nucleic acids and therefore nothing is colored when the "select nucleic" command is used. A zinc finger protein based on the file 1zaa.pdb does contain nucleic acids and therefore will have areas colored magenta.

To select nucleic acids:

    select nucleic

Water

Select water selects the atoms in a structure that are part of water molecules. Only the atoms in water molecules will be affected by future commands. Note that nothing is colored when the "select water" command is used on the Top 7 protein, because the structure contains no water molecules. A hemoglobin protein based on the file 1a3n.pdb does contain water and therefore will have areas colored magenta.

To select water:

    select water

To remove all water from your display:

    restrict not water
Chain

Many protein structures have more than one polypeptide chain (sometimes referred to as the protein's quaternary structure). These chains are labeled with single-letter identifiers in the .pdb file and can be selected by entering a colon (:) followed by the letter identifier. Only the atoms in the chain you select will be affected by future commands. Note that the Top 7 protein does not contain multiple polypeptide chains and therefore nothing is colored when the "select :b" command is used. A hemoglobin protein based on the file 1a3n.pdb contains four chains and therefore will have areas colored magenta.

To select chain B:

    select :b

Amino Acid Numbers and Ranges

Each amino acid in a polypeptide chain is assigned a sequential number. You can use these number identifiers to select individual amino acids or ranges of amino acids. Note that there can be more than one amino acid with a given number if there is more than one polypeptide chain in a structure file. For example, if a molecular structure has both a chain A and a chain B, there can be an amino acid 10 on chain A as well as an amino acid 10 on chain B.

To select amino acid 30:

    select 30

To select amino acids 30 through 50:

    select 30-50
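The point about repeated amino acid numbers can be made concrete with a small Python sketch. This is not Jmol code: the atom list is hypothetical toy data, and the list comprehensions only imitate what a residue-number selection and a chain selection would match.

```python
# Hypothetical two-chain structure: each atom is (chain, residue number, atom name).
# This toy data is an assumption for illustration, not taken from any .pdb file.
atoms = [
    ("A", 10, "CA"), ("A", 10, "CB"),
    ("A", 30, "CA"),
    ("B", 10, "CA"), ("B", 10, "CB"),
    ("B", 30, "CA"),
]

# Imitates "select 10": the residue number alone matches atoms on BOTH chains.
residue_10 = [a for a in atoms if a[1] == 10]

# Imitates "select :b": every atom on chain B, whatever its residue number.
chain_b = [a for a in atoms if a[0] == "B"]

print(len(residue_10), "atoms selected")  # 4 atoms selected
print(len(chain_b), "atoms selected")     # 3 atoms selected
```

In this toy structure, selecting residue 10 by number picks up four atoms, two from each chain, which is why the atom count reported in the console is worth checking on multi-chain structures.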
Atom Numbers and Ranges

Each atom in a molecular structure is assigned a sequential number. You can use these number identifiers to select individual atoms or ranges of atoms. Note that unlike amino acid numbers, there are no repeated atom numbers in a structure. Even if you have more than one chain, there will be no repeated atom numbers between the two chains; each atom number is uniquely paired with one and only one atom.

To select atom 100:

    select atomno=100

To select atoms 100 through 500:

    select atomno>99 and atomno<501

Small and Unique Molecules

Some molecular structures include additional small molecules that are neither proteins nor nucleic acids. These molecules are given a three-character alphanumeric identifier that can be used to select them. To determine what this three-character identifier is, you can click on the molecule in the Jmol display window, or review the structure summary for the .pdb file you are using on the RCSB Protein Data Bank (www.pdb.org) website. Note that the Top 7 protein does not contain any of these small molecules and therefore nothing can be colored to demonstrate this command. A hemoglobin protein based on the file 1a3n.pdb contains four chains, each of which has a small molecule called a Heme Group, and therefore will have areas colored magenta.

To select heme groups:

    select hem
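The atom range command above uses two strict comparisons (greater than 99, less than 501) to cover atoms 100 through 500 inclusive. A short Python sketch, assuming a hypothetical structure with 1000 sequentially numbered atoms, confirms the boundaries and the expected count.

```python
# Hypothetical structure with atoms numbered 1..1000 (an assumption for illustration).
atom_numbers = range(1, 1001)

# Same logic as "atomno>99 and atomno<501": both comparisons are strict.
selected = [n for n in atom_numbers if n > 99 and n < 501]

print(selected[0], selected[-1])        # 100 500
print(len(selected), "atoms selected")  # 401 atoms selected
```

Note that an inclusive range from 100 to 500 holds 401 atoms, not 400, which is the number you would expect to see reported for a structure that contains all of those atom numbers.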
Identifying Structures in your Display

If you do not know the amino acid number, chain letter, amino acid type, or atom number of the item you want to select, you can click on the structure in the Display Window. Jmol will provide information in the Console Window regarding the atom you clicked on. The image below decodes this information in detail.

Boolean Operators

Even with all of the predefined selection types discussed above, you may still have trouble selecting the exact collection of atoms that you are interested in. Once you are comfortable using the basic Select command, you can begin to link together selections using Boolean Operators.

To understand how boolean operators work, imagine that you are working with two predefined selection types, helices and backbone. These two selection types are shown in a Venn diagram below. Some atoms in our Top 7 protein are part of helical secondary structures (the left blue circle), some atoms are part of the protein backbone (the right yellow circle), and some atoms fall where the circles overlap because they are part of both selection types.

Each example includes the Jmol command needed to apply the selection as well as the command to help visualize what area of the molecular structure was selected. A Venn diagram is also included for each, with diagonal shading to represent what section of the diagram is being selected.
No Boolean Operators

As a starting point, we will review simple select commands that do not use any boolean operators.

To select helices:

    select helix

To select backbone:

    select backbone

The "or" Boolean Operator

You can use the or boolean operator to select all the atoms that are either part of helices or part of the backbone. Therefore, the command or selects all of the atoms that are in one group or the other.

To select atoms that are part of either helices or the protein backbone:

    select helix or backbone

The "and" Boolean Operator

You can use the and boolean operator to select all the atoms that are both part of the helices and part of the protein backbone at the same time. Therefore, the command and selects all of the atoms that are in both groups at the same time.

To select atoms that are part of helices and the protein backbone at the same time:

    select helix and backbone
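The Venn diagram view of "or" and "and" corresponds to set union and set intersection. The following Python sketch models it with two small sets of atom ID numbers; the IDs are hypothetical toy data, and the code only imitates the logic of the selections rather than calling Jmol.

```python
# Toy model: each atom is just an ID number (hypothetical data, not from 1qys.pdb).
helix = {1, 2, 3, 4, 5, 6}     # atoms in helical secondary structures
backbone = {4, 5, 6, 7, 8}     # atoms in the protein backbone

# Imitates "select helix or backbone": atoms in one group or the other (union).
helix_or_backbone = helix | backbone

# Imitates "select helix and backbone": atoms in both groups at once (intersection).
helix_and_backbone = helix & backbone

print(len(helix_or_backbone), "atoms selected")   # 8 atoms selected
print(len(helix_and_backbone), "atoms selected")  # 3 atoms selected
```

The "or" selection is never smaller than either group, and the "and" selection is never larger than either group, which gives a quick sanity check against the atom count Jmol posts in the console.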
The "not" Boolean Operator

You can use the not boolean operator to select only the atoms that are not part of helices or not part of the backbone. The not command simply inverts the selection type you are using.

To select all atoms except helices:

    select not helix

To select all atoms except backbone:

    select not backbone

To select all atoms except helices or backbone:

    select not helix and not backbone

0 Atoms Selected!

When using the select command with boolean operators, occasionally you will see "0 atoms selected" in the console window. This means that no atoms exist in the structure that satisfy the selection type requested.

For example, imagine trying to select atoms that are in alpha helices and beta pleated sheets at the same time. In the diagram, you see that these two selection groups are mutually exclusive sets of atoms. There is no overlap between the groups, so no atoms fulfill this selection, causing Jmol to respond with "0 atoms selected" and nothing to be colored magenta.

To select atoms that are part of helices and sheets at the same time:

    select helix and sheet
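Both ideas above, inverting a selection with "not" and ending up with an empty selection, can be checked with a Python sketch using sets. The atom IDs and group memberships are hypothetical toy data chosen for illustration; the code models the logic only and does not call Jmol.

```python
# Hypothetical 12-atom structure; group memberships are assumptions for illustration.
all_atoms = set(range(1, 13))
helix = {1, 2, 3, 4, 5, 6}
sheet = {9, 10, 11, 12}
backbone = {4, 5, 6, 7, 8, 9}

# Imitates "select not helix": everything outside the helix group.
not_helix = all_atoms - helix

# Imitates "select not helix and not backbone".
neither = (all_atoms - helix) & (all_atoms - backbone)

# The same atoms, written as "everything outside (helix or backbone)".
outside_union = all_atoms - (helix | backbone)

# Imitates "select helix and sheet": the groups do not overlap.
helix_and_sheet = helix & sheet

print(len(not_helix), "atoms selected")        # 6 atoms selected
print(neither == outside_union)                # True
print(len(helix_and_sheet), "atoms selected")  # 0 atoms selected
```

The equality of the two "neither" forms is De Morgan's law for sets. The empty result for helix and sheet mirrors the "0 atoms selected" message: the request is valid, but no atom belongs to both groups.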
Use of Parentheses

Similar to algebraic order of operations, parentheses in Jmol can be used to ensure that the proper selection groupings occur when combining operators. Below are a few sample commands that are frequently used in designing models.

To select the sidechain and the alpha carbon of amino acid 62:

    select 62 and (sidechain or alpha)

To select the backbone of either residue 10 or residue 25:

    select (10 or 25) and backbone

To select the sidechain and alpha carbon atoms of two residues, his63 and his92, on chain b of the protein:

    select :b and (his63 or his92) and (sidechain or alpha)

To select the sheets in chains a and c of the molecule:

    select (:a or :c) and sheet
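To see why the grouping matters, the residue 10 or residue 25 example can be modeled with Python sets. The atom IDs are hypothetical toy data, and the sketch shows only that two different groupings of the same three terms give different results; how Jmol groups an expression written without parentheses is not covered here, which is one more reason to write the parentheses explicitly.

```python
# Hypothetical toy data: atom ID numbers for two residues and the backbone.
res10 = {1, 2, 3, 4}        # all atoms of residue 10
res25 = {5, 6, 7, 8}        # all atoms of residue 25
backbone = {1, 2, 5, 6}     # backbone atoms of both residues

# Imitates "select (10 or 25) and backbone": backbone atoms of either residue.
grouped = (res10 | res25) & backbone

# A different grouping of the same terms: all of residue 10,
# plus only the backbone atoms of residue 25.
regrouped = res10 | (res25 & backbone)

print(sorted(grouped))    # [1, 2, 5, 6]
print(sorted(regrouped))  # [1, 2, 3, 4, 5, 6]
```

The second grouping also picks up the sidechain atoms of residue 10 (atoms 3 and 4 in this toy data), which is not what the stated goal asks for.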