2018 Biochemistry 110 California Institute of Technology Lecture 2: Principles of Protein Structure Linus Pauling (1901-1994) began his studies at Caltech in 1922 and was directed by Arthur Amos oyes to study a new technique X-ray crystallography. e earned his Ph. D. in 1925 and continued his studies on the crystallography of molecules including biologically important molecules from 1927-1963. From his research, he published The ature of the Chemical Bond in 1939. is research also elucidated the alpha-helix and beta-sheet repeating systems central to protein structure and determined the molecular basis of sickle-cell anemia in 1949. During 1958-1963, Pauling fought to halt nuclear testing and illuminated the dangers of radiation to the American public. ead more about Pauling and more scientists at: Profiles in Science, The ational Library of Medicine, profiles.nlm.nih.gov Much of what we will learn today was first discovered in Pauling s labs at Caltech.
Proteins are Built from Amino Acids Free amino acids are zwitterions in solution: K b = [B ]*[ ] [B:] 2 C 2 pk b = 4.1 pk a = 2.3 Un-ionized form + 3 C 2 - pk a = 9.9 Zwitterionic form K a = [ ]*[A ] [A] The difference of 7.6 units between the pka of an amino acid carboxylic acidcarboxylate group and the pka of an amino acid amino-ammonium ion group is so great that the amino acid form would not be appreciably present in aqueous solution. So even if we draw the 2 -C-C 2 form, the zwitterionic form is understood. (The pk b of a base and the pk a of the conjugate acid are related by pk a = 14 - pk b.) (To simplify things, we will discuss all biochemistry topics using pka s.) All natural encoded amino acids are L-stereochemistry: The L-stereochemistry of amino acids derives from the stereochemistry of L- glyceraldehyde. All natural amino acids are L- and, for all but cysteine, this is the S-stereochemistry as shown. priority 3 priority 3 S- - ( in back) ( in back) C 2 C 2 2 C priority 2 priority 2 C priority 1 priority 1 L-(S)-Glyceraldehyde L-(S)-Proline (natural) D-()-Proline (unnatural)
The 20 atural Amino Acids Glycine, Gly (G) = Alanine, Ala (A) = C 3 C 3 most flexible linkage 2 C 2 2 C 2 Valine, Val (V) = iso-propyl a beta-branched aa staggered conf. shown 2 C 2 Leucine, Leu (L) = iso-butyl Why isn t the isobutyl group on Isoleucine? 2 C 2 Isoleucine, Ile (I) = sec-butyl it s beta-branched the sidechain is (S)- C3 C 2 2 C 3 C 2 Proline, Pro (P) = -C 2 C 2 C 2 - the only cyclic aa called an imino acid C 2 Serine, Ser (S) = C 2 good for nucleophilic catalysis 2 C 2 Threonine, Thr (T) = C(Me) it s beta-branched the sidechain is ()- 2 C 3 C 2 Cysteine, Cys (C) = C 2 S L-Cys is ()- because S has priority over S 2 C 2 Methionine, Met (M) = -C 2 C 2 SMe C 3 2 S C 2 All the amino acids on this page have the one-letter code = first letter of the name. All on this page and next except Gly have the same L- stereochemistry. 18 are (S)- and one is ()- Ala, Val, Leu, Ile, Pro, Cys and Met are all considered hydrophobic aa s. (Gly is not considered hydrophobic because it has no sidechain protecting the polar groups.)
Phenylalanine, Phe (F) = benzyl hydrophobic aa 2 C 2 The 20 atural Amino Acids (continued) Tyrosine, Tyr (Y) = p-hydroxybenzyl phenolic pka10.9 2 C 2 istidine, is () = -C 2 -imidazole weakly basic (pka6.0) 2 C 2 Tryptophan, Trp (W) = -C 2 -indole 2 C 2 Lysine, Lys (K) = -[C 2 ] 4-2 basic (pka10.8) amine form shown ammonium (protonated) 2 2 C 2 Arginine, Arg () -[C 2 ] 3 -C(=) 2 basic (pka12.5) guanidine form shown guanidinium (protonated) 2 2 C 2 C Aspartate, Asp (D) 2 = -C 2 C 2 acidic (pka4.1) 2 C 2 Asparagine, Asn () = -C 2 C 2 2 C 2 C 2 Glutamate, Glu (E) = -C 2 C 2 C 2 acidic (pka4.1) 2 C 2 C 2 Glutamine, Gln (Q) = -C 2 C 2 C 2 2 C 2 C 2 ne-letter code for all except is () have to be memorized. Three basic aa s + Two acidic aa s.
Genetic Code Allows Mutations to Wander Amongst Similar Groups Examples of single-base mutations: Val! Ile Phe! Tyr Phe! Leu Glu! Asp Leu! Pro (notice Leu has 2 regions) Arg! Lys (Arg has two regions also) Ser! Cys (If doesn t work, ->S) Examples of amino acid changes that require 2 or more base mutations: Met! Trp (AUG -> UGG) (two) Ser! Glu (Ser has 2 regions) Ser! Asp (but neither can get to GAx) Phe! Glu (requires 3 mutations)
Primary Structure: Proteins ave Linear Polypeptide Chains Formation of the Peptide Bond: 3 1 + 3 3 + 2 2 2 Equilibrium lies on the side of hydrolyzed amino acids. But aqueous hydrolysis is slow without enzymes. Main Chain = backbone; Sidechains = groups Planar Amide Units Central to Peptide Structure 1 Peptide linkages are very nearly always transoid dihedral angle ω = 180 : i ω = 180 (a dihedral angle of 0 would be sterically disfavored) owever, for Proline, the trans- amide bond isomer (left) is only slightly more stable than the cis- isomer in aqueous solution: ω = 180 ω = 0
The Peptide Backbone as Limited otational Freedom φ ψ Follow the peptide! C reading first φ (phi), then ψ (psi). Measure the dihedral angle of the continuing chain in the direction terminus! C terminus. The figure above (and at right) depicts a fully extended chain for which φ = ψ = 180. (+ 180 and -180 are the same angle.) For the angle φ (phi), the horseshoe conformation (with φ = -120 ) is favored significantly and φ = +120 is especially bad when : amachandran Plot: We can plot specific regions of peptide structure on a plot of φ (x-axis) vs. ψ (y-axis). At right is shown the regions we have discussed so far: +180 +120 +60 ψ (psi) φ= -120 0-60 -120-180 amachandran Plot "fully extended" "orseshoe region" φ= +120-180 -120-60 0 +60 +120 +180 φ (phi) Sterically indered for
Secondary Structure: The ight-anded alpha-elix A helix is formed when all C= groups are made to point the same general direction. The righthanded alpha-helix is formed when φ = -60 and ψ = -60. We can sight first along φ then along ψ and determine these angles as the chain proceeds in the to C direction. φ ψ φ = ψ = 180 φ = -60 Sighting along -Cα the group approximately eclipses the : φ = -60. Sighting along Ca -C the alpha-ydrogen eclipses the C: ψ = -60. As the chain continues upward with a right-handed twist, the helical structure develops with the C of each amino acid subunit (position i) hydrogen bonding to the of the amino acid subunit four units along the chain (position i+4): i i+1 i+3 i+2 i+4 C C i+5 i+1 i+2 ψ = -60 This gives the alpha helix with 3.6 residues per turn.
Secondary Structure: Parallel and Anti-parallel Beta Sheets β-strand structures: A fully extended subunit can rock the sidechain in and out of the plane while keeping the C and groups vertical. φ = ψ = 180 φ ψ φ = -120 ψ = +120 Such β-strands with φ = -ψ have no helicity. fully extended " β-strand structures! When non-helical β-strands line up with each other, beta-sheet structures are formed: i i i+2 i+4 i+1 i+3 i+5 i+2 i+4 i+1 i+3 i+5 Parallel β-sheet i i+2 i+4 i+1 i+3 i+5 i+5 i+4 The β-sheet structure is often called the β-pleated sheet as the φ =120 /ψ = -120 conformational preference above imparts a bend in each segment of the sheet. In each parallel and anti-parallel β-sheets, the sidechains are arranged in an alternating fashion above and below the plane of the sheet. i+3 i+2 i+1 Anti-parallel β-sheet i
amachandran Plots of Proteins If we plot the regions of the amachandran plot that we have so far discussed, we get the plot at right. At bottom right is a plot of 100,000 data points from X-ray structures of many proteins. Clearly, there is a quite generous region for α-helices as well as both parallel and anti-parallel β-sheets. We can also see the classical amachandran Plot diagram with the shapes that appear in many textbooks. This diagram accounts for other secondary structural motifs that are found occasionally in peptides and proteins including the π-helix, the 3 10 -helix, the left-handed α-helix and the collagen triple helix. +180 +120 +60 ψ (psi) 0-60 -120-180 "fully extended" flat β-sheet "orseshoe region" α -180-120 -60 0 +60 +120 +180 φ (phi) zero helicity Sterically indered for
Secondary Structure: Types of Beta-Sheets and Beta-Turns Antiparallel β-sheets frequently reverse direction by including a turn sequence of two or more amino acids. The most common turn motifs are the Type I and Type II β-reverse turns and the closely related Type I and Type II β-hairpin turns. Type I and Type II β-turns are related by amide plane inversion and display all four sidechains (i to i+4) on the same side of the β-sheet. i i+3 Type I i+1 i+2 i i+3 Type II Both parallel and antiparallel beta sheets can adopt structures displaying twists, barrels and combinations of parallel and anti-parallel strands. Alpha helices and beta sheets can also combine in special architectures that make up key domains of proteins. Below, α-chymotrypsin shows an antiparallel β-barrel, carbonic anhydrase shows a β-saddle and triosephosphate isomerase shows a parallel beta-alpha-beta barrel. i+2 i+1 α-chymotrypsin Carbonic Anhydrase (1CA2) Triosephosphate isomerase (β) Triosephosphate isomerase (1TIM) These are examples of Protein Tertiary Structure.