Molecular Genetics Principles of Gene Expression: Translation

Paper No.: 16
Module: 13 Principles of gene expression: Translation

Development Team
Principal Investigator: Prof. Neeta Sehgal, Head, Department of Zoology, University of Delhi
Paper Coordinator: Prof. Namita Agrawal, Department of Zoology, University of Delhi
Content Writer: Dr. Sudhida Gautam, Hansraj College, University of Delhi; Dr. Kiran Bala, Deshbandhu College, University of Delhi
Content Reviewer: Dr. Surajit Sarkar, Department of Genetics, South Campus, Delhi University

Description of Module
Subject Name / Paper Name: Zool 016
Module Name/Title: Principles of Gene Expression: Translation
Module Id: 13
Keywords: translation, genetic code, codon, wobble hypothesis, polycistronic, initiation, elongation, termination, tRNA, acceptor arm, anticodon arm

Contents
1. Learning Outcomes
2. Introduction
3. Genetic Code
4. Wobble Hypothesis
5. Translation Ingredients
5.1 Charging of tRNA
5.2 Initiation
5.3 Elongation
5.4 Termination
6. Summary

1. Learning Outcomes
Second part of the central dogma of molecular biology
The concept of the genetic code, i.e. the triplet codon
Wobble hypothesis
Conversion of the genetic code to a polypeptide
RNA-directed protein synthesis
Mechanism involved in protein synthesis

2. Introduction
Translation: The process of protein synthesis

The central dogma of molecular biology explains that DNA codes for RNA, which codes for proteins (Figure 1). DNA is the molecule of heredity that passes from parents to offspring. It contains the instructions for building RNA and proteins, which make up the structure of the body and carry out most of its functions.

Figure 1: Central Dogma
Source: https://www.studyblue.com/notes/note/n/lecture-3-macromolecules/deck/15443630

Translation is the second step of the central dogma and describes how proteins are produced from the genetic code. It is the process in which ribosomes synthesize proteins using the mature mRNA transcript produced during transcription. Proteins, the building blocks of the body, are composed of amino acids linked together by peptide bonds. The ribosome (which can be thought of as a moving protein-synthesizing machine) binds near the 5′ end of the mRNA and moves towards the 3′ end. Protein synthesis proceeds through a series of sequential RNA-RNA interactions between the mRNA and the rRNA that holds the mRNA in the ribosome. The way in which a sequence of DNA bases is transcribed into complementary RNA bases and then translated into the corresponding amino acids is illustrated in Figure 2. Translation takes place in the cytoplasm, where the nucleotide sequence of the mRNA is converted to the amino acid sequence of a polypeptide.

Figure 2: Gene to protein
Source: https://online.science.psu.edu/sites/default/files/biol110/tutorial17_triplet_code.jpg

3. Genetic Code
Nucleotides (adenine, guanine, cytosine and thymine in DNA, with uracil replacing thymine in RNA) are read in sets of three to code for the twenty standard amino acids. George Gamow proposed that each amino acid is specified by a set of three nucleotides, which is known as a codon. He reached this conclusion by a bit of armchair logic: with four letters, the number of possible words is 4 (4^1, using one letter), 16 (4^2, using two letters) and 64 (4^3, using three letters). As there were 20 different amino acids, he concluded that each codon must contain 3 nucleotides (Figure 3). A single base cannot code for one amino acid because this would allow only four amino acids. A two-base code provides only sixteen possibilities, so a minimum of three bases is needed to specify a single amino acid.

Francis Crick, Sydney Brenner and colleagues at Cambridge University soon confirmed Gamow's proposal, i.e. the triplet nature of the codon. Because there are 61 sense codons but only 20 amino acids, it soon became clear that a single amino acid can be coded by more than one codon. Two adjectives that are often used to describe the genetic code are "unambiguous" and "redundant". Unambiguous means that the codons are fixed and that each codon specifies only one amino acid. For example, the DNA template triplet ACC (mRNA codon UGG) codes for tryptophan and nothing else. Redundant means that several codons may code for a single amino acid. For example, the DNA template triplets CAA, CAC, CAG and CAT (mRNA codons GUU, GUG, GUC and GUA) all code for a single amino acid, valine.
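Gamow's counting argument and the triplets quoted in the examples above can be checked with a short Python sketch. It assumes that triplets such as ACC and CAA are written as DNA template-strand triplets, so each base is replaced by its RNA complement to obtain the mRNA codon. The function names are illustrative and not part of the module.

```python
from itertools import product

RNA_BASES = "ACGU"


def count_words(length):
    """Number of distinct words of the given length over the four bases."""
    return sum(1 for _ in product(RNA_BASES, repeat=length))


def minimum_codon_length(n_amino_acids=20):
    """Smallest word length that yields at least n_amino_acids words."""
    length = 1
    while count_words(length) < n_amino_acids:
        length += 1
    return length


# Assumption: the triplet is a DNA template-strand triplet, paired base by base.
TEMPLATE_TO_MRNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_to_codon(triplet):
    """mRNA codon complementary to a DNA template triplet."""
    return "".join(TEMPLATE_TO_MRNA[base] for base in triplet)


for n in (1, 2, 3):
    print(n, count_words(n))          # 1 4, then 2 16, then 3 64
print(minimum_codon_length())         # 3
print(template_to_codon("ACC"))       # UGG (tryptophan)
print([template_to_codon(t) for t in ("CAA", "CAC", "CAG", "CAT")])
# ['GUU', 'GUG', 'GUC', 'GUA'] (all valine codons)
```

One- and two-letter words give only 4 and 16 combinations, so three is the shortest word length that can cover 20 amino acids.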

Figure 3: Triplet codon
Source: https://online.science.psu.edu/sites/default/files/biol110/tutorial17_genetic_code.jpg

Characteristic features of the genetic code:
a. The genetic code is triplet in nature, i.e. three nucleotides code for a single amino acid.
b. The genetic code is continuous: there is no break while reading the mRNA. The mRNA is read as a comma-free series of three-nucleotide units without skipping any nucleotide.
c. The genetic code does not overlap.
d. The genetic code is degenerate: most amino acids have more than one codon, the exceptions being AUG for methionine and UGG for tryptophan.
e. The start codon (AUG) and stop codons (UAA, UAG, UGA) are the same in both prokaryotes and eukaryotes.

The genetic code was cracked by Marshall Nirenberg and Johann Heinrich Matthaei (1961), who determined the codon (sequence of bases) for phenylalanine. The enzyme polynucleotide phosphorylase, which can link RNA nucleotides without the need for a template, was used to create synthetic RNAs. The first synthetic RNAs they used were homopolymers (containing only a single type of nucleotide). By supplying the enzyme with uracil nucleotides only, they obtained poly U RNA molecules, which contain only UUU codons. This poly U RNA was added to 20 test tubes, each containing all 20 amino acids and the components necessary for translation. In each test tube a different single amino acid was radiolabelled. They found that radioactive protein appeared only in the one test tube that contained labelled phenylalanine (Figure 4). Thus, it was proved that UUU codes for

phenylalanine. Later, the codons for lysine, glycine and proline were worked out in the same way. Today we know the sequence of bases that codes for each amino acid found in proteins.

Figure 4: Nirenberg and Matthaei's experiment to determine the genetic code
Source: https://prezi.com/5xwqsjnonjq6/the-nirenberg-and-matthaei-experiment/

4. Wobble Hypothesis
Francis Crick (1966) proposed the wobble hypothesis, according to which base pairing is flexible at the third position of the codon. For example, alanine is coded by GCU, GCA, GCC and GCG. All four codons begin with GC, but the base at the third position varies. The non-standard pairing tolerated at this position is referred to as wobble pairing, and the third base as the wobble base (Figure 5).
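The pattern described for alanine can be seen across the whole code by grouping codons by the amino acid they specify. The sketch below is illustrative only: it assumes the standard genetic code, written with one-letter amino acid symbols and `*` for stop codons, and builds the table from a compact 64-character string. The names `CODON_TABLE` and `codons_by_amino_acid` are not from the module.

```python
from collections import defaultdict

# Standard genetic code in U, C, A, G order for each codon position
# ('*' marks a stop codon); assumed here, not taken from the module text.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def codons_by_amino_acid():
    """Group the 64 codons by the amino acid (or stop signal) they encode."""
    groups = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        groups[amino_acid].append(codon)
    return dict(groups)


groups = codons_by_amino_acid()
print(groups["A"])                                # ['GCU', 'GCC', 'GCA', 'GCG']
print({codon[:2] for codon in groups["A"]})       # {'GC'}
print(sorted(aa for aa, codons in groups.items() if len(codons) == 1))  # ['M', 'W']
print(sum(len(c) for aa, c in groups.items() if aa != "*"))             # 61
print(CODON_TABLE["UUU"])                         # F (phenylalanine)
```

The four alanine codons differ only at the third base, methionine and tryptophan are the only amino acids with a single codon, and 61 of the 64 codons are sense codons.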

Figure 5: Normal and wobble pairing
Source: http://www.biomers.net/en/products/dna/sequence_modifications.html

5. Translation Ingredients
The three essential components of translation are mRNA (the template), tRNA (the transporter of amino acids) and the ribosome subunits (the machinery that assembles proteins).

mRNA (template): The mRNA codon is recognized by the tRNA anticodon. mRNA is complementary to the template strand of the gene and carries the genetic information from the DNA to the ribosome. The nucleotides of mRNA are read as a series of triplets, referred to as codons, each of which codes for a corresponding amino acid. The genetic code is triplet, continuous, non-overlapping and universal. It is redundant, or degenerate, because a single amino acid can have more than one codon. The start and stop codons are the same whichever protein is being made: AUG is the initiation (start) codon, and the three stop codons are UGA, UAG and UAA.

tRNA (transporter of amino acid): tRNA has an L-shaped tertiary structure and a cloverleaf-like secondary structure bearing four arms, formed by the folding of a single strand of about 80 ribonucleotides (Figure 6). It acts as an interpreter between nucleic acid and peptide sequences by picking up the amino acids that correspond to the codons in the mRNA. Based on base complementarity, the secondary structure of tRNA can be drawn as the familiar cloverleaf with four base-paired stems, each consisting of four to seven Watson-Crick type base pairs. Five regions of the tRNA are not base paired: the acceptor end (3′ end), the DHU loop (named for the presence of dihydrouridine), the anticodon loop (which interacts with the mRNA), the "extra arm" and the TψC loop (named for the presence of the pseudouridine base).

Acceptor arm / 3′ end: The amino acid that corresponds to the mRNA codon is attached at the 3′ end of the tRNA, which has the sequence CCA. The proper amino acid is joined to the tRNA by the enzyme aminoacyl-tRNA synthetase.
There is one type of aminoacyl-tRNA synthetase for each amino acid and the active site of each fits only the specific combination of the proper amino acid

and tRNA. This binding of the tRNA and amino acid is highly specific, and the amino acid binds to the adenine nucleotide at the 3′ end of the tRNA.

Anticodon arm: Its loop, at the bottom of the cloverleaf, carries three nucleotides (the anticodon) that are complementary to the mRNA codon. A single tRNA can recognize different codons for the same amino acid, because base pairing between the anticodon and the codon is relaxed at the third position of the codon. This relaxation is described by the wobble hypothesis, and the position of this third nucleotide is called the wobble position.

TψC loop: Contains a pseudouridine base.
DHU loop: Contains dihydrouridine bases.

Figure 6: Structure of tRNA
Source: http://higheredbcs.wiley.com/legacy/college/boyer/0471661791/structure/trna/trna.htm

Ribosomes (machinery to assemble proteins): A ribosome consists of two parts, a large subunit and a small subunit, each made up of rRNA and proteins. A functional ribosome is formed when the two subunits attach to each other. Each ribosome has three binding sites for tRNA (Figure 7):
1. The P-site (peptidyl site)
2. The A-site (aminoacyl site)
3. The E-site (exit site)
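The relaxed third-position pairing described above for the anticodon arm can be sketched in Python. The pairing rules in `WOBBLE_PAIRS` are those usually attributed to Crick's wobble hypothesis, including inosine (I), a modified base found at the first anticodon position of some tRNAs. Neither the rules nor the function name are spelled out in the module, so treat this as an illustration under those assumptions.

```python
# Standard Watson-Crick pairing, used at codon positions 1 and 2.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed wobble rules: first (5') anticodon base -> third-position codon
# bases it can pair with. "I" is inosine.
WOBBLE_PAIRS = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}


def codons_read_by(anticodon):
    """Codons (5'->3') that an anticodon (written 5'->3') can recognize.

    Codon and anticodon pair antiparallel, so the 5' base of the anticodon
    faces the third (wobble) base of the codon.
    """
    wobble_base, middle_base, last_base = anticodon
    codon_first = WATSON_CRICK[last_base]
    codon_second = WATSON_CRICK[middle_base]
    return [codon_first + codon_second + third for third in WOBBLE_PAIRS[wobble_base]]


print(codons_read_by("IGC"))  # ['GCU', 'GCC', 'GCA'] (three alanine codons)
print(codons_read_by("GAA"))  # ['UUC', 'UUU'] (both phenylalanine codons)
print(codons_read_by("CAU"))  # ['AUG'] (methionine)
```

Under these rules, a single tRNA with the anticodon 5′-IGC-3′ can read three of the four alanine codons. This is why fewer tRNAs than sense codons are needed.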

Figure 7: Structure of Ribosome
Source: https://online.science.psu.edu/biol011_sandbox_7239/node/7395

Process of translation
Translation consists of four major events:

5.1. Charging of tRNA
An aminoacyl-tRNA synthetase attaches an amino acid to its tRNA molecule to give aminoacyl-tRNA. There are 20 different aminoacyl-tRNA synthetases, one for each amino acid. Each synthetase recognizes its own amino acid on the basis of the shape, size, charge and R group of the amino acid. The reaction takes place in two steps and uses the energy of one ATP molecule. First, the amino acid and ATP bind to the specific aminoacyl-tRNA synthetase, which forms an enzyme-bound aminoacyl-AMP intermediate. Second, the aminoacyl-AMP-enzyme complex binds the uncharged tRNA, and the amino acid is transferred to the tRNA with the release of an AMP molecule. The enzyme then returns to its original configuration and is ready to bind the next amino acid and ATP molecule to start the process all over again (Figure 8).
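The two steps of charging can be summarized as a pair of reactions. This is a schematic sketch only: the synthetase, which holds the aminoacyl-AMP intermediate throughout, is left out of both sides, and PP_i denotes the pyrophosphate split off from ATP in the first step.

```latex
\begin{align*}
\text{amino acid} + \text{ATP} &\longrightarrow \text{aminoacyl-AMP} + \text{PP}_i \\
\text{aminoacyl-AMP} + \text{tRNA} &\longrightarrow \text{aminoacyl-tRNA} + \text{AMP}
\end{align*}
```

Adding the two lines gives the overall reaction: amino acid + ATP + tRNA yields aminoacyl-tRNA + AMP + PP_i.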

Figure 8: Charging of tRNA
Source: http://www.mun.ca/biology/scarr/igen3_06-10.html

5.2. Initiation
Initiation involves an mRNA molecule, the ribosomal subunits, initiation factors (protein factors), a specific initiator tRNA charged with N-formylmethionine, GTP and magnesium ions. The small subunit (30S) of the ribosome binds to a site "upstream" (on the 5′ side) of the AUG called the ribosome-binding site (RBS). In prokaryotes this RBS is commonly known as the Shine-Dalgarno sequence. This purine-rich region pairs with the pyrimidine-rich 3′ end of the 16S rRNA (part of the small subunit of the ribosome). If a mutation occurs in the Shine-Dalgarno sequence of a particular mRNA, or in the corresponding region of the 16S rRNA, the pairing fails and that mRNA is not translated. No corresponding sequence occurs in eukaryotic mRNA. Instead, the small subunit of the eukaryotic ribosome, along with the initiation factors, recognizes the cap at the 5′ end of the mRNA, which plays a crucial role in the initiation of translation.

In prokaryotes, IF3 prevents the binding of the large ribosomal subunit prior to mRNA binding (Figure 9) (Table 2). The complex of IF2, GTP and the initiator tRNA (charged with N-formylmethionine) binds to the small ribosomal subunit along with IF1 to form the 30S initiation complex. In the final step of initiation IF3 is released, which allows the large subunit of the ribosome to bind. The charged tRNA remains at the P-site as the large ribosomal subunit joins the small subunit, giving rise to the 70S initiation complex. Hydrolysis of GTP to GDP and Pi releases IF1 and IF2. In eukaryotes, the small-subunit

initiation complex proceeds downstream (5′→3′) from the mRNA cap until it encounters the start codon AUG (the region between the mRNA cap and the AUG is known as the 5′-untranslated region [5′-UTR]). The recognition of the start codon in eukaryotes is facilitated by the presence of a consensus sequence known as the Kozak sequence (5′-ACCAUGG-3′). The initiator tRNA is the only tRNA that can bind directly at the P-site. The P-site is so named because, with the exception of the initiator tRNA, it binds only a peptidyl-tRNA molecule, that is, a tRNA with the growing peptide attached to it.

Figure 9: Initiation of Translation
Source: http://www.scielo.cl/scielo.php?pid=s0716-97602005000200003&script=sci_arttext&tlng=en

5.3. Elongation
During elongation an aminoacyl-tRNA binds to the ribosome and a peptide bond is formed between two amino acids. Translocation of the tRNA from the P-site to the E-site is followed by its release from the E-site and the binding of a new aminoacyl-tRNA at the A-site. In this way the polypeptide chain is built one amino acid at a time, as specified by the template mRNA (Figure 10). Elongation consists of three prominent steps:

a) Binding of a charged tRNA to the ribosome: An aminoacyl-tRNA (a tRNA covalently bound to its amino acid) able to base pair with the next codon on the mRNA arrives at the A-site. This charged tRNA is associated with an elongation factor (called EF-Tu in bacteria; EF-1 in eukaryotes) and GTP (the source of the needed energy). Binding of the tRNA at the A-site results in GTP hydrolysis and release of the elongation factor as EF-Tu·GDP. The elongation factor EF-Ts helps to regenerate EF-Tu·GTP from EF-Tu·GDP.

b) Formation of a peptide bond between the amino acids: The ribosome holds the amino acids carried by the two tRNAs in the correct position so that a peptide bond is formed between them. The bond between the amino acid and the tRNA in the P-site is cleaved, and a peptide bond is formed between the released amino acid (fMet in the first cycle) and the amino acid attached to the tRNA at the A-site. Once the bond is formed, the tRNA at the P-site is uncharged and the tRNA at the A-site carries the first two amino acids. Formation of the peptide bond is catalysed by the rRNA in the large subunit of the ribosome. This rRNA thus acts as a ribozyme.

c) Translocation of the ribosome along the mRNA one codon at a time: Movement of the ribosome in the 5′→3′ direction along the mRNA positions it over the next codon, where a new charged tRNA can bind. As the ribosome moves downstream, the tRNAs at the P-site and A-site remain attached to the mRNA through codon-anticodon pairing. The shift moves the tRNAs from the P-site to the E-site and from the A-site to the P-site, respectively. It also vacates the A-site, which is now ready to receive the new charged tRNA specified by the next codon.

Figure 10: Translation elongation
Source: http://www.bio.miami.edu/dana/250/250ss10_9.html

5.4. Termination
Termination is signalled by the stop codons UGA, UAG and UAA. Translation stops when the ribosome encounters a stop codon (Figure 11). There are no tRNA molecules with anticodons for stop codons, so when the ribosome reaches a stop codon no tRNA enters the A-site. Instead, the stop codon is recognized with the help of release factors, which triggers release of the polypeptide from the tRNA, release of the tRNA from the ribosome and dissociation of the ribosome complex. A release factor (RF-1 or RF-2) binds at the A-site, whereas RF-3 coupled with GTP attaches to the large subunit of the ribosome. The release factor RF-1 recognizes the stop codons UAA and UAG, whereas release factor RF-2

recognizes UAA and UGA. A third release factor, RF-3, is also required for translational termination. The polypeptide is cleaved from the tRNA in the P-site along with the hydrolysis of GTP. The ribosome splits into its subunits, which can later be reassembled for another round of protein synthesis.

Figure 11: Translation termination
Source: http://www.nobelprize.org/educational/medicine/dna/a/translation/termination.html

6. Summary
Translation is the process of protein synthesis, which makes use of the mRNA produced by transcription. It takes place in the cytoplasm; the mRNA is read from the 5′ to the 3′ end and the polypeptide is formed in the N-terminal to C-terminal direction. Twenty different amino acids, linked together by peptide bonds, are used in the formation of proteins. In total there are 64 codons, of which 61 are sense codons and three are stop codons. The third base of a codon pairs flexibly and is referred to as the wobble position. The components required include aminoacyl-tRNA synthetases, mRNA, tRNA, ribosomes and ATP to provide energy. Ribosomes contain a binding site for mRNA and three binding sites for tRNA (the A-, P- and E-sites). The genetic code is a triplet code that specifies the order in which individual amino acids are joined. It is degenerate (redundant), non-overlapping and universal. Translation takes place in four phases: charging of tRNA, initiation, elongation and termination. AUG is the start codon and specifies methionine as the first amino acid. Methionine is attached to its tRNA by an aminoacyl-tRNA synthetase, using ATP as the energy source. This charged tRNA is referred to as the initiator tRNA

(it has the unique property of binding directly at the P-site). Prokaryotes have N-formylmethionine (fMet) as the first amino acid, whereas in eukaryotes it is methionine (Met). First the 30S initiation complex is formed; release of IF3 from it makes way for the 50S (large) ribosomal subunit, forming the 70S initiation complex. Elongation proceeds in the 5′→3′ direction along the mRNA; an incoming charged tRNA attaches at the vacant A-site. A peptide bond is formed between the first and second amino acids, connecting the amino acid of the tRNA in the P-site to the amino acid of the tRNA in the A-site. As translation proceeds, the ribosome moves along the mRNA and the tRNAs shift from the P-site to the E-site and from the A-site to the P-site, respectively. This vacates the A-site for the binding of the next charged tRNA. This pattern continues, and the ribosome translates the mRNA molecule until it encounters a termination codon.

Termination takes place on encountering one of the stop codons UAA, UAG and UGA. When a ribosome reaches a stop codon, the A-site accepts a protein called a release factor instead of a tRNA. The release factor breaks the bond between the tRNA and the polypeptide. As a result, the polypeptide chain is released from the tRNA molecule and the entire machinery dissociates. The newly formed polypeptide chain gives rise to a functional protein after undergoing several modifications. A single mRNA is often translated by several ribosomes simultaneously; such assemblies are referred to as polyribosomes or polysomes.
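The overall flow summarized above (find the start codon, read successive codons 5′→3′, stop at a termination codon) can be sketched in a few lines of Python. This is a deliberately simplified illustration. It assumes the standard genetic code in one-letter amino acid symbols, takes the first AUG as the start site, and ignores initiation factors, the Shine-Dalgarno or Kozak context, tRNA charging and the energy cost. The function name `translate` and the example mRNA are made up for the demonstration.

```python
# Standard genetic code in U, C, A, G order ('*' marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate an mRNA (5'->3') into a peptide (N- to C-terminal).

    Initiation: scan for the first AUG. Elongation: read one codon at a
    time without gaps or overlap. Termination: stop at UAA, UAG or UGA.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for position in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[position:position + 3]]
        if amino_acid == "*":
            break  # stop codon: a release factor, not a tRNA, enters the A-site
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical mRNA: 5'-UTR, then AUG UUU GCU UGG, then the stop codon UAA.
print(translate("GGAGGUAACAUGUUUGCUUGGUAAGC"))  # MFAW
```

The peptide begins with methionine (M) because AUG serves as both the start signal and the methionine codon, and the bases after the stop codon are never read.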