In previous lecture. Shannon s information measure x. Intuitive notion: H = number of required yes/no questions.

Similar documents
RIBOSOME: THE ENGINE OF THE LIVING VON NEUMANN S CONSTRUCTOR

Lecture IV A. Shannon s theory of noisy channels and molecular codes

Objective: You will be able to justify the claim that organisms share many conserved core processes and features.

CODING A LIFE FULL OF ERRORS

Aoife McLysaght Dept. of Genetics Trinity College Dublin

Using an Artificial Regulatory Network to Investigate Neural Computation

Genetic Code, Attributive Mappings and Stochastic Matrices

Edinburgh Research Explorer

Reducing Redundancy of Codons through Total Graph

Biology 155 Practice FINAL EXAM

A p-adic Model of DNA Sequence and Genetic Code 1

UNIVERSITY OF CAMBRIDGE INTERNATIONAL EXAMINATIONS General Certifi cate of Education Advanced Subsidiary Level and Advanced Level

Natural Selection. Nothing in Biology makes sense, except in the light of evolution. T. Dobzhansky

A modular Fibonacci sequence in proteins

CHEMISTRY 9701/42 Paper 4 Structured Questions May/June hours Candidates answer on the Question Paper. Additional Materials: Data Booklet

PROTEIN SYNTHESIS INTRO

A Minimum Principle in Codon-Anticodon Interaction

Genetic code on the dyadic plane

Lect. 19. Natural Selection I. 4 April 2017 EEB 2245, C. Simon

Lesson Overview. Ribosomes and Protein Synthesis 13.2

A Mathematical Model of the Genetic Code, the Origin of Protein Coding, and the Ribosome as a Dynamical Molecular Machine

The degeneracy of the genetic code and Hadamard matrices. Sergey V. Petoukhov

Mathematics of Bioinformatics ---Theory, Practice, and Applications (Part II)

Natural Selection. Nothing in Biology makes sense, except in the light of evolution. T. Dobzhansky

Protein Synthesis. Unit 6 Goal: Students will be able to describe the processes of transcription and translation.

THE GENETIC CODE INVARIANCE: WHEN EULER AND FIBONACCI MEET

Translation Part 2 of Protein Synthesis

Ribosome kinetics and aa-trna competition determine rate and fidelity of peptide synthesis

Protein Synthesis. Unit 6 Goal: Students will be able to describe the processes of transcription and translation.

Analysis of Codon Usage Bias of Delta 6 Fatty Acid Elongase Gene in Pyramimonas cordata isolate CS-140

ATTRIBUTIVE CONCEPTION OF GENETIC CODE, ITS BI-PERIODIC TABLES AND PROBLEM OF UNIFICATION BASES OF BIOLOGICAL LANGUAGES *

Three-Dimensional Algebraic Models of the trna Code and 12 Graphs for Representing the Amino Acids

The Physical Language of Molecules

Laith AL-Mustafa. Protein synthesis. Nabil Bashir 10\28\ First

From gene to protein. Premedical biology

Translation. Genetic code

The genetic code, 8-dimensional hypercomplex numbers and dyadic shifts. Sergey V. Petoukhov

Videos. Bozeman, transcription and translation: Crashcourse: Transcription and Translation -

From Gene to Protein

Crystal Basis Model of the Genetic Code: Structure and Consequences

Molecular Biology - Translation of RNA to make Protein *

Foundations of biomaterials: Models of protein solvation

From DNA to protein, i.e. the central dogma

In Silico Modelling and Analysis of Ribosome Kinetics and aa-trna Competition

Casting Polymer Nets To Optimize Molecular Codes

RNA & PROTEIN SYNTHESIS. Making Proteins Using Directions From DNA

Slide 1 / 54. Gene Expression in Eukaryotic cells

Abstract Following Petoukhov and his collaborators we use two length n zero-one sequences, α and β,

Flow of Genetic Information

Reading Assignments. A. Genes and the Synthesis of Polypeptides. Lecture Series 7 From DNA to Protein: Genotype to Phenotype

Get started on your Cornell notes right away

The Genetic Code Degeneracy and the Amino Acids Chemical Composition are Connected

(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.

Chemistry Chapter 26

In Silico Modelling and Analysis of Ribosome Kinetics and aa-trna Competition

GENETICS - CLUTCH CH.11 TRANSLATION.

BME 5742 Biosystems Modeling and Control

Practical Bioinformatics

Lecture 5. How DNA governs protein synthesis. Primary goal: How does sequence of A,G,T, and C specify the sequence of amino acids in a protein?

ومن أحياها Translation 1. Translation 1. DONE BY :Maen Faoury

2015 FALL FINAL REVIEW

What is the central dogma of biology?

Chapter 17. From Gene to Protein. Biology Kevin Dees

Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, and Arabidopsis

UNIT 5. Protein Synthesis 11/22/16

1. Contains the sugar ribose instead of deoxyribose. 2. Single-stranded instead of double stranded. 3. Contains uracil in place of thymine.

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:

SEQUENCE ALIGNMENT BACKGROUND: BIOINFORMATICS. Prokaryotes and Eukaryotes. DNA and RNA

Organic Chemistry Option II: Chemical Biology

L I F E S C I E N C E S

Gene Finding Using Rt-pcr Tests

Chapter

2013 Japan Student Services Origanization

NSCI Basic Properties of Life and The Biochemistry of Life on Earth

Advanced Topics in RNA and DNA. DNA Microarrays Aptamers

Multiple Choice Review- Eukaryotic Gene Expression

Degeneracy. Two types of degeneracy:

An algebraic hypothesis about the primeval genetic code

Lecture 7: Simple genetic circuits I

GCD3033:Cell Biology. Transcription

Molecular Biology (9)

Molecular Genetics Principles of Gene Expression: Translation

1. In most cases, genes code for and it is that

NIH Public Access Author Manuscript J Theor Biol. Author manuscript; available in PMC 2009 April 21.

Lecture 18 June 2 nd, Gene Expression Regulation Mutations

Biochemistry Prokaryotic translation

Supplementary Information for

Translation and the Genetic Code

Transcription Attenuation

CHAPTER4 Translation

Computational Biology: Basics & Interesting Problems

BIOLOGY 161 EXAM 1 Friday, 8 October 2004 page 1

Regulation and signaling. Overview. Control of gene expression. Cells need to regulate the amounts of different proteins they express, depending on

Chapters 12&13 Notes: DNA, RNA & Protein Synthesis

9/11/18. Molecular and Cellular Biology. 3. The Cell From Genes to Proteins. key processes

Midterm Review Guide. Unit 1 : Biochemistry: 1. Give the ph values for an acid and a base. 2. What do buffers do? 3. Define monomer and polymer.

Marshall Nirenberg and the discovery of the Genetic Code

Crick s early Hypothesis Revisited

9/2/17. Molecular and Cellular Biology. 3. The Cell From Genes to Proteins. key processes

Transcription:

In previous lecture Shannon s information measure H ( X ) p log p log p x x 2 x 2 x Intuitive notion: H = number of required yes/no questions. The basic information unit is bit = 1 yes/no question or coin flip. For two variables X, Y we can measure how much they tell about each other. Information has thermodynamic meaning: One can produce mechanical work from information (Maxwell s demon). Maximum Entropy yields the most probable distribution. 1

II. Living Information Overview: molecules, neurons, evolution, population.

Biological information is carried by molecular recognition Living systems I. Self-replicating information processors. Environment II. III. Evolve collectively. Made of molecules. Generic properties of molecular channels subject to evolution? Physical/information theory approach?

Living Information - Overview: molecules, neurons, population and evolution Living systems as information processors: Sources of information in Life: sequential information, cell composition, environment, population composition. Living information channels: (modernized) central dogma, replication, codes, receptors signaling pathways, population dynamics, quorum sensing. Information processing: circuits and their elements, neural networks, genetic networks. Information output: transcription level, decisions, cell fate, differentiation and development; feedback (information loop).

DNA Building blocks : Sequential information in DNA and proteins 4 nucleic bases = {A, T, G, C}. Building blocks: 20 amino acids. protein Polymer: DNA double-helix. Inert information storage ( tape ) Polymer = protein. Functional molecules ( constructor )

RNA intermediates can be both tapes and machines DNA RNA protein Primordial RNA world : RNA molecules are both information carriers (DNA) and executers (proteins).

Information in cellular composition Proteome, metabolome, lipome and other x-omes. Information can be represented as a composition vector. What is the relevant information?

Information in population structure R 2 H bit

Living Information - Overview: molecules, neurons, population and evolution Living systems as information processors: Sources of information in Life: sequential information, cell composition, environment, population composition. Living information channels: (modernized) central dogma, replication, codes, receptors signaling pathways, population dynamics, quorum sensing. Information processing: circuits and their elements, neural networks, genetic networks. Information output: transcription level, decisions, cell fate, differentiation and development; feedback (information loop).

Information in molecules: The central dogma The central dogma of molecular biology (Crick 1958, Nature 1970): Information about DNA sequence cannot be transferred back from protein to either protein or nucleic acid. In M. Nirenberg words: DNA makes RNA makes protein. How sequence information is transferred between information-carrying biopolymers? 3 carriers: DNA, RNA protein and 3 3 = 9 potential transfers: (i) (ii) 3 general transfers (occur in most cells). 3 special transfers (only under specific conditions) (iii) 3 unknown transfers (believed never to occur). General transfers: replication, transcription, translation.

From Crick s draft (1956)

special information transfers Reverse transcription (RNA DNA): In retroviruses (HIV) and eukaryotes (retrotransposons and telomere synthesis). RNA replication (RNA RNA): Many viruses replicate by RNA-dependent RNA polymerases (also used in eukaryotes for RNA silencing). Direct translation (DNA protein): demonstrated in extracts from E. coli which expressed proteins from foreign DNA templates.

Transfers outside the central dogma Posttranslational modification Protein amino acid sequence edited after translation by various enzymes. Methylation Changes in methylation of DNA alter gene expression levels (usually DNA methylase). Heritable change is called epigenetic. Effective information change but not primary DNA sequence Prions Proteins that propagate by making conformational changes in other molecules of the same protein. Information propagated is protein conformation.

Post-translational modifications of proteins : extends functionality by attaching other groups (e.g. acetate) changes chemical nature of amino acids. structural changes (disulfide bridges). enzymes may remove amino acids or cut the peptide chain in the middle. s

Epigenetic information transfer

Prions transfer folding state Prions propagate by transmitting mis-folded state. Chain reaction: conversion of properly folded proteins to prion form. Amyloid fold: polymer of tightly packed beta sheets Amyloids are fibrils grow at their ends and replicating by breaking.

Information transfer by inheritance: Self-replication Proposed demonstration of simple robot self-replication, from advanced automation for space missions, NASA conference 1980.

Von Neumann s universal constructor Self-reproducing machine: constructor + tape (1948/9). Program on tape: (i) retrieve parts from sea of spares. (ii) assemble them into a duplicate ; (iii) copy tape.. (1966)

Von Neumann s design allows open-ended evolution Motivated by biological self-replication: Construction universality. Evolvability. Key insight (before DNA) separation of information and function. Tape is read twice: for construction and when copied. How to design fast/accurate/compact constructor? mutations Implementation by Nobili & Pesavento (1995)

Ribosomes translate nucleic bases to amino acids Ribosomes are large molecular machines that synthesize proteins with mrna blueprint and trnas that carry the genetic code. protein large subunit Aminoacid trna genetic code amino-acid= (codon) small subunit Anti-codon trna mrna Goodsell, The Machinery of Life

Noise Ribosome needs to recognize the correct trna mrna Ribosome ϕ(c 4 ) ϕ(c 5 ) ϕ(c 1 ) ϕ(c 2 ) ϕ(c 3 ) protein ϕ(c i ) c i trnas c 1 c 2 c 3 c 4 c 5 c 6 c 7 c 8 c 9 c 10 c 11 c 12 trnas ϕ(c i ) c i Ribosome c j Accept trna Reject trna (i) binding wrong trnas: amino-acid (ii) unbinding correct trnas: amino-acid (codon) (codon) How to construct fast\accurate\small molecular decoder?

Noise Decoding at the ribosome is a molecular recognition problem trnas ϕ(c i ) c i Ribosome c j Accept trna Reject trna Central problem in biology and chemistry: How to evolve molecules that recognize in a noisy environment? (crowded, thermally fluctuating, weak interactions). How to estimate recognition performance ( fitness )? What are the relevant degrees-of-freedom? Dimension? Scaling? What is the role of conformational changes?

Ribosome sets physical limit on self-reproduction rate Large fraction of cell mass is ribosomes. For self-reproduction each ribosome should self-reproduce. Sets lower bound on self-reproduction rate. T 4 massribo 10 amino-acids R C 20 amino-acids/sec 500 sec mass ribo Fastest growing bacteria (Clostridium perfringens): T ~ 600 sec. T Problem: how ribosome accuracy affects fitness depends on (i) (ii) Basic protein properties (mutations). Biological context (environment etc.). error R W rate RC

Challenge of molecular coding Quality (Distortion) Molecular recognition in a noisy, crowded milieu. Many competing lookalikes. Weak recognition interactions ~ k B T. Synthesis of reliable organisms from unreliable components (von Neumann, Automata Stud., 1956) Cost (Rate) How to construct the molecular codes at minimal cost of resources.. Rate-distortion theory (Shannon 1956) D Goodsell

Genetic Code The genetic code is main information channel of Life S DNA 4 letters..acggagguaccc. M Protein 20 letters Thr Glu Val Pro Genetic code: maps 3-letter words in 4-letter DNA language (4 3 = 64 codons) to protein language of 20 amino acids. Proteins are amino acid polymers which perform most biological functions Diversity of amino-acids is essential to protein functionality.

The genetic code maps codons to amino-acids Molecular code = map relating two sets of molecules (spaces, languages ) via molecular recognition. Spaces defined by similarity of molecules (size, polarity etc.) amino 20 amino-acids acid trna UUU UUA 64 codons UCU UCA UAU UAA UGU UGA codon UUC UUG UCC UCG UAC UAG UGC UGG CUU CUA CCU CCA CAU CAA CGU CGA Genetic Code CUC AUU CUG AUA CCC ACU CCG ACA CAC AAU CAG AAA CGC AGU CGG AGA AUC AUG ACC ACG AAC AAG AGC AGG GUU GUA GCU GCA GAU GAA GGU GGA GUC GUG GCC GCG GAC GAG GGC GGG

The genetic code is a smooth mapping Amino-acid polarity M S Degenerate (20 out of 64). Compactness of amino-acid regions. Smooth (similar color of neighbors). Generic properties of molecular codes? -- to withstand noise at a minimal cost.

The genetic code maps DNA to protein Genetic code: maps 3-letter words in 4-letter DNA language (4 3 = 64 codons) to protein language of 20 amino acids. codon =, {A, T, G, C}. b1b 2b3 b i (codon) amino-acid. Genetic code embeds the codon-graph (Hamming graph) into space of amino-acids. G A T C 3 ϕ Genetic code charge polarity size Translation machinery, whose main component is the ribosome, facilitates the map.

Signal transduction Signal transduction: signaling molecule activates specific receptor on cell membrane. messenger transmits the signal into the cell.

Quorum sensing Quorum sensing: stimulus and response correlated to population density. Many bacteria use quorum sensing to vary gene expression according to the density of their local population. Some social insects use quorum sensing to locate the nest. Used in robot flocks.

Living Information - Overview: molecules, neurons, population and evolution Living systems as information processors: Sources of information in Life: sequential information, cell composition, environment, population composition. Living information channels: (modernized) central dogma, replication, codes, receptors signaling pathways, population dynamics, quorum sensing. Information processing: circuits and their elements, neural networks, genetic networks. Information output: transcription level, decisions, cell fate, differentiation and development; feedback (information loop).