Zhou Pei-Yuan Centre for Applied Mathematics, Tsinghua University November 2013 F. Piazza Center for Molecular Biophysics and University of Orléans, France Selected topic in Physical Biology Lecture 1 The basic constituents of biological matter
The elements of living matter
Chemical bonds: covalent, ionic
hydrogen and van der Waals. Induced-dipole/dipole interaction: $U(r) \propto r^{-6}$
Important energy scales in biological interactions. 1 kcal/mol = 4.184 kJ/mol. At room temperature (298 K), $k_B T \approx 0.59$ kcal/mol $\approx 0.026$ eV
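The room-temperature thermal energy can be checked with a few lines of Python. This is a minimal sketch: the function name `thermal_energy` and the dictionary keys are illustrative choices, not part of the lecture; the constants are the exact SI values.

```python
# Exact SI values of the physical constants
K_B = 1.380649e-23          # Boltzmann constant, J/K
N_A = 6.02214076e23         # Avogadro constant, 1/mol
EV_IN_JOULE = 1.602176634e-19  # 1 eV expressed in J
KJ_PER_KCAL = 4.184         # 1 kcal/mol = 4.184 kJ/mol


def thermal_energy(temperature_k):
    """Return k_B*T at the given temperature in several common units."""
    joule = K_B * temperature_k
    kj_per_mol = joule * N_A / 1000.0
    return {
        "kJ/mol": kj_per_mol,
        "kcal/mol": kj_per_mol / KJ_PER_KCAL,
        "eV": joule / EV_IN_JOULE,
        "pN nm": joule * 1e21,  # 1 pN nm = 1e-21 J
    }


for unit, value in thermal_energy(298.0).items():
    print(f"k_B T = {value:.4g} {unit}")
```

At 298 K this gives roughly 0.59 kcal/mol, 0.0257 eV and 4.1 pN nm, the last being the natural unit for single-molecule force experiments.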
The main molecules of life (A) DNA 4 nm (B) Hemoglobin PDB: 1hh0 (C) Phosphatidylcholine (D) Branched carbohydrate PDB: 1cap
Nucleic acids and proteins are polymer languages with different alphabets
DNA: sentences (genes) written by combining three-letter words (codons) (on 2% of its length!).
Transcription, then translation.
PROTEINS: sentences (3D folds) written by combining words (secondary structure motifs). 21-letter alphabet.
DNA alphabet: the 4 nucleotides (bases). R is a larger chemical group (sugar).
Purines: A (adenine), G (guanine).
Pyrimidines: C (cytosine), T (thymine).
CG: three hydrogen bonds. AT: two hydrogen bonds.
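The pairing rules can be turned into a small counting exercise. The sketch below assumes an ideal, fully paired duplex; the function names are illustrative and not part of the lecture.

```python
# Watson-Crick partners and hydrogen bonds per base pair (AT: 2, CG: 3)
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def reverse_complement(strand):
    """Return the complementary strand, read in its own 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def duplex_hydrogen_bonds(strand):
    """Count the hydrogen bonds holding a fully paired duplex together."""
    return sum(H_BONDS[base] for base in strand.upper())


sequence = "ATGC"
print(reverse_complement(sequence), duplex_hydrogen_bonds(sequence))
```

GC-rich sequences carry more hydrogen bonds per base pair than AT-rich ones.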
Genetic code. A gene is copied onto an mRNA molecule, which is then read by the ribosome to produce the corresponding protein. U (uracil) is very similar to thymine (T) and is used in mRNA instead.
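The two steps, transcription and translation, can be sketched in Python. This is a toy model under stated assumptions: the input is the coding strand of the gene, the standard genetic code (20 amino acids plus stop) is used, and translation starts at the first AUG. The function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Copy the DNA coding strand into mRNA: T is replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna, translate(mrna))
```

Amino acids are returned in one-letter code, so each protein is literally a sentence in the protein alphabet.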
Modeling biological elements: identifying the right length and time scales in the problem DNA
Proteins can be thought of in terms of different idealizations
Membranes can be thought of in terms of different idealizations
E. coli idealizations
The structure of biological macromolecules can be obtained with great precision by combining X-ray crystallography, NMR spectroscopy and cryo-electron microscopy.
By and large (at least to the structural biologist) a structure is a set of atomic coordinates
$\mathbf{r}_i = x_i\,\hat{\imath} + y_i\,\hat{\jmath} + z_i\,\hat{k}$
However, the atoms are constantly jiggling:
$\mathbf{r}_i(t) = x_i(t)\,\hat{\imath} + y_i(t)\,\hat{\jmath} + z_i(t)\,\hat{k}$
Figure: three representations of triose phosphate isomerase (PDB code 3tim)
Protein structures are organized in a hierarchical manner
Parts of the structure of macromolecules can be classified based on chemical groups
Macromolecules acquire functional sense when one recognizes that atoms are grouped together to form specific chemical groups. Proteins are formed by sequences of amino acids, whose specific sequence determines in a unique manner their 3D fold and biochemical properties.
Only C and H: unable to form H bonds with water, thus hydrophobic.
O or S plus H: highly reactive (make covalent bonds). Often found in active sites in enzymes (SER, THR, TYR, CYS).
Charged groups: often at the core of chemical specificity in molecular recognition. Also the N and C termini.
Participate in a variety of H bonds.
Post-translational modification: added to proteins as a basic regulatory unit.
The backbone of proteins is the same for all of them: the peptide bond. The peptide bond and the two adjacent C$_\alpha$ atoms are arranged in a planar configuration.
A key player in protein structure and dynamics: the hydrogen bond (H bond)
Hubbard, R. E. and Kamran Haider, M. 2010. Hydrogen Bonds in Proteins: Role and Strength. eLS.
Hydrogen bonds provide most of the directional interactions that underpin protein folding, protein structure and molecular recognition. The core of most protein structures is composed of secondary structures such as α helix and β sheet. This satisfies the hydrogen bonding potential between main chain carbonyl oxygen and amide nitrogen buried in the hydrophobic core of the protein.
A hydrogen bond is formed by the interaction of a hydrogen atom that is covalently bonded to an electronegative atom (donor) with another electronegative atom (acceptor).
Typical energies for H bonds in proteins: 1.9 to 6.9 kcal/mol
The key concepts
1. Hydrogen bonding confers rigidity to the protein structure and specificity to intermolecular interactions.
2. The accepted (and most frequently observed) geometry for a hydrogen bond is a distance of less than 2.5 Å (1.9 Å) between hydrogen and the acceptor and a donor-hydrogen-acceptor angle of between 90° and 180° (160°).
3. During protein folding, the burial of hydrophobic side chains requires intramolecular hydrogen bonds to be formed between the main chain polar groups.
4. The most stable conformations of polypeptide chains that maximize intra-chain hydrogen bonding potential are helices and sheets.
5. Specificity in molecular recognition is driven by the interaction of complementary hydrogen bonding groups on interacting surfaces.
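The geometric criterion in point 2 translates directly into a test on atomic coordinates. The sketch below is an illustration only: the function names and the convention of treating the thresholds as hard cutoffs are assumptions, and coordinates are taken to be in Å.

```python
import math


def hbond_geometry(donor, hydrogen, acceptor):
    """Return (H...A distance in Angstrom, D-H...A angle in degrees)."""
    h_to_d = [d - h for d, h in zip(donor, hydrogen)]
    h_to_a = [a - h for a, h in zip(acceptor, hydrogen)]
    dist_hd = math.sqrt(sum(c * c for c in h_to_d))
    dist_ha = math.sqrt(sum(c * c for c in h_to_a))
    cos_angle = sum(u * v for u, v in zip(h_to_d, h_to_a)) / (dist_hd * dist_ha)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding errors
    return dist_ha, math.degrees(math.acos(cos_angle))


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     max_distance=2.5, min_angle=90.0):
    """Apply the criterion: H...A distance < 2.5 A and angle between 90 and 180 degrees."""
    distance, angle = hbond_geometry(donor, hydrogen, acceptor)
    return distance < max_distance and angle >= min_angle


# Hypothetical collinear arrangement: D-H bond of 1.0 A, H...A distance of 1.9 A
print(is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)))
```

The angle is measured at the hydrogen, between the directions to the donor and to the acceptor, so a perfectly linear bond corresponds to 180°.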
Hydrophobic amino acids: tend to be found in the bulk (interior)
Polar and charged amino acids: able to make hydrogen bonds. They tend to be found on the surface and/or in secondary structure motifs
Models or cartoons? The act of drawing the molecular cartoons found in molecular biology books and research papers reflects choices about which features of the problem are really important and which can be ignored. In other words: draw, and then lay out the mathematics.
An example: the mitochondria, ATP-generating machines with their own DNA
Cryo-electron microscopic 3D reconstruction of a mitochondrion reveals its internal structure
2D or 3D representation? The conceptual elements are the same:
1. Mitochondria are closed, membrane-bound organelles.
2. The inner membrane has a complex folded structure (this greatly increases the total surface area available to membrane-bound ATP-producing machinery).
Conversion of chemical energy: high-energy electrons into ATP
Schematic illustration of the main proteins involved in the electron transport machinery leading to ATP synthesis.
MESSAGE: this cartoon represents an abstraction of the mitochondrion that emphasizes a completely different set of components and concepts.
The math behind the models: on the springiness of (biological) stuff
Energy: $E = \frac{1}{2} k x^2$. Force: $F = -k x$
This simple model comes up in biology in virtually all contexts, disguised in many ways:
1. Optical tweezers, atomic force microscopy
2. Bending of DNA
3. Beating of the flagellum of a swimming sperm
4. Membrane fluctuations
5. Protein dynamics
6. In biochemical reactions molecules accumulate in potential wells on energy landscapes
7. Changes in gene expression over time
The list includes both mechanical and non-mechanical models.
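A minimal numerical sketch of the spring model follows. The equipartition estimate of the thermal displacement, $\langle x^2 \rangle = k_B T / k$, is a standard result that is not stated on the slide, and the trap stiffness of 0.1 pN/nm is a hypothetical value chosen only for illustration.

```python
import math

K_B_T = 4.114  # thermal energy at 298 K in pN nm


def spring_energy(k, x):
    """Harmonic energy E = (1/2) k x^2."""
    return 0.5 * k * x ** 2


def spring_force(k, x):
    """Restoring force F = -dE/dx = -k x."""
    return -k * x


def rms_thermal_displacement(k, k_b_t=K_B_T):
    """Equipartition: <(1/2) k x^2> = (1/2) k_B T, so x_rms = sqrt(k_B T / k)."""
    return math.sqrt(k_b_t / k)


# Hypothetical optical trap with stiffness 0.1 pN/nm
stiffness = 0.1
print(f"x_rms = {rms_thermal_displacement(stiffness):.2f} nm")
print(f"force at 10 nm = {spring_force(stiffness, 10.0):.2f} pN")
```

For this stiffness a trapped bead wanders by a few nanometres under thermal agitation alone, which sets the noise floor for such a measurement.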
Coarse-grained models Image courtesy of
A Coarse-Grained DNA Model for Supercoil and Bubble Formation. Mehmet Sayar, Koç University
Structure-based coarse-grained models of proteins
A special class of models that we will be considering at length: network models.
The powerful idea: atomic details are not important in reproducing large-scale movements of the proteins.
All atoms from a given amino acid → one fictive particle with the total mass at the C$_\alpha$ site.
$H_{NM} = \sum_{i=1}^{N} \frac{p_i^2}{2 m_i} + \sum_{i>j} V_{ij}(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_i^0, \mathbf{r}_j^0)$
The $V_{ij}$ are pairwise potentials; $\mathbf{r}_i$ are the instantaneous positions and $\mathbf{r}_i^0$ the equilibrium positions (parameters).
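To make the network Hamiltonian concrete, here is a minimal sketch under loudly stated assumptions: the slide leaves the form of the pairwise potentials open, so the code picks harmonic springs, $V_{ij} = \frac{1}{2} k \left(|\mathbf{r}_i - \mathbf{r}_j| - |\mathbf{r}_i^0 - \mathbf{r}_j^0|\right)^2$, acting only between particles closer than a cutoff in the equilibrium structure. The spring constant, the cutoff and the three-particle chain are hypothetical illustration values.

```python
import math


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def network_hamiltonian(positions, momenta, masses, eq_positions,
                        k=1.0, cutoff=10.0):
    """Kinetic energy plus harmonic pairwise potentials (assumed form of V_ij)."""
    kinetic = sum(
        sum(pc * pc for pc in p) / (2.0 * m)
        for p, m in zip(momenta, masses)
    )
    potential = 0.0
    for i in range(len(positions)):
        for j in range(i):  # each pair i > j counted once
            d0 = distance(eq_positions[i], eq_positions[j])
            if d0 <= cutoff:  # springs only between neighbours at equilibrium
                d = distance(positions[i], positions[j])
                potential += 0.5 * k * (d - d0) ** 2
    return kinetic + potential


# Hypothetical chain of three particles spaced by 3.8 (a typical C-alpha spacing in Angstrom)
equilibrium = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
stretched = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (8.6, 0.0, 0.0)]
at_rest = [(0.0, 0.0, 0.0)] * 3
print(network_hamiltonian(stretched, at_rest, [1.0] * 3, equilibrium))
```

The equilibrium positions enter only as parameters: the energy vanishes when the instantaneous structure coincides with the reference structure and the particles are at rest.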
The essential toolbox of physical models in biology
Simple harmonic oscillator and harmonic fluctuations of protein conformations
Ideal gas and ideal solution models
Two-level systems and the Ising model
Random walks, entropy and macromolecular structure
Elastic theory of 1D rods and 2D membranes
Newtonian fluid model and the Navier-Stokes equation
Diffusion and random walks
Rate equation models of chemical kinetics
The role of estimates
Geometric mean rule for estimates. For cases in which we do not know the relevant magnitude, a powerful scheme is to guess an upper and lower bound and take their geometric mean:
$x \approx \sqrt{x_{\text{lower}}\, x_{\text{upper}}}$
Application of the geometric mean rule to guess the size of bacteriophages
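The rule is a one-liner in code. In the sketch below the bounds for the phage size are illustrative guesses made here, not values taken from the lecture: a phage should be larger than a single protein (about 10 nm) and smaller than a bacterium (about 1000 nm).

```python
import math


def geometric_mean_estimate(lower, upper):
    """Estimate an unknown magnitude as the geometric mean of two bounds."""
    if lower <= 0 or upper <= 0:
        raise ValueError("bounds must be positive")
    return math.sqrt(lower * upper)


# Illustrative bounds in nm (assumptions, see the text above)
estimate = geometric_mean_estimate(10.0, 1000.0)
print(f"estimated phage size: {estimate:.0f} nm")
```

The geometric mean sits halfway between the bounds on a logarithmic scale, which is the appropriate notion of "middle" when the bounds differ by orders of magnitude.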