Exploring the Sequence Dependent Structure and Dynamics of DNA with Molecular Dynamics Simulation
Sarah Harris
School of Physics and Astronomy, University of Leeds

Introduction
Calculations of the charge transport properties of complex biomolecules such as DNA are, in general, extremely difficult because these systems are both large and flexible.
Aim: describe some relevant results from classical MD simulations:
i) Insight into sequence and environment dependent DNA structure.
ii) A description of the dynamic properties of DNA and the methods developed to quantify them.
iii) A preliminary series of MD simulations to show the effect of charged bases on DNA dynamics.

The Structure of Duplex DNA
[Figure: Watson-Crick base pairs. Adenine-thymine, joined by 2 H-bonds; guanine-cytosine, joined by 3 H-bonds.]

Higher Order DNA Structures
i) A triplex DNA structure.
ii) Guanine-rich DNA can form folded quadruplex structures.
iii) Certain quadruplexes are associated with a continuous channel of counterions.

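The Flow of Genetic Information
[Figure: the gene expression pathway. In the nucleus, RNA polymerase transcribes the DNA (promoter of transcription, start codon, coding region, stop codon, terminator of transcription) into nascent mRNA. Splicing removes the introns (non-coding) and joins the exons (coding) to give mature mRNA, which is translated into protein at the ribosome with the help of tRNA.]
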
Importance of Charge Transport in DNA Damage
The genome is under continuous chemical attack (generally oxidative), which can result in dangerous mutations.
GG and GGG sequences are preferentially oxidised, even when the initial oxidation event occurs at a remote position in the sequence.
Such damage propagation has been observed in intact cell nuclei [1].
GGG-rich motifs occur disproportionately at the termini of intron regions, ideally positioned to sacrificially protect the coding regions of genes [2].
1. Nunez M. E., Holmquist G. P. & Barton J. K. (2001) Biochemistry, 40, 12465-12471.
2. Friedman K. A. & Heller A. (2001) J. Phys. Chem. B, 105, 11859-11865.

Charge Transport in Solution
Excite the tethered photooxidant [Rh(phi)2bpy]3+.
Vary the DNA sequence, add binding proteins, etc.
A hole is injected into the DNA, which oxidises a distant GG or GGG.
Oxidative damage can occur up to 200 Å (~60 base pairs) from the site of hole injection.
The relative charge transport efficiency can be measured by detecting the amount of damage using biochemical methods.
[Figure: DNA duplex with the tethered photooxidant and a distant GGG motif.]
Williams T. T., Odom D. T. & Barton J. K. (2000) J. Am. Chem. Soc. 122, 9048-9049.

DNA Dynamics and Charge Transport
The sequence dependence of charge transport efficiency remains poorly understood.
Suggested mechanisms for electron/hole transport include:
i) Superexchange (~3-4 base pairs)
ii) Thermally activated hopping
iii) Polaron hopping
iv) Conformationally gated hopping through charge transport active domains [1,2]
What role is played by the thermal fluctuations of the DNA, and which dynamic timescales are associated with the most important motions?
1. O'Neill M. & Barton J. K. (2004) J. Am. Chem. Soc. 126, 11471-11483.
2. Shao F., O'Neill M. & Barton J. K. (2004) Proc. Natl. Acad. Sci. USA 101, 17914-17919.

The Importance of Sequence Dependent Structure and Dynamics
The structure and flexibility of DNA must be highly sequence dependent, since DNA binding proteins must recognise specific binding sites to exert cellular control.
Although much work has been done on quantifying sequence dependent structure by X-ray crystallography and NMR, the sequence dependent dynamics of DNA remain poorly understood.
Much of the dynamic behaviour is not accessible to analytical theory, so computer simulation is required.

Sequence Specific Recognition by Proteins
[Figure panels: the TATA box protein-DNA complex; repair of a G-U mismatch.]
Barratt T. E. et al. (1999) EMBO J. 18, 6599.

Sequence Dependent Structure
Different DNA sequences have subtly different structures.
For example, a run of AT bases gives the DNA a particularly narrow minor groove, which is responsible for the "spine of hydration".
The precise position of chemical groups (e.g. those forming H-bonds) also depends on the DNA sequence.
[Figure: the spine of hydration in A-tract DNA.]

Changes in Structure Due to DNA Environment
[Figure panels:]
Canonical B-form DNA.
A-form DNA, present in water/methanol mixtures.
Left-handed Z-form DNA, present at very high salt concentrations.

The Hierarchy of Dynamic Timescales
Picosecond
  Type of internal motion: local oscillations of groups of atoms, with amplitudes of 0.1 Å.
  Energy of activation: E = 0.6 kcal/mol. Source: external thermal reservoir.
  Experimental methods: NMR, Raman spectroscopy, X-ray.
  Theoretical methods: molecular dynamics; harmonic analysis.
Nanosecond
  Type of internal motion: bending and twisting motions of the double chain, with amplitudes of 5-7 Å.
  Energy of activation: E = 2-5 kcal/mol. Source: collisions with hot solvent molecules.
  Experimental methods: NMR, Raman spectroscopy, fluorescence.
  Theoretical methods: molecular dynamics; harmonic analysis; rod-like model.
Microsecond
  Type of internal motion: bending, winding and unwinding of the double helix; opening of base pairs of the DNA.
  Energy of activation: E = 5-20 kcal/mol. Source: changing of pH; increasing temperature; action of denaturation agents.
  Experimental methods: NMR, hydrogen exchange.
  Theoretical methods: theory of helix-coil transition; non-linear mechanics.

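The AMBER Forcefield
The molecule is considered as a collection of atoms interacting through simple, classical potential energy functions.
[Figure: the interactions in the model: electrostatic repulsion, H-bonds, van der Waals forces and covalent bonds.]
The simple potential energy function is fitted empirically for each specific interaction through a series of constants: the AMBER force field parameters.
\[
U_{\mathrm{total}} = \sum_{\mathrm{bonds}} K_r (r - r_{eq})^2
+ \sum_{\mathrm{angles}} K_\theta (\theta - \theta_{eq})^2
+ \sum_{\mathrm{dihedrals}} \frac{V_n}{2}\left[1 + \cos(n\phi - \gamma)\right]
+ \sum_{i<j}^{\mathrm{atoms}} \varepsilon_{ij}\left[\left(\frac{R_{ij}}{r_{ij}}\right)^{12} - 2\left(\frac{R_{ij}}{r_{ij}}\right)^{6}\right]
+ \sum_{i<j}^{\mathrm{atoms}} \frac{q_i q_j}{\varepsilon r_{ij}}
\]
The five terms are, in order: bonds, angles, dihedrals, van der Waals interactions and electrostatics (partial charges).
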
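As a rough illustration of how the five terms of the potential energy function are evaluated, the sketch below computes each contribution for a single bond, angle, dihedral or atom pair. It is not the AMBER code: the function names, the units (kcal/mol, Å, radians, elementary charges) and the Coulomb conversion factor of about 332.06 kcal Å/(mol e^2) are assumptions made for this example.

```python
import numpy as np

# Assumed conversion factor so that charges in units of e and distances
# in Å give energies in kcal/mol (approximate value, for illustration).
COULOMB_CONSTANT = 332.06


def bond_energy(r, k_r, r_eq):
    """Harmonic bond stretch: K_r (r - r_eq)^2."""
    return k_r * (r - r_eq) ** 2


def angle_energy(theta, k_theta, theta_eq):
    """Harmonic angle bend: K_theta (theta - theta_eq)^2, angles in radians."""
    return k_theta * (theta - theta_eq) ** 2


def dihedral_energy(phi, v_n, n, gamma):
    """Periodic torsion: (V_n / 2) [1 + cos(n phi - gamma)]."""
    return 0.5 * v_n * (1.0 + np.cos(n * phi - gamma))


def lennard_jones_energy(r_ij, eps_ij, r_min_ij):
    """Van der Waals term: eps [(R/r)^12 - 2 (R/r)^6], minimum -eps at r = R."""
    ratio6 = (r_min_ij / r_ij) ** 6
    return eps_ij * (ratio6 ** 2 - 2.0 * ratio6)


def coulomb_energy(r_ij, q_i, q_j, dielectric=1.0):
    """Electrostatic term between two partial charges: q_i q_j / (eps r_ij)."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r_ij)
```

The total energy is the sum of these contributions over all bonds, angles, dihedrals and non-bonded atom pairs. Written this way, the van der Waals term reaches its minimum value of -eps exactly at the separation R, which is why the factor of 2 appears in front of the sixth-power term.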
Contents of the Simulation Cell
The simulation cell contains:
i) The DNA.
ii) Sufficient Na+ counterions to neutralise the system.
iii) Enough water molecules to surround the DNA.

A Molecular Dynamics Simulation of DNA
Obtain the positions of all atoms in the system over timescales from ~2 fs to 50 ns.
The most accurate simulations include water and counterions explicitly (~700 solute and ~3000 solvent atoms) and use the particle mesh Ewald (PME) method to calculate long range electrostatics.

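To make the idea of "obtaining the positions of all atoms over time" concrete, here is a minimal sketch of a time-stepping loop. The slide does not say which integrator was used, so velocity Verlet is an assumption, as is the reading of 2 fs as the integration time step and 50 ns as the total length. The harmonic force is a toy stand-in for the full force field, and all function names are hypothetical.

```python
import numpy as np


def velocity_verlet(positions, velocities, masses, force_fn, dt, n_steps):
    """Integrate Newton's equations of motion with the velocity Verlet scheme.

    Returns the stored positions, shape (n_steps + 1, ...), and the final
    velocities. force_fn(x) must return the force for positions x.
    """
    x = np.array(positions, dtype=float)
    v = np.array(velocities, dtype=float)
    m = np.asarray(masses, dtype=float)
    f = force_fn(x)
    trajectory = [x.copy()]
    for _ in range(n_steps):
        v += 0.5 * dt * f / m   # first half-step velocity update
        x += dt * v             # full-step position update
        f = force_fn(x)         # forces at the new positions
        v += 0.5 * dt * f / m   # second half-step velocity update
        trajectory.append(x.copy())
    return np.array(trajectory), v


def harmonic_force(x, k=1.0):
    """Toy force law (assumed for illustration): F = -k x."""
    return -k * x


def n_steps_required(total_time_fs, time_step_fs):
    """Number of integration steps needed to cover total_time_fs."""
    return total_time_fs // time_step_fs


# Under the assumed reading of the slide, 50 ns at a 2 fs time step
# corresponds to 25 million integration steps.
steps_for_50_ns = n_steps_required(50_000_000, 2)
```

In a real simulation force_fn would evaluate the full force field for every atom, which is why the cost of the long range electrostatics dominates the run time.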
The Structure of DNA in a Vacuum
MD simulations of DNA in the gas phase, based on electrospray data, show that the DNA does not remain in its B-form configuration.
[Figure: in vacuo DNA structures after 100 ns of MD.]
Rueda M. et al. (2003) J. Am. Chem. Soc. 125, 8007-8014.

Principal Component Analysis (PCA)
Calculate the 3N × 3N covariance matrix from the trajectory. This indicates how individual atomic motions were correlated during the simulation.
\[
C_{p,q} = \frac{1}{M} \sum_{m=1}^{M} \left(X_{m,p} - \langle X_p \rangle\right)\left(X_{m,q} - \langle X_q \rangle\right)
\]
Here X_{m,p} is coordinate p in frame m, and M is the number of frames in the trajectory.
Diagonalise the covariance matrix to find the set of 3N eigenvectors and their corresponding eigenvalues. This gives the types of overall structural deformation that were independent during the trajectory, called components or modes.
\[
C = u^{-1} \lambda u
\]
Order the components in terms of their eigenvalues. The component with the highest eigenvalue has contributed the most to the system's dynamics.

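The three PCA steps can be sketched in a few lines of NumPy. This is an illustrative sketch, not the analysis code used for the talk: it assumes the trajectory has already been superimposed on a reference structure and flattened to an array of shape (M frames, 3N coordinates), and the function names are hypothetical.

```python
import numpy as np


def pca_from_trajectory(coords):
    """Return eigenvalues (descending) and matching eigenvectors (columns).

    coords: array of shape (M, 3N), one row per trajectory frame.
    """
    coords = np.asarray(coords, dtype=float)
    n_frames = coords.shape[0]
    # Deviation of every coordinate from its trajectory average.
    deviations = coords - coords.mean(axis=0)
    # 3N x 3N covariance matrix, normalised by the number of frames M.
    covariance = deviations.T @ deviations / n_frames
    # eigh is appropriate because the covariance matrix is symmetric.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # eigh returns ascending order; reverse so the largest mode comes first.
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]


def variance_fractions(eigenvalues):
    """Fraction of the total positional variance carried by each component."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return eigenvalues / eigenvalues.sum()
```

Summing the first few entries of variance_fractions gives the share of the total motion captured by the leading components.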
Principal Component 1
The components with large eigenvalues are large-scale, quasiharmonic oscillations of the entire helix.
Typically, the 1st, 2nd and 3rd components contribute ~60% of the dynamics of the system.

The Dynamics of d(GGTAATTACC)2
The DNA helix has very simple mechanical properties, which are sequence dependent.
[Figure panels: bend at TA step 1; bend at TA step 2; helix twisting.]
However, it can be difficult to obtain a quantitative comparison of flexibility between simulations using PCA, due to anharmonic effects.

MD Simulations of Charged Bases
Caution: preliminary results!
Perform 5 ns of classical MD on d(GAAAAAAAAC) including i) a neutral, ii) a positively charged and iii) a negatively charged thymine base (placed at position 15; partial charges calculated using HF and RESP fitting).
[Figure panels: neutral thymine nucleotide; positive thymine nucleotide; negative thymine nucleotide.]
Does the presence of a charged base affect the dynamics of the DNA relative to the neutral system?

The Configurational Entropy
The entropy of a classical harmonic oscillator:
\[
S = \frac{1}{2} k \sum_{i=1}^{3N-6} \ln \langle x_i^2 \rangle
\]
A formula is required which gives identical results for large eigenvalues, but which also gives
\[
S \to 0 \quad \text{as} \quad \langle x_i^2 \rangle \to 0
\]
This is true for a function of the form
\[
S = \frac{1}{2} k \sum_{i=1}^{3N-6} \ln\left[1 + \frac{kTe^2}{\hbar^2}\, m \langle x_i^2 \rangle\right]
\]
This is the Schlitter formula.
Schlitter J. (1993) Chem. Phys. Lett. 217, No. 6, 617.

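The two requirements can be checked directly from the Schlitter form. The short derivation below is a sketch using the slide's symbols (k, T, ħ, m and the mean square fluctuation of mode i); the shorthand a for the prefactor is introduced here purely for convenience.

```latex
\begin{align*}
a &\equiv \frac{kTe^{2}m}{\hbar^{2}} \\
a\langle x_i^{2}\rangle \gg 1:\quad &
  \ln\left[1 + a\langle x_i^{2}\rangle\right] \approx \ln\langle x_i^{2}\rangle + \ln a \\
\langle x_i^{2}\rangle \to 0:\quad &
  \ln\left[1 + a\langle x_i^{2}\rangle\right] \to \ln 1 = 0
\end{align*}
```

Each mode with a large eigenvalue therefore contributes the classical logarithmic term plus a constant that does not depend on the fluctuations, while a frozen mode contributes nothing instead of diverging to minus infinity.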
Entropy Convergence
A quantitative comparison of the flexibility of each sequence can be obtained by calculating the entropy from MD/PCA.
The entropy contains a hidden dependence on time due to the finite length of the trajectory.
[Figure: TS (kcal/mol, axis range 450 to 650) against length of sampling window (ps, 0 to 4000) for the neutral, positive and negative systems.]
The presence of a singly charged base slightly increases the overall flexibility of the helix.
No simple dependence on key structural parameters (such as H-bond distances) has yet been detected.

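A convergence curve of this kind can be produced by evaluating the Schlitter entropy over sampling windows of increasing length. The sketch below is one possible implementation under stated assumptions, not the code used for the talk: coordinates are assumed to be in Å and already superimposed on a reference, masses in atomic mass units, the temperature is taken as 300 K, and the function names are hypothetical. It sums over all 3N eigenvalues rather than 3N-6, on the assumption that the rigid-body modes removed by the fitting have zero eigenvalue and so contribute ln(1) = 0.

```python
import numpy as np

K_B = 1.380649e-23        # Boltzmann constant, J/K
HBAR = 1.054571817e-34    # reduced Planck constant, J s
AMU = 1.66053906660e-27   # atomic mass unit, kg
ANGSTROM = 1.0e-10        # metres per Å
AVOGADRO = 6.02214076e23  # per mole
J_PER_KCAL = 4184.0       # joules per kcal


def schlitter_entropy(coords, masses, temperature=300.0):
    """Schlitter entropy in kcal/(mol K).

    coords: (M, 3N) fitted coordinates in Å; masses: (N,) in amu.
    """
    coords = np.asarray(coords, dtype=float)
    # One mass per Cartesian coordinate, converted to kg.
    mass_per_coord = np.repeat(np.asarray(masses, dtype=float), 3) * AMU
    deviations = (coords - coords.mean(axis=0)) * ANGSTROM
    weighted = deviations * np.sqrt(mass_per_coord)
    # Mass-weighted covariance matrix, units kg m^2.
    covariance = weighted.T @ weighted / coords.shape[0]
    eigenvalues = np.clip(np.linalg.eigvalsh(covariance), 0.0, None)
    prefactor = K_B * temperature * np.e ** 2 / HBAR ** 2
    entropy = 0.5 * K_B * np.sum(np.log1p(prefactor * eigenvalues))
    return entropy * AVOGADRO / J_PER_KCAL


def ts_versus_window(coords, masses, window_lengths, temperature=300.0):
    """T*S in kcal/mol using only the first n frames, for each n given."""
    coords = np.asarray(coords, dtype=float)
    return [temperature * schlitter_entropy(coords[:n], masses, temperature)
            for n in window_lengths]
```

Plotting the returned values against window length, once for each system, gives curves of the kind described on this slide. Because the average structure is recomputed inside every window, the estimate depends on how much of conformational space has been sampled so far.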
Future Work
Use these simple MD simulations to investigate whether the local or global dynamic modes are influenced by the presence of the charged base.
Correctly optimise the geometry of these charged bases using DFT and perform equivalent simulations for comparison [1].
Construct semi-empirical QM/MM models of DNA including a charged base, using results from these classical calculations to optimise the system.
1. Smith D. M. A. & Adamowicz L. (2001) J. Phys. Chem. 105, 9345-9354.

Concluding Remarks
DNA structure and dynamics are exquisitely dependent on both the sequence and the environment.
DNA dynamics consist of high frequency, low amplitude local modes over ps timescales, combined with global quasiharmonic oscillations over ns timescales.
The flexibility of DNA is slightly increased in the presence of a charged base, which may be important in constructing models of transport processes.

Other Useful References
A general discussion of nucleic acid structure:
Nucleic Acid Structure and Recognition, Steve Neidle, OUP. www.oup.co.uk/molbiol2/na-structure/
References which discuss the counterion distribution around DNA:
Exploring the Counterion Atmosphere around DNA: What can be learned from Molecular Dynamics simulations? Rueda et al. (2004) Biophys. J. 87, 800-811.
DNA and its Counterions: A Molecular Dynamics Study. Varnai P. & Zakrzewska K. (2004) Nucleic Acids Res. 32, 4269-4280.