The Evolutionary Origins of Protein Sequence Variation
1 Temple University, Structural Bioinformatics II. The Evolutionary Origins of Protein Sequence Variation. Protein Evolution (tour of concepts & current ideas): protein fitness, marginal stability, compensatory mutations. Using protein sequence variation profitably: correlated mutations and structural contacts; Potts models: theory; review of Potts model results. Allan Haldane, Ron Levy Group.
2 Summary (from last time): New versions of a protein arise by gene duplication. Many possible sequences lead to the same fold. Proteins in a common family/fold (family: usually same function; fold: often different function) accumulate substitutions at a constant rate over time. Example: Fibronectin Type III domain MSA, average sequence identity 19% [figure: WebLogo of the MSA]. Substitutions occur at a constant rate: the rate-constancy hypothesis, or Molecular Clock (Zuckerkandl & Pauling 1965) [figure: # of substitutions per site (fossil record)].
3 How is sequence diversity generated? Why is there so much variation? How do substitutions happen? Evolutionary forces acting on proteins: natural selection, mutation, genetic drift. Selective forces on proteins. Proteins must: carry out a function (enzymes, etc.), be stable (fold), not aggregate, ... Reference: Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations. COSB 2017.
4 How is sequence diversity generated? Why is there so much variation? How do substitutions happen? The protein's sequence determines whether it folds, reacts, or aggregates. How can we quantify this relationship? Selective forces on proteins. Proteins must: carry out a function (enzymes, etc.; the functional site is often only a small % of the protein), be stable (fold; many mutations affect stability), not aggregate, ... Selective pressure to fold is one of the predominant selective forces on proteins. Reference: Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations. COSB 2017.
5 Protein Folding Biochemistry (understanding selective forces in protein evolution). Proteins need to fold before they can carry out their function. Mutations can cause a protein not to fold. Two-state model of folding: the folding process depends on the free energy of folding, which is determined by the interactions among the amino acids in the folded conformation (sequence dependent). We can model the folding process using simple thermodynamics.
6 Lattice Models: some intuition for the free energy of folding through lattice proteins. Idea: represent a protein as a covalently bonded chain of amino acids on a 3D grid (eg, 27 amino acids on a 3x3x3 grid). The chain cannot self-intersect, so only certain conformations are possible. Using a computer we can enumerate all maximally compact conformations; there are 103,346 for a 3x3x3 grid. Neighboring amino acids in the grid (which are not bonded along the chain) interact. We define a matrix of interaction free energies for every amino-acid pair (eg A-A, A-C, A-R etc.; Miyazawa-Jernigan potential). If we add up the interaction energies for all neighbor residues for a particular sequence, in a particular conformation, we get the folding energy of that sequence+conformation.
7 Lattice Models: some intuition for the free energy of folding through lattice proteins. We can calculate the folding energy of any sequence in any conformation. Constraint: only one conformation is functional. Given a sequence, we can compute its folding energy for the folded conformation, and compare it to the folding energy of the unfolded/other conformations. In thermodynamic equilibrium, the probability for that sequence to be in the folded state is the Boltzmann weight of the folded conformation divided by the sum of the Boltzmann weights of all conformations [figure: Folded vs. Unfolded].
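The folding-probability calculation described above can be sketched on a much smaller toy system. This is a minimal illustration, not the lecture's model: it assumes a 2D square lattice instead of the 3x3x3 cube, a hypothetical two-letter (H/P) contact-energy table instead of the Miyazawa-Jernigan potential, and energies in units of kT. Conformations related by rotation or reflection are counted once.

```python
import math

# Hypothetical contact energies (units of kT): only H-H contacts are favourable.
CONTACT_ENERGY = {("H", "H"): -1.0, ("H", "P"): 0.0, ("P", "H"): 0.0, ("P", "P"): 0.0}
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def self_avoiding_walks(n):
    """Enumerate n-residue self-avoiding walks on a 2D square lattice.

    Rotations and reflections are removed by forcing the first step to be +x
    and the first step off the x-axis to be +y.
    """
    walks = []

    def extend(path):
        if len(path) == n:
            walks.append(tuple(path))
            return
        x, y = path[-1]
        on_axis = all(py == 0 for _, py in path)
        for dx, dy in MOVES:
            if len(path) == 1 and (dx, dy) != (1, 0):
                continue
            if on_axis and dy == -1:
                continue
            nxt = (x + dx, y + dy)
            if nxt not in path:
                path.append(nxt)
                extend(path)
                path.pop()

    extend([(0, 0)])
    return walks

def folding_energy(seq, conf):
    """Sum contact energies over lattice neighbours that are not bonded along the chain."""
    energy = 0.0
    for i in range(len(seq)):
        for j in range(i + 2, len(seq)):
            (xi, yi), (xj, yj) = conf[i], conf[j]
            if abs(xi - xj) + abs(yi - yj) == 1:
                energy += CONTACT_ENERGY[(seq[i], seq[j])]
    return energy

def folding_probability(seq, native, conformations, kT=1.0):
    """Boltzmann weight of the native conformation divided by the partition function."""
    weights = [math.exp(-folding_energy(seq, c) / kT) for c in conformations]
    return math.exp(-folding_energy(seq, native) / kT) / sum(weights)

conformations = self_avoiding_walks(4)
native = ((0, 0), (1, 0), (1, 1), (0, 1))   # U-shaped fold: residues 0 and 3 touch
for seq in ("HPPH", "PPPP"):
    print(seq, folding_probability(seq, native, conformations))
```

Changing a single letter of the sequence changes the energy of the native conformation relative to the others, which is the sense in which the folding probability is sequence dependent.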
8 Lattice Models: some intuition for the free energy of folding through lattice proteins. Key point: the folding energy depends on the amino acid sequence, in an intuitive way. Reality is more complicated: disordered state, molten globule state, native (folded) state, decoy states, hyper-stability. Lattice models can be used to understand some of these effects too, eg a model with decoy states.
9 Lattice Models: further analysis of lattice models can give quantitative insights into protein evolution, in an elegant theory known as the Random Energy Model. Result: poorly folding sequences greatly outnumber strongly folding sequences. It's easy to find mutations which decrease folding probability, hard to find those that increase it. We can make a histogram of the folding energy for all (random) sequences, a histogram of the energy of all conformations for a single sequence, and a histogram of the energies of evolved sequences. Simulations show the degree of sequence variation we should expect for folded sequences (qualitatively matches observations). More careful analysis of the energy gap between the folded conformation and decoy conformations gives insights into constraints on sequence evolution.
10 Variations in Stability: distribution of stabilities. Different observed sequences have (slightly) different stabilities. Observation: proteins are marginally stable. Possible explanations: hyper-stability is penalized? A greater number of stable sequences/mutations? Stability varies as proteins evolve. Reference: Missense meanderings in sequence space: a biophysical view of protein evolution. DePristo, Weinreich, Hartl. Nat Rev Genet 2005.
11 Compensatory Mutations/Substitutions: eg, a destabilizing substitution is compensated for by a stabilizing substitution. Epistasis: when the effect of a mutation (eg, on stability) depends on the identity of other residues [figure: destabilizing and stabilizing substitutions]. Reference: CTL escape and viral fitness in HIV/SIV infection. Front. Microbiol 2010.
12 Protein Evolution: why do deleterious (stability-reducing) mutations occur? Why aren't proteins optimally stable? Evolutionary forces acting on proteins: natural selection (previous few slides), mutation, genetic drift (next 2 slides), recombination (not discussed). Quick intro to the Wright-Fisher model & population genetics.
13 The Wright-Fisher Model (without natural selection). We need to understand how new variants arise at the population level. Genetic drift = fluctuations in allele frequencies. It causes new alleles to fix in the population even without any natural selection. Example: a population of 10 (asexual) individuals with different, equally fit genotypes. The next generation is formed by a random sample (with replacement) of the previous generation [figure: Simulation 1, in which the cyan genotype has fixed, and Simulation 2, genotype frequencies over time].
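The neutral resampling scheme described above can be sketched in a few lines. This is a minimal illustration assuming 10 asexual individuals labelled by hypothetical genotype letters; the function name and the generation cap are choices made for this sketch.

```python
import random

def wright_fisher_neutral(population, rng, max_generations=100_000):
    """Resample the population with replacement each generation until one genotype fixes."""
    generations = 0
    while len(set(population)) > 1 and generations < max_generations:
        # Every individual of the next generation picks a parent uniformly at random.
        population = rng.choices(population, k=len(population))
        generations += 1
    return population, generations

rng = random.Random(0)
start = list("ABCDEFGHIJ")   # 10 individuals with distinct, equally fit genotypes
final, t = wright_fisher_neutral(start, rng)
print(f"genotype {final[0]} fixed after {t} generations")
```

Re-running with a different seed usually fixes a different genotype, which is the point of the two simulations on the slide: with no fitness differences, which genotype fixes is pure chance.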
14 The Wright-Fisher Model: two-allele case, with selection. Scenario: all individuals in the population have the same protein, but one individual mutates. Natural selection is modelled by assigning a weight (fitness) to each genotype, and performing a weighted sample to get the next generation. Say the old genotype has a weight of 1 and the mutant individual has a weight of 1+s, where s = selection coefficient and N = population size. Then the mutant's fixation probability is given by Kimura's fixation probability, a function of s and N covering the neutral (s = 0), deleterious (s < 0), and beneficial (s > 0) cases. Conclusion: genetic drift can cause a new mutant to fix even if it's deleterious (s < 0).
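The slide's fixation-probability expression did not survive transcription. As a hedged sketch, the code below assumes the standard diffusion approximation for a single new mutant in a haploid (asexual) Wright-Fisher population, P_fix = (1 - e^(-2s)) / (1 - e^(-2Ns)); the slide may have used a different variant (for example the diploid form).

```python
import math

def fixation_probability(s, N):
    """Fixation probability of one new mutant with selection coefficient s (haploid, size N).

    Assumed form: (1 - exp(-2s)) / (1 - exp(-2Ns)), with the neutral limit 1/N.
    Very large negative N*s will overflow; this sketch does not guard against that.
    """
    if abs(s) < 1e-12:
        return 1.0 / N          # neutral limit
    # expm1 keeps precision for small |s|; the two minus signs cancel in the ratio.
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)

N = 1000
for s in (-0.001, 0.0, 0.001, 0.01):
    print(f"s = {s:+.3f}  P_fix = {fixation_probability(s, N):.6f}")
```

Under this assumed form a slightly deleterious mutant has a fixation probability below the neutral value 1/N but still greater than zero, which matches the slide's conclusion.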
15 Mutation-Selection Balance [figure: stabilizing and destabilizing mutations, with mutations that fail to fix marked X]. Vocabulary: Mutation: an individual mutates to a new variant. Substitution: a mutant genotype appears and fixes in the population. Most protein mutations slightly decrease stability (deleterious). Most mutations do not fix, though some do. A small number of mutations increase stability (beneficial). These mutations often fix.
16 Mutation-Selection Balance [figure: selection bias vs. mutational bias]. Mutation-selection balance: # deleterious substitutions = # beneficial substitutions. Population genetics theory can be used to quantitatively understand when/how this balance occurs (under certain conditions). This is an alternative explanation for why proteins are marginally stable. References: Why are proteins marginally stable? Taverna, Goldstein. Proteins: Structure, Function, and Bioinformatics 2002. Missense meanderings in sequence space: a biophysical view of protein evolution. DePristo, Weinreich, Hartl. Nat Rev Genet 2005. Stability effects of mutations and protein evolvability. Tokuriki, Tawfik. Current Opinion in Structural Biology 2009. How Protein Stability and New Functions Trade Off. Tokuriki, Stricher, Serrano, Tawfik. PLoS Comput Biol 2008.
17 Summary: Many possible sequences lead to the same fold. Proteins in a common family/fold accumulate substitutions at a constant rate over time. Most substitutions affect protein stability. There is a dynamic balance of slightly deleterious (destabilizing) and slightly beneficial (stabilizing) substitutions over time. Marginal stability is maintained. This dynamic balance also involves compensatory mutations and epistatic interactions.
18 Part II: Potts Models (using protein sequence variation to study structure). Outline: motivation and background; parameterizing a Potts model; applications of Potts models. Motivating question: contacts in protein structure, compensatory mutations, correlations in MSA columns?
19 Coevolutionary Analysis and Potts Models: correlated mutations in an MSA imply structural interactions. Long history (25 years) of coevolutionary analysis: detect correlated positions, then predict contacts. Recent developments: instead of modeling each residue pair individually, build a correlated statistical model of the MSA: the Potts model. The model can be used for more than contact prediction. Reference: Lövkvist et al, PRE 87, 2013.
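20 How to measure correlations in an MSA. Notation: positions $i, j$; residue types $\alpha, \beta$. Univariate marginal (frequency): $f_i(\alpha)$, the fraction of sequences with residue $\alpha$ at position $i$. Bivariate marginal (frequency): $f_{ij}(\alpha,\beta)$, the fraction of sequences with $\alpha$ at $i$ and $\beta$ at $j$. Correlations: $C_{ij}(\alpha,\beta) = f_{ij}(\alpha,\beta) - f_i(\alpha) f_j(\beta)$, ie the observed pairwise frequency minus the pairwise frequency expected if the two positions vary independently.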
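As a small illustration of these definitions, the sketch below computes the univariate marginals, bivariate marginals, and their difference for a hypothetical four-sequence alignment. The function names and the toy MSA are assumptions made for this example.

```python
from collections import Counter

def univariate_marginals(msa):
    """f[i][a]: fraction of sequences with residue a at position i."""
    n, L = len(msa), len(msa[0])
    return [{a: c / n for a, c in Counter(seq[i] for seq in msa).items()}
            for i in range(L)]

def bivariate_marginals(msa, i, j):
    """f_ij[(a, b)]: fraction of sequences with a at position i and b at position j."""
    n = len(msa)
    return {ab: c / n for ab, c in Counter((seq[i], seq[j]) for seq in msa).items()}

def correlation(msa, i, j):
    """C_ij(a, b) = f_ij(a, b) - f_i(a) * f_j(b) for every residue seen at i and at j."""
    f = univariate_marginals(msa)
    fij = bivariate_marginals(msa, i, j)
    return {(a, b): fij.get((a, b), 0.0) - f[i][a] * f[j][b]
            for a in f[i] for b in f[j]}

# Toy alignment: positions 0 and 1 covary perfectly, position 2 never changes.
msa = ["AKL", "AKL", "GEL", "GEL"]
print(correlation(msa, 0, 1))   # non-zero entries: the positions covary
print(correlation(msa, 0, 2))   # all zeros: independent variation
```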
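21 Pairwise measures of correlations in an MSA. We want a correlation score between position pairs (summed over residue types). Different scoring methods appear in the literature (two are given here), all designed to give a score of 0 for independent variation. Mutual Information (MI): $MI_{ij} = \sum_{\alpha,\beta} f_{ij}(\alpha,\beta) \log \frac{f_{ij}(\alpha,\beta)}{f_i(\alpha) f_j(\beta)}$, where $f_i$ and $f_{ij}$ are the univariate and bivariate marginals; it can be interpreted as the negative log likelihood of generating the distribution $f_{ij}$ when sampling from the distribution $f_i f_j$. Statistical Coupling (SCA): based on the probability of a residue at i excluding sequences with a mutation at j; a bar denotes an average over all positions.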
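The mutual-information score can be computed directly from column counts. The sketch below uses the standard definition of MI between two alignment columns (natural logarithm) on a hypothetical toy MSA; it does not include any of the phylogeny or entropy corrections discussed in the literature.

```python
import math
from collections import Counter

def mutual_information(msa, i, j):
    """MI_ij = sum over (a, b) of f_ij(a,b) * log( f_ij(a,b) / (f_i(a) * f_j(b)) )."""
    n = len(msa)
    fi = Counter(seq[i] for seq in msa)
    fj = Counter(seq[j] for seq in msa)
    fij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in fij.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((fi[a] / n) * (fj[b] / n)))
    return mi

# Toy alignment: positions 0 and 1 covary perfectly, position 2 is constant.
msa = ["AKL", "AKL", "GEL", "GEL"]
print(mutual_information(msa, 0, 1))   # log(2): maximal for two equally likely states
print(mutual_information(msa, 0, 2))   # 0: independent variation
```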
22 Relationship to contacts: top-ranking MI (and other) scores find the top contact with about a 70% true-positive rate, and the top 50 with about 50%. Can this be improved? References: Protein 3D Structure Computed from Evolutionary Sequence Variation. Marks et al. PLoS ONE 2011. Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Dunn, Wahl, Gloor. Bioinformatics 2008.
23 Direct vs Indirect Correlations. Problem: MI, SCA, Cij can be thought of as local models of the correlation: they look at single pairs at a time. However, correlations can be caused by indirect interactions, or correlated networks, and these local models ignore this. Eg: position 7 interacts with 15, and 15 interacts with 25. Then 7 and 25 will be correlated even though they don't interact. Instead we want to make a global model of the correlations, in order to distinguish direct from indirect correlations. Idea: make a statistical model of the sequence as a whole, ie model the probability of a whole sequence rather than the probability of a pair of residues.
24 Potts Models: Origin & Motivation. How to model P(S)? P(S) describes the probability of generating sequence S, where S spans the entire sequence space. The most general form of P(S) is the set of probabilities of all sequences: a model with $q^L$ parameters (one per sequence, for sequence length L and q residue types)! We can't directly measure P(S) from an MSA (since each sequence appears once), unlike bivariate marginals, which can be directly measured from an MSA. Solution: get the least biased distribution subject to constraints: the Maximum Entropy distribution P(S).
25 Maximum Entropy: the Maximum Entropy distribution P(S). Maximizing entropy minimizes the amount of prior information built into the distribution. The number of model parameters will be equal to the number of constraints. Entropy of a distribution: $-\sum_S P(S) \log P(S)$. In our case, set the constraint that P(S) gives the right bivariate marginals (pairwise correlation statistics). Bivariate marginals from P(S): the sum of P(S) over all sequences in the entire sequence space that have the given pair of residues at the given pair of positions.
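26 Maximum Entropy: maximize the entropy of the distribution subject to the constraints, using the method of Lagrange multipliers with one Lagrange multiplier per constraint. Maximize by setting the derivative with respect to P(S) to zero and solving for all S.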
27 Maximum Entropy: the solution can be rearranged into a Boltzmann distribution, $P(S) = e^{-E(S)}/Z$, with a "statistical energy" E(S) and a normalization Z. This gives the Potts model. Note: this model is named for its history in the physics of magnetic materials, and has many other applications.
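The equations on the three Maximum Entropy slides were lost in transcription. The block below is a reconstruction of the standard derivation, not a copy of the slides. The symbols $\lambda_{ij}$, $\mu$, $h_i$, $J_{ij}$ and the sign convention $P \propto e^{-E}$ are assumptions chosen to match the "fields" and "couplings" named on the following slides.

```latex
% Maximize the entropy subject to bivariate-marginal and normalization constraints.
\mathcal{L}[P] = -\sum_{S} P(S)\ln P(S)
  - \sum_{i<j}\sum_{\alpha,\beta} \lambda_{ij}(\alpha,\beta)
    \Big[ \sum_{S} P(S)\,\delta_{s_i\alpha}\,\delta_{s_j\beta} - f_{ij}(\alpha,\beta) \Big]
  - \mu \Big[ \sum_{S} P(S) - 1 \Big]

% Stationarity with respect to P(S), for every sequence S:
\frac{\partial \mathcal{L}}{\partial P(S)}
  = -\ln P(S) - 1 - \sum_{i<j} \lambda_{ij}(s_i, s_j) - \mu = 0

% Rearranged into a Boltzmann distribution:
P(S) = \frac{1}{Z}\, e^{-E(S)}, \qquad
E(S) = \sum_{i<j} \lambda_{ij}(s_i, s_j), \qquad
Z = e^{1+\mu} = \sum_{S} e^{-E(S)}

% Conventional Potts form: part of each \lambda_{ij} can be moved into single-site
% fields (a gauge choice), giving fields h and couplings J:
E(S) = \sum_{i} h_i(s_i) + \sum_{i<j} J_{ij}(s_i, s_j)
```

There is one Lagrange multiplier $\lambda_{ij}(\alpha,\beta)$ per bivariate-marginal constraint, so the number of model parameters equals the number of constraints, as stated on the first Maximum Entropy slide.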
28 Form of the Potts Model. Fields: L x q parameters. Couplings: on the order of (L x q)^2 parameters. L = sequence length (eg 200); q = # of residue types (eg 20). Note the similarity of the fields to PSSMs. Potts parameters are interpreted as energy contributions from each position/pair [figure: example sequences with fields and couplings].
29 Form of the Potts Model: Potts statistical energy E(S) and Potts probability P(S) [figure: sequence landscape, prevalence vs. Potts statistical energy, with example sequences and couplings; image: Dill]. Given known values for the fields and couplings: P(S) gives us a probability for any sequence. Note the similarity to lattice models. E(S) gives us a statistical energy landscape. We can model the effect of mutations, with epistasis + compensation. Coupling values give us info about direct interactions between positions (without indirect interactions).
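To make the roles of E(S) and P(S) concrete, here is a minimal sketch for a three-position, two-letter toy model. The parameter values, the dictionary layout of `h` and `J`, and the convention P(S) proportional to exp(-E(S)) are all assumptions made for illustration; real models use q = 20 or 21 states and cannot be normalized by brute-force enumeration.

```python
import math
from itertools import product

def potts_energy(seq, h, J):
    """E(S) = sum_i h[i][s_i] + sum_{i<j} J[(i, j)][(s_i, s_j)]; missing entries count as 0."""
    L = len(seq)
    energy = sum(h[i].get(seq[i], 0.0) for i in range(L))
    for i in range(L):
        for j in range(i + 1, L):
            energy += J.get((i, j), {}).get((seq[i], seq[j]), 0.0)
    return energy

def potts_probability(seq, h, J, alphabet):
    """P(S) = exp(-E(S)) / Z, with Z summed over every sequence (toy sizes only)."""
    Z = sum(math.exp(-potts_energy(s, h, J))
            for s in product(alphabet, repeat=len(seq)))
    return math.exp(-potts_energy(seq, h, J)) / Z

# Hypothetical parameters: no fields, one coupling favouring equal residues at 0 and 1.
h = [{}, {}, {}]
J = {(0, 1): {("A", "A"): -1.0, ("B", "B"): -1.0}}

# Epistasis: the same mutation A->B at position 0 has opposite effects,
# depending on the residue at position 1.
print(potts_energy("BAA", h, J) - potts_energy("AAA", h, J))   # destabilizing
print(potts_energy("BBA", h, J) - potts_energy("ABA", h, J))   # stabilizing (compensatory)
print(potts_probability("AAA", h, J, "AB"))
```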
30 Parameterizing the Model given an MSA. Above we found the functional form of the Maximum Entropy distribution, but we did not discuss how to find the values of the parameters. This is actually a challenging task. We need to find the set of values which satisfy the constraints on the marginals, but there is no obvious way to do so: the model marginals are a non-trivial function of the Potts parameters.
31 Parameterizing the Model given an MSA. A number of different numerical methods and approximations have been developed to find the parameters: Belief Propagation and Susceptibility Propagation; Mean Field inference; Pseudolikelihood methods + Conjugate Gradient Descent; Cluster Expansion; Monte Carlo + Quasi-Newton Optimization. This is a computationally intensive task.
32 Parameterizing the Model given an MSA. Flavor of the algorithms: the problem can be framed as a Maximum Likelihood inference. Define a likelihood function (the probability of the MSA according to the model) which has a maximum when the constraints are satisfied. Conjugate Gradient methods, Quasi-Newton methods: start with an initial guess for the parameters; compute the local gradient of the likelihood; take a small step in that direction (update parameters); repeat.
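The loop described above can be sketched for a toy model small enough that the model marginals are computed exactly by enumerating all sequences. This is an illustration under stated assumptions, not any of the named production methods: it uses plain gradient ascent, couplings only (fields are absorbed into the couplings, which the gauge freedom allows), the convention P(S) proportional to exp(-E(S)), and a hypothetical 12-sequence alignment. With that convention the log-likelihood gradient for each coupling is (model marginal - data marginal).

```python
import math
from itertools import combinations, product

def pair_marginals(seqs, weights, L):
    """Weighted bivariate marginals f[(i, j)][(a, b)] for all position pairs i < j."""
    total = sum(weights)
    f = {pair: {} for pair in combinations(range(L), 2)}
    for s, w in zip(seqs, weights):
        for i, j in f:
            key = (s[i], s[j])
            f[(i, j)][key] = f[(i, j)].get(key, 0.0) + w / total
    return f

def model_distribution(J, alphabet, L):
    """Exact P(S) = exp(-E(S)) / Z over all q**L sequences (feasible for toy sizes only)."""
    seqs = list(product(alphabet, repeat=L))
    boltz = [math.exp(-sum(J[(i, j)][(s[i], s[j])] for i, j in J)) for s in seqs]
    Z = sum(boltz)
    return seqs, [b / Z for b in boltz]

def fit_couplings(msa, alphabet, steps=2000, lr=0.5):
    """Gradient ascent on the log-likelihood; stationary when model and data marginals agree."""
    L = len(msa[0])
    J = {pair: {ab: 0.0 for ab in product(alphabet, repeat=2)}
         for pair in combinations(range(L), 2)}          # initial guess: all zeros
    f_data = pair_marginals(msa, [1.0] * len(msa), L)
    for _ in range(steps):
        seqs, probs = model_distribution(J, alphabet, L)
        f_model = pair_marginals(seqs, probs, L)
        for pair in J:
            for ab in J[pair]:
                gradient = f_model[pair].get(ab, 0.0) - f_data[pair].get(ab, 0.0)
                J[pair][ab] += lr * gradient             # small step uphill
    return J

# Hypothetical toy alignment in which neighbouring positions tend to agree.
msa = ["AAA"] * 3 + ["BBB"] * 3 + ["AAB", "ABA", "BAA", "ABB", "BAB", "BBA"]
J = fit_couplings(msa, "AB")
print(J[(0, 1)])
```

For real protein families the enumeration in `model_distribution` is impossible (q**L sequences), which is why the approximations listed on the previous slide are needed.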
33 Aside: Correction for Phylogeny and Sampling Biases. Sequences may be phylogenetically related, so we may have a biased sample. This may give the appearance of correlations even when there are none. Eg: wild type AAAAAAAAAAAAAAAAAA; single mutants AAAAABAAAAAAAAAAAA and AAAAAAAAAABAAAAAAA; double mutant AAAAABAAAABAAAAAAA. If we oversample the double mutant, we overestimate the correlation. One solution: down-weight each sequence according to how many sequences are similar to it, and compute the marginals as weighted averages. The sum of the weights gives the effective # of sequences.
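The weighting idea can be sketched with the slide's own example. The weighting formula on the slide was lost; this sketch assumes the common scheme in which each sequence gets weight 1 / (number of sequences at or above an identity threshold to it, itself included). The threshold is a free parameter: values near 0.8 are a frequent choice for real alignments, but the toy below uses 1.0 (exact duplicates only) because its sequences are all nearly identical.

```python
def sequence_weights(msa, identity_threshold=0.8):
    """w_n = 1 / (# sequences, including n itself, with identity >= threshold to n)."""
    L = len(msa[0])
    weights = []
    for s in msa:
        neighbours = sum(
            1 for t in msa
            if sum(a == b for a, b in zip(s, t)) / L >= identity_threshold
        )
        weights.append(1.0 / neighbours)
    return weights

def weighted_pair_frequency(msa, weights, i, a, j, b):
    """Weighted bivariate marginal: weighted fraction of sequences with a at i and b at j."""
    total = sum(weights)
    return sum(w for s, w in zip(msa, weights) if s[i] == a and s[j] == b) / total

def mutate(seq, positions, residue="B"):
    return "".join(residue if k in positions else c for k, c in enumerate(seq))

wild_type = "A" * 18
msa = [wild_type, mutate(wild_type, {5}), mutate(wild_type, {10})]
msa += [mutate(wild_type, {5, 10})] * 5        # the double mutant is oversampled 5x

uniform = [1.0] * len(msa)
weights = sequence_weights(msa, identity_threshold=1.0)
print("effective # of sequences:", sum(weights))
print("unweighted f(B,B):", weighted_pair_frequency(msa, uniform, 5, "B", 10, "B"))
print("weighted   f(B,B):", weighted_pair_frequency(msa, weights, 5, "B", 10, "B"))
```

After weighting, the five copies of the double mutant together count as one sequence, so the B-B pair frequency falls back to what an unbiased sample of the four distinct sequences would give.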
34 Parameterizing the Model given an MSA: summary of the inference procedure. 1) Obtain an MSA (eg from Pfam). 2) Apply phylogenetic weighting. Need > 1000 effective sequences for precise marginals. 3) Compute the bivariate (and univariate) marginals of the data. 4) Perform parameter inference (eg gradient descent) given the bivariate marginals. We end up with a set of parameters which we can use in a number of ways.
35 Potts Model Applications: contact prediction, mutant stability, contact maps, ab-initio structure prediction, free energy (conformational) landscapes, fitness landscapes, melting temperature, sequence prevalence, electrostatic coupling, structure prediction, viral fitness, enzyme fitness. Reference: Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness. Levy, Haldane, Flynn. COSB 2017.
36 Application 1: Contact Prediction. We want an interaction score (like MI or SCA) but using the Potts model, so we want to summarize the coupling values for each position pair (sum/average over residue types). Frobenius norm of couplings: the square root of the sum of the squared coupling values for a position pair. APC correction: removes the 'background' score. Direct Information: similar to MI, but will exclude indirect interactions since it is computed using direct couplings. (Some technical details related to gauges are not discussed here.)
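The two score formulas on this slide were lost in transcription. The sketch below assumes the usual definitions: a Frobenius norm over the residue-pair entries of each coupling block, followed by the average product correction (APC), score_ij minus (row mean i times row mean j) divided by the overall mean. The coupling values are hypothetical, and the gauge transformation normally applied before taking the norm is skipped here.

```python
import math

def frobenius_scores(J, L):
    """F[i][j] = sqrt( sum over (a, b) of J_ij(a, b)**2 ); symmetric with a zero diagonal."""
    F = [[0.0] * L for _ in range(L)]
    for (i, j), block in J.items():
        F[i][j] = F[j][i] = math.sqrt(sum(v * v for v in block.values()))
    return F

def apc_correct(F):
    """Average product correction; means are taken over off-diagonal entries only."""
    L = len(F)
    row_mean = [sum(F[i][k] for k in range(L) if k != i) / (L - 1) for i in range(L)]
    overall = sum(row_mean) / L
    if overall == 0.0:
        return [row[:] for row in F]        # nothing to correct
    return [[0.0 if i == j else F[i][j] - row_mean[i] * row_mean[j] / overall
             for j in range(L)] for i in range(L)]

# Hypothetical couplings for 4 positions: one strong pair (0, 1) on a weak background.
L = 4
J = {(i, j): {("A", "A"): 0.1} for i in range(L) for j in range(i + 1, L)}
J[(0, 1)] = {("A", "A"): 2.0, ("B", "B"): 2.0}

scores = apc_correct(frobenius_scores(J, L))
ranked = sorted(((scores[i][j], i, j) for i in range(L) for j in range(i + 1, L)),
                reverse=True)
print(ranked[0])    # the top-scoring pair is the predicted contact
```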
37 Application 1: Contact Prediction [figure: contact map from the Potts model compared with the contact map from PDB structures, protein-kinase domain]. Can achieve an 80% true-positive rate for the top 200 contacts.
38 Application 1: Contact Prediction [figure: direct interactions, indirect interactions, and non-interacting pairs]. DI gives many more true positives (red) than MI. DI distinguishes direct from indirect interactions, MI does not. Reference: Identification of direct residue contacts in protein-protein interaction by message passing. Weigt, White, Szurmant, Hoch, Hwa. PNAS 2009.
39 Application 1: Structure Prediction. Idea: use predicted contacts as input to further algorithms: NMR-style distance geometry (contact map to structure); Go models (coarse-grained MD). Reference: Genomics-aided structure prediction. Sułkowska, Morcos, Weigt, Hwa, Onuchic. PNAS 2012.
40 Application 2: Free Energy and Conformational Landscapes. The Potts energy E(S) reflects experimental mutant-stability measurements and melting temperatures [figure: mutant stability; melting temperature]. Biased MD/Go simulations using contacts as bias/constraints can uncover the conformational landscape. References: Quantification of the effect of mutations using a global probability model of natural sequence variation. Hopf, Ingraham, Poelwijk, Springer, Sander, Marks. Oct 2015. Coevolutionary signals across protein lineages help capture multiple protein conformations. Morcos, Jana, Hwa, Onuchic. PNAS 2013. Coevolutionary information, protein folding landscapes, and the thermodynamics of natural selection. Morcos, Schafer, Cheng, Onuchic, Wolynes. PNAS.
41 Application 2: Free Energy and Conformational Landscapes. By adding up only the couplings corresponding to particular conformational changes, we can even predict the conformational preferences of individual sequences. Predicted preferences match up with experimental measures of conformational penalty.
42 Application 3: Fitness Landscapes [figure: enzyme fitness; viral fitness]. The Potts energy E(S) reflects the fitness of sequences and mutants. The Potts model can describe epistatic effects and compensatory mutations. References: Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1. Molecular Biology and Evolution 2015. The Fitness Landscape of HIV-1 Gag: Advanced Modeling Approaches and Validation of Model Predictions by in vitro Testing. PLoS Comput Biol 2014.
43 Summary: Potts models can be inferred from the patterns of evolutionary covariation of residues observed in an MSA. They have many applications in predicting structures and fitnesses of protein families and of individual sequences, and can predict the sequence dependence of protein folding.
More informationEffects of Gap Open and Gap Extension Penalties
Brigham Young University BYU ScholarsArchive All Faculty Publications 200-10-01 Effects of Gap Open and Gap Extension Penalties Hyrum Carroll hyrumcarroll@gmail.com Mark J. Clement clement@cs.byu.edu See
More informationSystems Biology: A Personal View IX. Landscapes. Sitabhra Sinha IMSc Chennai
Systems Biology: A Personal View IX. Landscapes Sitabhra Sinha IMSc Chennai Fitness Landscapes Sewall Wright pioneered the description of how genotype or phenotypic fitness are related in terms of a fitness
More informationIntroduction to" Protein Structure
Introduction to" Protein Structure Function, evolution & experimental methods Thomas Blicher, Center for Biological Sequence Analysis Learning Objectives Outline the basic levels of protein structure.
More informationBustamante et al., Supplementary Nature Manuscript # 1 out of 9 Information #
Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Details of PRF Methodology In the Poisson Random Field PRF) model, it is assumed that non-synonymous mutations at a given gene are either
More informationInterplay between pleiotropy and secondary selection determines rise and fall of mutators in stress response.
Interplay between pleiotropy and secondary selection determines rise and fall of mutators in stress response. Muyoung Heo and Eugene I. Shakhnovich Department of Chemistry and Chemical Biology, Harvard
More informationHaploid & diploid recombination and their evolutionary impact
Haploid & diploid recombination and their evolutionary impact W. Garrett Mitchener College of Charleston Mathematics Department MitchenerG@cofc.edu http://mitchenerg.people.cofc.edu Introduction The basis
More informationMonte Carlo Simulations of Protein Folding using Lattice Models
Monte Carlo Simulations of Protein Folding using Lattice Models Ryan Cheng 1,2 and Kenneth Jordan 1,3 1 Bioengineering and Bioinformatics Summer Institute, Department of Computational Biology, University
More informationQuantifying sequence similarity
Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity
More informationEVOLUTIONARY DISTANCES
EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:
More informationFebuary 1 st, 2010 Bioe 109 Winter 2010 Lecture 11 Molecular evolution. Classical vs. balanced views of genome structure
Febuary 1 st, 2010 Bioe 109 Winter 2010 Lecture 11 Molecular evolution Classical vs. balanced views of genome structure - the proposal of the neutral theory by Kimura in 1968 led to the so-called neutralist-selectionist
More informationMolecular Mechanics. I. Quantum mechanical treatment of molecular systems
Molecular Mechanics I. Quantum mechanical treatment of molecular systems The first principle approach for describing the properties of molecules, including proteins, involves quantum mechanics. For example,
More informationShort Announcements. 1 st Quiz today: 15 minutes. Homework 3: Due next Wednesday.
Short Announcements 1 st Quiz today: 15 minutes Homework 3: Due next Wednesday. Next Lecture, on Visualizing Molecular Dynamics (VMD) by Klaus Schulten Today s Lecture: Protein Folding, Misfolding, Aggregation
More informationProteins polymer molecules, folded in complex structures. Konstantin Popov Department of Biochemistry and Biophysics
Proteins polymer molecules, folded in complex structures Konstantin Popov Department of Biochemistry and Biophysics Outline General aspects of polymer theory Size and persistent length of ideal linear
More informationSta$s$cal Physics, Inference and Applica$ons to Biology
Sta$s$cal Physics, Inference and Applica$ons to Biology Physics Department, Ecole Normale Superieure, Paris, France. Simona Cocco Office:GH301 mail:cocco@lps.ens.fr Deriving Protein Structure and Func$on
More informationCMPS 6630: Introduction to Computational Biology and Bioinformatics. Structure Comparison
CMPS 6630: Introduction to Computational Biology and Bioinformatics Structure Comparison Protein Structure Comparison Motivation Understand sequence and structure variability Understand Domain architecture
More informationThere are 3 parts to this exam. Use your time efficiently and be sure to put your name on the top of each page.
EVOLUTIONARY BIOLOGY EXAM #1 Fall 2017 There are 3 parts to this exam. Use your time efficiently and be sure to put your name on the top of each page. Part I. True (T) or False (F) (2 points each). Circle
More informationDr. Amira A. AL-Hosary
Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological
More informationLecture 18 Generalized Belief Propagation and Free Energy Approximations
Lecture 18, Generalized Belief Propagation and Free Energy Approximations 1 Lecture 18 Generalized Belief Propagation and Free Energy Approximations In this lecture we talked about graphical models and
More informationIntroduction to Computational Structural Biology
Introduction to Computational Structural Biology Part I 1. Introduction The disciplinary character of Computational Structural Biology The mathematical background required and the topics covered Bibliography
More informationStatistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences
Statistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD Department of Computer Science University of Missouri 2008 Free for Academic
More informationProtein Structure. W. M. Grogan, Ph.D. OBJECTIVES
Protein Structure W. M. Grogan, Ph.D. OBJECTIVES 1. Describe the structure and characteristic properties of typical proteins. 2. List and describe the four levels of structure found in proteins. 3. Relate
More informationPopulation Genetics I. Bio
Population Genetics I. Bio5488-2018 Don Conrad dconrad@genetics.wustl.edu Why study population genetics? Functional Inference Demographic inference: History of mankind is written in our DNA. We can learn
More informationWarm-Up- Review Natural Selection and Reproduction for quiz today!!!! Notes on Evidence of Evolution Work on Vocabulary and Lab
Date: Agenda Warm-Up- Review Natural Selection and Reproduction for quiz today!!!! Notes on Evidence of Evolution Work on Vocabulary and Lab Ask questions based on 5.1 and 5.2 Quiz on 5.1 and 5.2 How
More informationQTL model selection: key players
Bayesian Interval Mapping. Bayesian strategy -9. Markov chain sampling 0-7. sampling genetic architectures 8-5 4. criteria for model selection 6-44 QTL : Bayes Seattle SISG: Yandell 008 QTL model selection:
More informationCHAPTERS 24-25: Evidence for Evolution and Phylogeny
CHAPTERS 24-25: Evidence for Evolution and Phylogeny 1. For each of the following, indicate how it is used as evidence of evolution by natural selection or shown as an evolutionary trend: a. Paleontology
More informationSupporting Information
Supporting Information Weghorn and Lässig 10.1073/pnas.1210887110 SI Text Null Distributions of Nucleosome Affinity and of Regulatory Site Content. Our inference of selection is based on a comparison of
More informationInvestigation of physiochemical interactions in
Investigation of physiochemical interactions in Bulk and interfacial water Aqueous salt solutions (new endeavor) Polypeptides exhibiting a helix-coil transition Aqueous globular proteins Protein-solvent
More informationGiri Narasimhan. CAP 5510: Introduction to Bioinformatics. ECS 254; Phone: x3748
CAP 5510: Introduction to Bioinformatics Giri Narasimhan ECS 254; Phone: x3748 giri@cis.fiu.edu www.cis.fiu.edu/~giri/teach/bioinfs07.html 2/15/07 CAP5510 1 EM Algorithm Goal: Find θ, Z that maximize Pr
More information7. Tests for selection
Sequence analysis and genomics 7. Tests for selection Dr. Katja Nowick Group leader TFome and Transcriptome Evolution Bioinformatics group Paul-Flechsig-Institute for Brain Research www. nowicklab.info
More informationStructured Variational Inference
Structured Variational Inference Sargur srihari@cedar.buffalo.edu 1 Topics 1. Structured Variational Approximations 1. The Mean Field Approximation 1. The Mean Field Energy 2. Maximizing the energy functional:
More informationComparative Genomics II
Comparative Genomics II Advances in Bioinformatics and Genomics GEN 240B Jason Stajich May 19 Comparative Genomics II Slide 1/31 Outline Introduction Gene Families Pairwise Methods Phylogenetic Methods
More informationMATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME
MATHEMATICAL MODELING AND THE HUMAN GENOME Hilary S. Booth Australian National University, Australia Keywords: Human genome, DNA, bioinformatics, sequence analysis, evolution. Contents 1. Introduction:
More informationCopyright Mark Brandt, Ph.D A third method, cryogenic electron microscopy has seen increasing use over the past few years.
Structure Determination and Sequence Analysis The vast majority of the experimentally determined three-dimensional protein structures have been solved by one of two methods: X-ray diffraction and Nuclear
More informationBiology Tutorial. Aarti Balasubramani Anusha Bharadwaj Massa Shoura Stefan Giovan
Biology Tutorial Aarti Balasubramani Anusha Bharadwaj Massa Shoura Stefan Giovan Viruses A T4 bacteriophage injecting DNA into a cell. Influenza A virus Electron micrograph of HIV. Cone-shaped cores are
More informationGenetic Drift in Human Evolution
Genetic Drift in Human Evolution (Part 2 of 2) 1 Ecology and Evolutionary Biology Center for Computational Molecular Biology Brown University Outline Introduction to genetic drift Modeling genetic drift
More informationAmira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut
Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological
More informationSupporting Online Material for
www.sciencemag.org/cgi/content/full/309/5742/1868/dc1 Supporting Online Material for Toward High-Resolution de Novo Structure Prediction for Small Proteins Philip Bradley, Kira M. S. Misura, David Baker*
More informationMutation Selection on the Metabolic Pathway and the Effects on Protein Co-evolution and the Rate Limiting Steps on the Tree of Life
Ursinus College Digital Commons @ Ursinus College Mathematics Summer Fellows Student Research 7-21-2016 Mutation Selection on the Metabolic Pathway and the Effects on Protein Co-evolution and the Rate
More informationLaying down deep roots: Molecular models of plant hormone signaling towards a detailed understanding of plant biology
Laying down deep roots: Molecular models of plant hormone signaling towards a detailed understanding of plant biology Alex Moffett Center for Biophysics and Quantitative Biology PI: Diwakar Shukla Department
More informationConnections between score matching, contrastive divergence, and pseudolikelihood for continuous-valued variables. Revised submission to IEEE TNN
Connections between score matching, contrastive divergence, and pseudolikelihood for continuous-valued variables Revised submission to IEEE TNN Aapo Hyvärinen Dept of Computer Science and HIIT University
More informationSTRUCTURAL BIOINFORMATICS. Barry Grant University of Michigan
STRUCTURAL BIOINFORMATICS Barry Grant University of Michigan www.thegrantlab.org bjgrant@umich.edu Bergen, Norway 28-Sep-2015 Objective: Provide an introduction to the practice of structural bioinformatics,
More informationPage 1. References. Hidden Markov models and multiple sequence alignment. Markov chains. Probability review. Example. Markovian sequence
Page Hidden Markov models and multiple sequence alignment Russ B Altman BMI 4 CS 74 Some slides borrowed from Scott C Schmidler (BMI graduate student) References Bioinformatics Classic: Krogh et al (994)
More informationPhylogenetic inference
Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis-) advantages of different information types
More informationEvolving plastic responses to external and genetic environments
Evolving plastic responses to external and genetic environments M. Reuter, M. F. Camus, M. S. Hill, F. Ruzicka and K. Fowler Research Department of Genetics, Evolution and Environment, University College
More informationProtein structure prediction. CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror
Protein structure prediction CS/CME/BioE/Biophys/BMI 279 Oct. 10 and 12, 2017 Ron Dror 1 Outline Why predict protein structure? Can we use (pure) physics-based methods? Knowledge-based methods Two major
More informationDihedral Angles. Homayoun Valafar. Department of Computer Science and Engineering, USC 02/03/10 CSCE 769
Dihedral Angles Homayoun Valafar Department of Computer Science and Engineering, USC The precise definition of a dihedral or torsion angle can be found in spatial geometry Angle between to planes Dihedral
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 20: Epistasis and Alternative Tests in GWAS Jason Mezey jgm45@cornell.edu April 16, 2016 (Th) 8:40-9:55 None Announcements Summary
More information