Potts Models and Protein Covariation. Allan Haldane Ron Levy Group

1 Temple University Structural Bioinformatics II Potts Models and Protein Covariation Allan Haldane Ron Levy Group

2 Outline. The Evolutionary Origins of Protein Sequence Variation: contents of the Pfam database; protein evolution (tour of concepts & current ideas); protein fitness, marginal stability, compensatory mutations. Using protein sequence variation profitably: correlated mutations and structural contacts; Potts models: theory; review of Potts model results. References: Protein Families and their Evolution: A Structural Perspective. Annual Review of Biochemistry 2005. Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations. COSB 2017. Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness. COSB 2017.

3 Pfam Pfam is a collection of HMMs and MSAs of many protein families Uses HMMs to detect new sequences 16,306 curated protein families (folds) 23 million protein sequences 7.6 billion residues collected from all branches of life

4 Pfam website example Fibronectin type III Seed Matches All fibronectin sequences in the alignment have the same beta sandwich fold. ~65,000 different sequences ~100 residues long How similar do you think the sequences are (% identical aa)?

5 Fibronectin Type III domain MSA Average sequence identity: 19% WebLogo Seq. ID MSA
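
As a minimal sketch of how an average sequence identity like the one quoted above could be computed, the following Python averages the fraction of matching columns over all sequence pairs. The function names and the toy alignment are hypothetical, and counting gap characters like any other symbol is an assumption; real pipelines often treat gaps differently.

```python
from itertools import combinations

def pairwise_identity(a, b):
    """Fraction of aligned columns where two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def average_identity(msa):
    """Mean pairwise identity over all distinct sequence pairs in the MSA."""
    pairs = list(combinations(msa, 2))
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical 4-sequence toy alignment (not real fibronectin data).
toy_msa = ["ACDA", "ACDG", "GCDA", "ACGA"]
print(average_identity(toy_msa))
```

The cost grows with the square of the number of sequences, so for an MSA with tens of thousands of sequences one would probably average over a random subsample of pairs instead.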

6 Fibronectin has quite typical diversity

7 The Protein Universe. Protein sequence → structure mapping is highly degenerate. Typically ~30-40% identity per fold, as low as 8-9% (equal to random sequences). The space of all possible sequences of length L is 20^L. The space of all sequences sharing a common folded structure may also be very large (rough intuition/guess). (Poses a challenge for HMM methods: below 20% identity, HMMs have difficulty predicting whether sequences share a fold. This is known as the twilight zone of sequence similarity.) How much of protein sequence space has been explored by life on Earth? Dryden et al. J Royal Society Interface (2008). Twilight zone of protein sequence alignments. Rost, Protein Eng Des Sel (1999)

8 The Protein Universe Many authors suggest there are only 1000 to 10,000 existing protein folds across all life 16,384 families in Pfam 1086 folds, 3464 families in SCOP (compare to 20,000 protein-genes in human genome) Protein Families and their Evolution A Structural perspective. Annual Review of Biochemistry 2005 Estimating the total number of protein folds. Govindarajan et al. Proteins: Struct. Func. Bioinfo Expanding protein universe and its origin from the Biological Big Bang PNAS 2002 The Protein Folds as Platonic Forms. Denton et al. Journal of Theoretical Biology 2002

9 Evolutionary origins of proteins. 4 genomes with multiple variants of the same protein fold. New variants/copies are generated by: gene duplication (paralogs; often different function), e.g. tree of 512 kinases in the human genome; speciation (orthologs; usually same function), e.g. tree of c-src kinase across species.

10 How is sequence diversity generated? Observations More closely related species have fewer differences per (orthologous) protein % difference in Hemoglobin between related species The Neutral Theory of Molecular Evolution. Kimura 1983

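11 How is sequence diversity generated? Observations (fossil record): plotting the # of substitutions per site against divergence time shows that substitutions occur at a constant rate. Rate constant hypothesis or Molecular Clock (Zuckerkandl & Pauling 1965). Technical detail (correction for repeated substitutions at the same site): with the standard Poisson correction, $K = -\ln(1 - p)$, where p = fraction of sites that differ between the final sequences and K = number of past substitution events per site.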
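
The correction for repeated substitutions can be illustrated with a short sketch. It assumes the standard Poisson correction K = -ln(1 - p), with p expressed as a fraction rather than a percentage; the function name is hypothetical.

```python
import math

def substitutions_per_site(p):
    """Poisson-corrected number of substitutions per site, K = -ln(1 - p).

    p is the observed fraction (0 <= p < 1) of sites that differ between
    two aligned sequences.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be a fraction in [0, 1)")
    return -math.log(1.0 - p)

# K is close to p for small differences and grows faster as p increases,
# because more sites have been hit more than once.
for p in (0.05, 0.30, 0.60):
    print(p, round(substitutions_per_site(p), 3))
```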

12 How is sequence diversity generated? Observations Proteins appear to accumulate substitutions at a constant rate, usually about 1 substitution per site per billion years

13 How is sequence diversity generated? Observations Proteins appear to accumulate substitutions at a constant rate, usually about 1 substitution per site per billion years Different proteins have different evolutionary rates

14 How is sequence diversity generated? Observations Proteins appear to accumulate substitutions at a constant rate, usually about 1 substitution per site per billion years Different proteins have different evolutionary rates Different parts of a single protein have different evolutionary rates Why is there variation in evolutionary rate? Causes of evolutionary rate variation among protein sites. Nature Reviews Genetics (2016)

15 Summary Many possible sequences lead to same fold Proteins in a common family/fold accumulate substitutions at a constant rate over time

16 How is sequence diversity generated? Why is there so much variation? How do substitutions happen? Selective pressure (Natural Selection) on protein sequences: function stability (folding) (non-)aggregation... Evolutionary forces acting on proteins: Natural Selection (selective pressures above) Mutation Genetic Drift Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations COSB 2017

17 How is sequence diversity generated? Why is there so much variation? How do substitutions happen? Selective pressure (Natural Selection) on protein sequences: function (functional site is often a small % of the protein); stability (folding): many mutations affect stability; (non-)aggregation; ... Evolutionary forces acting on proteins: Natural Selection (selective pressures above), Mutation, Genetic Drift. Bridging the physical scales in evolutionary biology: from protein sequence space to fitness of organisms and populations COSB 2017

18 Protein Folding Biochemistry (understanding selective forces in protein evolution) Two-State model of Folding Reality is more complicated: disordered state molten globule state native state (folded) decoy state hyper-stability Folded Unfolded

19 Variations in Stability. Distribution of stabilities: different sequences will have (slightly) different stabilities. Proteins are marginally stable. Possible explanations: Hyper-stability is penalized? More unstable sequences/mutations? Stability is related to overall fitness. Stability varies as proteins evolve. Missense meanderings in sequence space: a biophysical view of protein evolution DePristo, Weinreich, Hartl. Nat Rev Genet 2005

20 Compensatory Mutations/Substitutions Eg, a destabilizing substitution is compensated for by a stabilizing substitution Epistasis When the effect of a mutation (eg, on stability) depends on the identity of other residues. destabilizing stabilizing CTL escape and viral fitness in HIV/SIV infection Front. Microbiol 2010

21 Protein Evolution. Why do deleterious (stability-reducing) mutations occur? Why aren't proteins optimally stable? Evolutionary forces acting on proteins: Natural Selection (previous few slides), Mutation, Genetic Drift (next 2 slides), Recombination (not discussed). Quick intro to the Wright-Fisher Model & Population Genetics

22 The Wright-Fisher Model (without natural selection) Need to understand how new variants arise at the population level Genetic Drift = fluctuations in allele frequencies. It causes new alleles to fix in the population even without any natural selection. Population of 10 individuals with different (equally fit) genotypes. (asexual) Next generation formed by random sample (with replacement) of previous generation Simulation 1 Cyan genotype has fixed Simulation 2 Time
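
A minimal sketch of the neutral Wright-Fisher process described above: each generation is a random sample, with replacement, of the previous one, and the simulation stops when a single genotype has fixed. The function name, the single-letter genotype labels, and the fixed random seed are illustrative assumptions.

```python
import random

def wright_fisher_neutral(population, rng, max_generations=10_000):
    """Resample the population with replacement until one genotype fixes.

    Returns (fixed_genotype, generations). Neutral case: every individual
    is equally likely to be chosen as a parent.
    """
    pop = list(population)
    for generation in range(max_generations):
        if len(set(pop)) == 1:
            return pop[0], generation
        pop = [rng.choice(pop) for _ in pop]
    raise RuntimeError("no fixation within max_generations")

rng = random.Random(0)          # fixed seed so the run is repeatable
start = list("ABCDEFGHIJ")      # 10 individuals, 10 equally fit genotypes
winner, generations = wright_fisher_neutral(start, rng)
print(winner, generations)
```

Running it with different seeds fixes different genotypes, which is the point of the two simulations on the slide: fixation happens by drift alone.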

23 The Wright-Fisher Model: two-allele case, with selection. Scenario: all individuals in the population have the same protein, but one individual mutates. Natural selection is modelled by assigning a weight (fitness) to each genotype, and performing a weighted sample to get the next generation. Say we assign the old genotype a weight of 1 and the mutant individual a weight of 1+s. Then the mutant's fixation probability (Kimura's fixation probability, in the diffusion approximation for a haploid population) is $P_{\mathrm{fix}} = \frac{1 - e^{-2s}}{1 - e^{-2Ns}}$, where s = selection coefficient and N = population size. Neutral: s = 0, for which $P_{\mathrm{fix}} \to 1/N$; deleterious: s < 0; beneficial: s > 0. Conclusion: genetic drift can cause a new mutant to fix even if it's deleterious (s < 0).
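
The fixation probability can be evaluated numerically with a short sketch. It assumes the diffusion-approximation form for a single new mutant in a haploid population of size N, P_fix = (1 - exp(-2s)) / (1 - exp(-2Ns)); the function name is hypothetical, and other conventions (for example diploid populations) change the factors of 2.

```python
import math

def fixation_probability(s, N):
    """Kimura-style fixation probability of one new mutant (haploid, size N).

    Uses P_fix = (1 - exp(-2s)) / (1 - exp(-2Ns)); the neutral limit
    s -> 0 gives 1/N.
    """
    if N < 1:
        raise ValueError("N must be at least 1")
    if abs(s) < 1e-12:
        return 1.0 / N
    # expm1(x) = exp(x) - 1 stays accurate for small |x|; the two minus
    # signs from numerator and denominator cancel.
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)

N = 100
for s in (-0.01, 0.0, 0.01):
    print(s, fixation_probability(s, N))
```

The deleterious case gives a probability that is smaller than the neutral 1/N but still greater than zero, which is the slide's conclusion.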

24 Mutation-Selection Balance destabilizing stabilizing Mutation: An individual mutates to a new variant Substitution: A mutant genotype appears and fixes in the population Most protein mutations slightly decrease stability (deleterious). A small number of mutations increase stability. Most deleterious mutations do not fix. Mutation-Selection Balance # deleterious substitutions = # beneficial substitutions (population genetics theory can be used to quantitatively understand when/how this balance occurs) (alternative explanation for why proteins are marginally stable) Why are proteins marginally stable? Taverna, Goldstein. Proteins: Structure, Function, and Bioinformatics 2002 Missense meanderings in sequence space: a biophysical view of protein evolution DePristo, Weinreich, Hartl. Nat Rev Genet 2005 Stability effects of mutations and protein evolvability Tokuriki, Tawfik. Current Opinion in Structural Biology 2009 How Protein Stability and New Functions Trade Off Tokuriki, Stricher, Serrano, Tawfik. PLoS Comput Biol 2008

25 Summary Many possible sequences lead to same fold Proteins in a common family/fold accumulate substitutions at a constant rate over time Most substitutions affect protein stability There is a dynamic balance of slightly deleterious (destabilizing) and slightly beneficial (stabilizing) substitutions over time. Marginal stability is maintained. This dynamic balance also involves: Compensatory mutations Epistatic interactions

26 Part II: Potts Models (Using Protein-sequence-variation to study structure) Outline Motivation and Background Parameterizing a Potts Model Applications of Potts models contacts in protein structure Compensatory mutations correlations in MSA columns?

27 Coevolutionary Analysis and Potts Models Correlated Mutations in a MSA imply Structural Interactions Long history (25 years) of Coevolutionary analysis: Detect Correlated positions, then predict contacts Recent Developments: Instead of modeling each residue pair individually, build a correlated statistical model of the MSA: The Potts model The model can be used for more than contact prediction Lövkvist et al, PRE 87, 2013

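28 How to measure correlations in an MSA. Positions: $i, j$. Residue types: $\alpha, \beta$. Bivariate marginal (frequency): $f^{ij}_{\alpha\beta}$, the fraction of sequences with residue $\alpha$ at position $i$ and residue $\beta$ at position $j$. Univariate marginal (frequency): $f^{i}_{\alpha}$. Correlations: $C^{ij}_{\alpha\beta} = f^{ij}_{\alpha\beta} - f^{i}_{\alpha} f^{j}_{\beta}$, that is, the observed pairwise frequency minus the pairwise frequency expected if the positions vary independently. $C^{ij}_{\alpha\beta} = 0$ if the two positions vary independently.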
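
A small sketch of how these quantities could be computed from an alignment: univariate and bivariate frequencies are counted per column and column pair, and the correlation is the observed pair frequency minus the product of the single-site frequencies. The function names and the two-column toy alignments are hypothetical, and no sequence weighting or pseudocounts are applied.

```python
from collections import Counter
from itertools import product

def univariate_marginals(msa, i):
    """Frequency of each residue type in column i."""
    counts = Counter(seq[i] for seq in msa)
    return {a: n / len(msa) for a, n in counts.items()}

def bivariate_marginals(msa, i, j):
    """Frequency of each residue pair (a at column i, b at column j)."""
    counts = Counter((seq[i], seq[j]) for seq in msa)
    return {ab: n / len(msa) for ab, n in counts.items()}

def correlations(msa, i, j):
    """Observed pair frequency minus the frequency expected under independence."""
    fi = univariate_marginals(msa, i)
    fj = univariate_marginals(msa, j)
    fij = bivariate_marginals(msa, i, j)
    return {(a, b): fij.get((a, b), 0.0) - fi[a] * fj[b]
            for a, b in product(fi, fj)}

correlated = ["AA", "AA", "BB", "BB"]     # columns always agree
independent = ["AA", "AB", "BA", "BB"]    # columns vary independently
print(correlations(correlated, 0, 1))
print(correlations(independent, 0, 1))
```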

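29 Pairwise measures of correlations in an MSA. Want a correlation score between position pairs (sum over residue types $\alpha, \beta$). Different scoring methods in the literature (two shown below). All are designed to give a score of 0 for independent variation. Mutual Information (MI): $MI^{ij} = \sum_{\alpha\beta} f^{ij}_{\alpha\beta} \log \frac{f^{ij}_{\alpha\beta}}{f^{i}_{\alpha} f^{j}_{\beta}}$. Can be interpreted as the negative log-likelihood (per sequence) of generating the observed pairwise distribution $f^{ij}$ when sampling from the independent distribution $f^{i} f^{j}$. Statistical Coupling (SCA): based on the probability of a residue type at position i, excluding sequences with a mutation at j; a bar denotes an average over all positions.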
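
As a sketch of the mutual information score, the following computes MI between two columns directly from residue counts, using the standard definition (sum over residue pairs of f_ij * log(f_ij / (f_i * f_j)), natural logarithm). The function name and toy alignments are hypothetical; no entropy or phylogeny corrections are applied.

```python
import math
from collections import Counter

def mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an MSA."""
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written in terms of raw counts
        ratio = n_ab * n / (count_i[a] * count_j[b])
        mi += p_ab * math.log(ratio)
    return mi

print(mutual_information(["AA", "AA", "BB", "BB"], 0, 1))  # fully correlated
print(mutual_information(["AA", "AB", "BA", "BB"], 0, 1))  # independent
```

Only observed residue pairs contribute, since terms with zero pair frequency vanish in the sum.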

30 Relationship to contacts. Top-ranking MI (and other) scores find the top contact with about a 70% true-positive rate, and the top 50 contacts at about 50%. Can this be improved? Protein 3D Structure Computed from Evolutionary Sequence Variation. Marks et al. PLoS ONE 2011. Mutual information without the influence of phylogeny or entropy dramatically improves residue contact prediction. Dunn, Wahl, Gloor. Bioinformatics 2008

31 Direct vs Indirect Correlations. Problem: MI, SCA, Cij can be thought of as local models of the correlation: they look at single pairs at a time. However, correlations can be caused by indirect interactions, or correlated networks, and these local models ignore this. Eg: position 7 interacts with 15, and 15 interacts with 25. Then 7 and 25 will be correlated even though they don't interact. Instead we want to make a global model of the correlations, in order to distinguish direct from indirect correlations. Idea: make a statistical model of the sequence as a whole (probability of a whole sequence) rather than of single pairs (probability of a pair of residues).

32 Potts Models: Origin & Motivation. How to model P(S)? P(S) describes the probability of generating sequence S, where S spans the entire sequence space. The most general form of P(S) is the set of probabilities of all sequences: a model with $q^L$ parameters (q residue types, sequence length L)! We can't directly measure P(S) from an MSA (since each sequence appears once), unlike bivariate marginals, which can be directly measured from an MSA. Solution: get the least biased distribution subject to constraints: the Maximum Entropy distribution P(S).

33 Maximum Entropy. The Maximum Entropy distribution P(S): maximizing entropy minimizes the amount of prior information built into the distribution. The number of model parameters will be equal to the number of constraints. Entropy of a distribution: $H[P] = -\sum_S P(S) \log P(S)$. In our case, set the constraint that P(S) gives the right bivariate marginals (pairwise correlation statistics). Bivariate marginals from P(S): $f^{ij}_{\alpha\beta} = \sum_S P(S)\, \delta_{S_i \alpha}\, \delta_{S_j \beta}$ (sum over the entire sequence space).

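34 Maximum Entropy. Entropy of a distribution: $H[P] = -\sum_S P(S) \log P(S)$. Constraints: $\sum_S P(S) = 1$ and $\sum_S P(S)\, \delta_{S_i \alpha}\, \delta_{S_j \beta} = f^{ij}_{\alpha\beta}$ for every pair $i < j$ and all residue types $\alpha, \beta$. Method of Lagrange multipliers: $\mathcal{L} = H[P] + \lambda_0 \left( \sum_S P(S) - 1 \right) + \sum_{i<j} \sum_{\alpha\beta} \lambda^{ij}_{\alpha\beta} \left( \sum_S P(S)\, \delta_{S_i \alpha}\, \delta_{S_j \beta} - f^{ij}_{\alpha\beta} \right)$, with Lagrange multipliers $\lambda_0$ and $\lambda^{ij}_{\alpha\beta}$, one per constraint. Maximize by solving $\partial \mathcal{L} / \partial P(S) = 0$ for all S.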

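35 Maximum Entropy. With Lagrange multipliers $\lambda_0$ (normalization) and $\lambda^{ij}_{\alpha\beta}$ (bivariate marginals), the maximum is solved by: $-\log P(S) - 1 + \lambda_0 + \sum_{i<j} \lambda^{ij}_{S_i S_j} = 0$. Rearrange to give: $P(S) = \frac{1}{Z} e^{-E(S)}$ (Boltzmann distribution), with $E(S) = -\sum_{i<j} \lambda^{ij}_{S_i S_j}$ ("statistical energy") and $Z = \sum_S e^{-E(S)}$ (normalization). This gives the Potts Model. The multipliers are conventionally rewritten as single-site fields and pairwise couplings, $E(S) = \sum_i h^i_{S_i} + \sum_{i<j} J^{ij}_{S_i S_j}$; the split between fields and couplings is a choice of gauge. Note: this model is named for its history in the physics of magnetic materials, and has many other applications.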

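36 Form of the Potts Model: $E(S) = \sum_i h^i_{S_i} + \sum_{i<j} J^{ij}_{S_i S_j}$, with $P(S) = e^{-E(S)} / Z$. Fields $h^i_\alpha$ (L x q parameters); couplings $J^{ij}_{\alpha\beta}$ (~(L x q)^2 parameters). L = sequence length (eg 200); q = # of residue types (eg 20). Note the similarity of fields to PSSMs. Example sequence: A G A A R G I V F A A R A A F A. Potts parameters are interpreted as energy contributions from each position/pair.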

37 Form of the Potts Model. [Figure: sequence landscape relating statistical (Potts) energy to sequence prevalence, with fields and couplings illustrated on the example sequence A G A A R G I V F A A R A A F A. Image: Dill] Given known values for the fields and couplings: P(S) gives us a probability for any sequence (can compute other statistics); E(S) gives us a statistical energy landscape (can model the effect of mutations, with epistasis + compensation); coupling values give us info about direct interactions between positions (without indirect interactions).
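
To make the use of fields and couplings concrete, here is a toy sketch that evaluates a statistical energy and probability by brute-force enumeration, and shows epistasis as a context-dependent mutation cost. It assumes the convention E(S) = sum of fields + sum of couplings with P(S) = exp(-E(S)) / Z; the two-letter alphabet, the length of 3, and all parameter values are hypothetical, and enumeration is only feasible for such tiny models.

```python
import math
from itertools import product

ALPHABET = "AB"   # toy alphabet (q = 2); real proteins use q of about 20
L = 3             # toy sequence length

# Hypothetical parameters: zero fields, and one coupling that favors
# matching residues at positions 0 and 1 (lower energy = more probable).
h = {(i, a): 0.0 for i in range(L) for a in ALPHABET}
J = {(i, j, a, b): 0.0
     for i in range(L) for j in range(i + 1, L)
     for a in ALPHABET for b in ALPHABET}
J[(0, 1, "A", "A")] = -1.0
J[(0, 1, "B", "B")] = -1.0

def energy(seq):
    """Statistical energy: sum of fields plus sum of pairwise couplings."""
    e = sum(h[(i, a)] for i, a in enumerate(seq))
    e += sum(J[(i, j, seq[i], seq[j])]
             for i in range(L) for j in range(i + 1, L))
    return e

all_seqs = ["".join(s) for s in product(ALPHABET, repeat=L)]
Z = sum(math.exp(-energy(s)) for s in all_seqs)   # normalization

def probability(seq):
    return math.exp(-energy(seq)) / Z

def mutation_cost(seq, pos, new):
    """Energy change from mutating position pos of seq to residue new."""
    mutant = seq[:pos] + new + seq[pos + 1:]
    return energy(mutant) - energy(seq)

# The same mutation (position 0 -> B) is unfavorable in one background
# and favorable in another: the effect depends on the residue at position 1.
print(mutation_cost("AAA", 0, "B"))
print(mutation_cost("ABA", 0, "B"))
```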

38 Parameterizing the Model given an MSA. Above we found the functional form of the Maximum Entropy distribution, but we did not discuss how to find the values of the parameters. This is actually a challenging task. We need to find the set of values which satisfy the constraints on the marginals, but there is no obvious way to do so: the marginals are a non-trivial function of the Potts parameters.

39 Parameterizing the Model given an MSA A number of different numerical methods and approximations have been developed to find the parameters: Belief Propagation, Susceptibility Propagation Mean Field inference Pseudolikelihood Methods + Conjugate Gradient Descent Cluster Expansion Monte Carlo + Quasi-Newton Optimization This is a computationally intensive task.

40 Parameterizing the Model given an MSA. Flavor of the algorithms: the problem can be framed as a Maximum Likelihood inference. Define a Likelihood function (the probability of the MSA according to the model) which has a maximum when the constraints are satisfied. Conjugate Gradient methods, Quasi-Newton methods: start with an initial guess for the parameters; compute the local gradient of the Likelihood; take a small step in that direction (update parameters); repeat.
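
The iterative loop above can be sketched for a toy model. This is not one of the methods named on the slides: it uses plain gradient ascent with exact enumeration of all sequences, which only works for tiny alphabets and lengths. It assumes a couplings-only model with P(S) proportional to exp(-E(S)), for which the log-likelihood gradient with respect to a coupling is (model marginal - target marginal). All names, the step size, and the "true" parameters used to generate target marginals are hypothetical.

```python
import math
from itertools import product

ALPHABET = "AB"
L = 3
PAIRS = [(i, j) for i in range(L) for j in range(i + 1, L)]
SEQS = list(product(ALPHABET, repeat=L))
KEYS = [(i, j, a, b) for i, j in PAIRS for a in ALPHABET for b in ALPHABET]

def model_probabilities(J):
    """Exact P(S) = exp(-E(S)) / Z by enumerating all q**L sequences."""
    weights = [math.exp(-sum(J[(i, j, s[i], s[j])] for i, j in PAIRS))
               for s in SEQS]
    Z = sum(weights)
    return [w / Z for w in weights]

def bivariate_marginals(probs):
    """Pairwise marginals implied by a distribution over all sequences."""
    f = dict.fromkeys(KEYS, 0.0)
    for s, p in zip(SEQS, probs):
        for i, j in PAIRS:
            f[(i, j, s[i], s[j])] += p
    return f

def fit(target, steps=2000, rate=0.5):
    """Gradient ascent on the log-likelihood, starting from J = 0."""
    J = dict.fromkeys(KEYS, 0.0)
    for _ in range(steps):
        model = bivariate_marginals(model_probabilities(J))
        # Raise the energy of pairs the model over-represents, lower it
        # for pairs it under-represents.
        for k in KEYS:
            J[k] += rate * (model[k] - target[k])
    return J

# Target marginals generated from a known toy model, so a perfect fit exists.
J_true = dict.fromkeys(KEYS, 0.0)
J_true[(0, 1, "A", "A")] = -1.0
J_true[(1, 2, "B", "B")] = -0.5
target = bivariate_marginals(model_probabilities(J_true))

J_fit = fit(target)
fitted = bivariate_marginals(model_probabilities(J_fit))
max_error = max(abs(fitted[k] - target[k]) for k in KEYS)
print(max_error)
```

The fitted couplings need not equal `J_true` entry by entry, because different parameter sets (gauges) give the same distribution; what the fit reproduces is the marginals.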

41 Aside: Correction for Phylogeny and Sampling Biases. Sequences may be phylogenetically related, so we may have a biased sample. This may give the appearance of correlations even when there are none. Eg: wild type: AAAAAAAAAAAAAAAAAA; single mutants: AAAAABAAAAAAAAAAAA and AAAAAAAAAABAAAAAAA; double mutant: AAAAABAAAABAAAAAAA. If we oversample the double mutant, we overestimate the correlation. One solution: weight each sequence by how many sequences are similar to it: weight $w_s = 1 / (\text{number of sequences similar to } s)$, effective # of seqs $N_{\mathrm{eff}} = \sum_s w_s$, and marginals computed as weighted averages, e.g. $f^{i}_{\alpha} = \frac{1}{N_{\mathrm{eff}}} \sum_s w_s\, \delta_{S^{(s)}_i \alpha}$.
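
A minimal sketch of this weighting scheme: each sequence gets weight 1 / (number of sequences similar to it, itself included), and frequencies are computed as weighted averages normalized by the effective number of sequences. The 80% identity cutoff used to define "similar" is an assumption, as are the function names and the toy alignment.

```python
def sequence_weights(msa, identity_threshold=0.8):
    """Weight 1 / (number of sequences at least threshold-identical to s).

    The count includes s itself, so every weight is at most 1.
    """
    def identity(a, b):
        return sum(x == y for x, y in zip(a, b)) / len(a)

    weights = []
    for s in msa:
        neighbors = sum(identity(s, t) >= identity_threshold for t in msa)
        weights.append(1.0 / neighbors)
    return weights

def weighted_frequency(msa, weights, i, residue):
    """Weighted univariate marginal of `residue` in column i."""
    n_eff = sum(weights)
    return sum(w for s, w in zip(msa, weights) if s[i] == residue) / n_eff

# Three copies of one sequence plus one unrelated sequence (hypothetical).
msa = ["AAAAAAAAAA", "AAAAAAAAAA", "AAAAAAAAAA", "BBBBBBBBBB"]
w = sequence_weights(msa)
print(w, sum(w))                            # weights and effective # of seqs
print(weighted_frequency(msa, w, 0, "B"))   # unweighted frequency would be 0.25
```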

42 Parameterizing the Model given an MSA. Summary of Inference Procedure: 1) Obtain an MSA (eg from Pfam). 2) Apply phylogenetic weighting. Need > 1000 effective sequences for precise marginals. 3) Compute the bivariate (and univariate) marginals of the data. 4) Perform parameter inference (eg Gradient Descent) given the bivariate marginals. End up with a set of parameters which we can use in a number of ways.

43 Potts Model Applications Contact Prediction Mutant stability Contact maps Ab-initio Structure Prediction Free Energy (Conformational) Landscapes Fitness landscapes Melting temperature Seq. Prevalence Electrostatic coupling Structure prediction Viral fitness Enzyme fitness Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness Levy, Haldane, Flynn. COSB 2017

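44 Application 1: Contact Prediction. Want to get an interaction score (like MI or SCA) but using the Potts model. Want to summarize the coupling values for each position pair (sum/average over residue types $\alpha, \beta$). Frobenius norm of couplings: $F^{ij} = \sqrt{\sum_{\alpha\beta} \left( J^{ij}_{\alpha\beta} \right)^2}$. APC correction (removes 'background'): $F^{ij}_{\mathrm{APC}} = F^{ij} - \bar{F}^{i} \bar{F}^{j} / \bar{F}$, where $\bar{F}^{i}$ is the average of $F^{ik}$ over positions k and $\bar{F}$ is the average over all pairs. Direct Information: similar to MI, but will exclude indirect interactions since it is computed using direct couplings (some technical details related to gauges not discussed here).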
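
The scoring step can be sketched as follows, assuming the usual definitions: a Frobenius norm over the q x q coupling block of each position pair, followed by the average product correction (score minus row mean times column mean divided by overall mean, with means taken over off-diagonal entries). The data layout, function names, and coupling values are hypothetical, and the toy couplings are not transformed to any particular gauge, which a real analysis would need to handle.

```python
import math

def frobenius_scores(J, L):
    """Symmetric L x L matrix of Frobenius norms of each coupling block."""
    F = [[0.0] * L for _ in range(L)]
    for (i, j), block in J.items():
        score = math.sqrt(sum(x * x for row in block for x in row))
        F[i][j] = F[j][i] = score
    return F

def apc_correct(F):
    """Average product correction: F[i][j] - mean_i * mean_j / mean_all."""
    L = len(F)
    row_mean = [sum(F[i][k] for k in range(L) if k != i) / (L - 1)
                for i in range(L)]
    overall = sum(row_mean) / L   # assumes at least one nonzero coupling
    return [[0.0 if i == j else F[i][j] - row_mean[i] * row_mean[j] / overall
             for j in range(L)] for i in range(L)]

# Hypothetical couplings for L = 3 positions and q = 2 residue types.
J = {
    (0, 1): [[-1.0, 0.0], [0.0, -1.0]],   # strong coupling
    (0, 2): [[0.1, 0.0], [0.0, 0.1]],     # weak coupling
    (1, 2): [[0.1, 0.0], [0.0, 0.1]],     # weak coupling
}
F = frobenius_scores(J, 3)
scores = apc_correct(F)
print(scores[0][1], scores[0][2])
```

Ranking position pairs by the corrected score, highest first, gives the list of predicted contacts.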

45 Application 1: Contact Prediction Contact Map from Potts Model Contact Map from PDB structures (Protein-Kinase domain) Can achieve 80% True Positive rate for top 200 contacts.

46 Application 1: Contact Prediction. DI gives many more True Positives (red) than MI. DI distinguishes direct from indirect interactions; MI does not. [Figure panels: direct interactions, indirect interactions, non-interacting.] Identification of direct residue contacts in protein-protein interaction by message passing. Weigt, White, Szurmant, Hoch, Hwa. PNAS 2009

47 Application 1: Structure Prediction. Idea: use predicted contacts as input to further algorithms: NMR (distance geometry: contact map → structure); Go models (coarse-grained MD). Genomics-aided structure prediction. Sułkowska, Morcos, Weigt, Hwa, Onuchic. PNAS 2012

48 Application 2: Free Energy and Conformational Landscapes. Potts energy E(S) reflects experimental mutant stability measurements and melting temperatures. Biased MD/Go simulations using contacts as bias/constraints can uncover the conformational landscape. Quantification of the effect of mutations using a global probability model of natural sequence variation. Hopf, Ingraham, Poelwijk, Springer, Sander, Marks Oct 2015. Coevolutionary signals across protein lineages help capture multiple protein conformations. Morcos, Jana, Hwa, Onuchic. PNAS 2013. Coevolutionary information, protein folding landscapes, and the thermodynamics of natural selection. Morcos, Schafer, Cheng, Onuchic, Wolynes. PNAS 2014

49 Application 3: Fitness Landscapes Enzyme fitness Viral fitness Potts energy E(S) reflects fitness of sequences and mutants Potts model can describe epistatic effects and compensatory mutations Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1 Molecular Biology and Evolution 2015 The Fitness Landscape of HIV-1 Gag: Advanced Modeling Approaches and Validation of Model Predictions by in vitro Testing PLoS Comput Biol 2014


More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Computational Biology From The Perspective Of A Physical Scientist

Computational Biology From The Perspective Of A Physical Scientist Computational Biology From The Perspective Of A Physical Scientist Dr. Arthur Dong PP1@TUM 26 November 2013 Bioinformatics Education Curriculum Math, Physics, Computer Science (Statistics and Programming)

More information

Many proteins spontaneously refold into native form in vitro with high fidelity and high speed.

Many proteins spontaneously refold into native form in vitro with high fidelity and high speed. Macromolecular Processes 20. Protein Folding Composed of 50 500 amino acids linked in 1D sequence by the polypeptide backbone The amino acid physical and chemical properties of the 20 amino acids dictate

More information

Bioinformatics. Dept. of Computational Biology & Bioinformatics

Bioinformatics. Dept. of Computational Biology & Bioinformatics Bioinformatics Dept. of Computational Biology & Bioinformatics 3 Bioinformatics - play with sequences & structures Dept. of Computational Biology & Bioinformatics 4 ORGANIZATION OF LIFE ROLE OF BIOINFORMATICS

More information

Population Genetics I. Bio

Population Genetics I. Bio Population Genetics I. Bio5488-2018 Don Conrad dconrad@genetics.wustl.edu Why study population genetics? Functional Inference Demographic inference: History of mankind is written in our DNA. We can learn

More information

Quantifying sequence similarity

Quantifying sequence similarity Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity

More information

Compartmentalization detection

Compartmentalization detection Compartmentalization detection Selene Zárate Date Viruses and compartmentalization Virus infection may establish itself in a variety of the different organs within the body and can form somewhat separate

More information

arxiv: v1 [q-bio.qm] 7 Aug 2017

arxiv: v1 [q-bio.qm] 7 Aug 2017 HIGHER ORDER EPISTASIS AND FITNESS PEAKS KRISTINA CRONA AND MENGMING LUO arxiv:1708.02063v1 [q-bio.qm] 7 Aug 2017 ABSTRACT. We show that higher order epistasis has a substantial impact on evolutionary

More information
