Nucleotide substitution models

1 Nucleotide substitution models Alexander Churbanov University of Wyoming, Laramie Nucleotide substitution models p. 1/23

2 Jukes and Cantor's model [1]

- The simplest symmetrical model of DNA evolution
- All sites change independently
- All sites have the same stochastic process working at them

3 Probability for nucleotide (1)

- Assume that the nucleotide residing at a certain site in a DNA sequence is A.
- Consider the probability $p_{A(t)}$ that the site will be occupied by A at time $t$.
- Since we start with A, $p_{A(0)} = 1$.
- At time 1 the probability of still having A is $p_{A(1)} = 1 - 3\alpha$, where $\alpha$ is the probability per unit time of a substitution to each of the other three nucleotides.

4 Probability for nucleotide (2)

At time 2,

$$p_{A(2)} = (1 - 3\alpha)\,p_{A(1)} + \alpha\,\bigl(1 - p_{A(1)}\bigr),$$

which combines two cases:

1. The nucleotide has remained unchanged, with probability $1 - 3\alpha$.
2. The nucleotide did change to T, C, or G, but subsequently reverted to A, with probability $\alpha$.

The following recurrence holds:

$$p_{A(t+1)} = (1 - 3\alpha)\,p_{A(t)} + \alpha\,\bigl(1 - p_{A(t)}\bigr),$$

$$p_{A(t+1)} - p_{A(t)} = -3\alpha\,p_{A(t)} + \alpha\,\bigl(1 - p_{A(t)}\bigr),$$

$$\Delta p_{A(t)} = -3\alpha\,p_{A(t)} + \alpha\,\bigl(1 - p_{A(t)}\bigr) = -4\alpha\,p_{A(t)} + \alpha.$$
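The recurrence can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `iterate_p_same` and the value of `alpha` are illustrative assumptions. It iterates the one-step update and shows that the probability of observing A decays from 1 toward 1/4.

```python
def iterate_p_same(alpha, steps, p0=1.0):
    """Iterate p(t+1) = (1 - 3*alpha) * p(t) + alpha * (1 - p(t)).

    Returns the list [p(0), p(1), ..., p(steps)].
    """
    p = p0
    history = [p]
    for _ in range(steps):
        # Stay on A with probability 1 - 3*alpha, or return to A
        # from one of the other three nucleotides with probability alpha.
        p = (1.0 - 3.0 * alpha) * p + alpha * (1.0 - p)
        history.append(p)
    return history


if __name__ == "__main__":
    alpha = 0.01  # hypothetical per-step substitution probability
    history = iterate_p_same(alpha, steps=2000)
    for t in (0, 1, 2, 10, 100, 2000):
        print(t, history[t])
```

Under these assumptions the sequence starts at 1, takes the value $1 - 3\alpha$ after one step, and levels off at 1/4, the value at which $\Delta p_{A(t)} = 0$.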

6 Continuous time

$$\frac{dp_{A(t)}}{dt} = -4\alpha\,p_{A(t)} + \alpha.$$

This first-order linear differential equation has the solution

$$p_{A(t)} = \frac{1}{4} + \left(p_{A(0)} - \frac{1}{4}\right) e^{-4\alpha t}.$$

The initial probability is $p_{A(0)} = 1$, therefore

$$p_{A(t)} = \frac{1}{4} + \frac{3}{4}\,e^{-4\alpha t}.$$

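7 Probabilities

If the initial nucleotide is not A, then $p_{A(0)} = 0$ and

$$p_{A(t)} = \frac{1}{4} - \frac{1}{4}\,e^{-4\alpha t}.$$

Generalizing to nucleotides $i$ and $j$, where $i \neq j$:

$$p_{ii(t)} = \frac{1}{4} + \frac{3}{4}\,e^{-4\alpha t}, \qquad p_{ij(t)} = \frac{1}{4} - \frac{1}{4}\,e^{-4\alpha t}.$$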
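As an illustration, the two transition probabilities can be evaluated directly. This is a minimal sketch rather than code from the lecture; the function name `jc_transition_probs` and the parameter values are assumptions chosen for the example.

```python
import math


def jc_transition_probs(alpha, t):
    """Return (p_ii, p_ij) for the one-parameter model at time t.

    p_ii: probability that a site holds the same nucleotide as at time 0.
    p_ij: probability that it holds one specific different nucleotide.
    """
    decay = math.exp(-4.0 * alpha * t)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return p_same, p_diff


if __name__ == "__main__":
    alpha = 0.01  # hypothetical substitution rate
    for t in (0.0, 10.0, 100.0, 1000.0):
        p_same, p_diff = jc_transition_probs(alpha, t)
        # One "same" outcome plus three "different" outcomes must sum to 1.
        print(t, p_same, p_diff, p_same + 3.0 * p_diff)
```

The last printed column is a sanity check: $p_{ii(t)} + 3\,p_{ij(t)} = 1$ for every $t$, and both probabilities approach 1/4 as $t$ grows.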

8 Graphical interpretation

9 Sequence similarity (1)

10 Sequence similarity (2)

A common measure of sequence similarity is the proportion of identical nucleotides between the two sequences under study. The expected value of this proportion equals the probability $I(t)$ that the nucleotide at a given site at time $t$ is the same in both sequences. The cases include nucleotide conservation, $p^2_{ii(t)}$, and parallel substitutions, $p^2_{ij(t)}$:

$$I(t) = p^2_{AA(t)} + p^2_{AT(t)} + p^2_{AC(t)} + p^2_{AG(t)},$$

$$I(t) = \frac{1}{4} + \frac{3}{4}\,e^{-8\alpha t}.$$

11 Estimating substitutions (1)

The probability that the two sequences differ at a site at time $t$ is $p = 1 - I(t)$:

$$p = \frac{3}{4}\left(1 - e^{-8\alpha t}\right),$$

$$8\alpha t = -\ln\left(1 - \frac{4}{3}\,p\right).$$

12 Estimating substitutions (2)

We estimate $K$, the actual number of substitutions per site since the divergence of the two sequences. In the one-parameter model, $K = 2(3\alpha t)$, where $3\alpha t$ is the expected number of substitutions per site in one lineage:

$$K = -\frac{3}{4}\ln\left(1 - \frac{4}{3}\,p\right).$$
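The estimator for $K$ translates into a few lines of code. The sketch below is an illustration only: the function name `jukes_cantor_distance` is hypothetical, and the check that $p < 3/4$ is an added assumption that guards against the logarithm being undefined.

```python
import math


def jukes_cantor_distance(p):
    """Estimate K, substitutions per site, from the proportion p of differing sites.

    Implements K = -(3/4) * ln(1 - (4/3) * p), which is only defined for p < 3/4.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    for p in (0.0, 0.05, 0.2, 0.5):
        print(p, jukes_cantor_distance(p))
```

For small $p$ the estimate is close to $p$ itself; as $p$ approaches 3/4 the correction for multiple and parallel substitutions grows without bound.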

13 Kimura model [2]

The method has the merit of allowing for the possibility that transition-type substitutions (with rate $\alpha$) occur more frequently than transversion-type substitutions (with rate $\beta$).

14 Kimura model [2]

Class                Base pairs (frequency)                        Total
Same                 UU (R_1)   CC (R_2)   AA (R_3)   GG (R_4)     R
Different, Type I    UC (P_1)   CU (P_1)   AG (P_2)   GA (P_2)     P
Different, Type II   UA (Q_1)   AU (Q_1)   UG (Q_2)   GU (Q_2)     Q
                     CA (Q_3)   AC (Q_3)   CG (Q_4)   GC (Q_4)

15 Kimura model (1)

- The total rate of substitutions per site per year is $k = \alpha + 2\beta$.
- $P$ is the probability that homologous sites show a type I difference.
- $Q$ is the probability that homologous sites show a type II difference.
- $R$ is the probability that homologous sites are the same.

We denote the probability of identity at homologous sites at time $T$ as

$$R(T) = 1 - P(T) - Q(T).$$
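In practice $P$, $Q$ and $R$ are estimated as observed frequencies in a pairwise alignment. The following sketch assumes two pre-aligned sequences of equal length; the function name `difference_frequencies`, the treatment of T as U, and the decision to skip positions with gaps or ambiguity codes are illustrative choices, not something specified in the slides.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CU")


def difference_frequencies(seq1, seq2):
    """Return (P, Q, R) for two aligned sequences.

    P: fraction of compared sites with a type I difference (UC or AG pairs).
    Q: fraction with a type II difference (purine paired with pyrimidine).
    R: fraction of identical sites, so that R = 1 - P - Q.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Treat DNA and RNA alike by mapping T to U (an assumption of this sketch).
    s1 = seq1.upper().replace("T", "U")
    s2 = seq2.upper().replace("T", "U")
    valid = PURINES | PYRIMIDINES
    compared = type_one = type_two = 0
    for a, b in zip(s1, s2):
        if a not in valid or b not in valid:
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            type_one += 1  # both purines or both pyrimidines
        else:
            type_two += 1  # one purine, one pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    P = type_one / compared
    Q = type_two / compared
    return P, Q, 1.0 - P - Q


if __name__ == "__main__":
    # Made-up sequences: one G/A difference and one G/C difference in 8 sites.
    print(difference_frequencies("AUGCAUGC", "AUACAUCC"))
```

The two example sequences are invented for illustration; real input would come from an alignment of homologous sequences.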

16 Kimura model (2)

We can derive the equations for $P$ and $Q$ at time $T + \Delta T$ in terms of $P$, $Q$ and $R$ at time $T$. We distinguish three ways in which the pair UC (U at a homologous position in organism 1 corresponding to C in organism 2) at time $T + \Delta T$ is derived from the various base pairs at time $T$.

1. Pair UC is derived from UC. The probability of a substitution at one site in a short time interval $\Delta T$ is $(\alpha + 2\beta)\Delta T$, so the probability of no change occurring at both homologous sites is $[1 - (\alpha + 2\beta)\Delta T]^2$. The contribution of this case is $[1 - (\alpha + 2\beta)\Delta T]^2\,P_1(T)$.

17 Kimura model (3)

2. Pair UC is derived from either UU or CC, with probability $\alpha\,\Delta T\,[R_1(T) + R_2(T)]$.
3. Pair UC is derived from UA, UG, AC or GC, with probability

$$\beta\,\Delta T\,[Q_1(T) + Q_2(T) + Q_3(T) + Q_4(T)] = \beta\,\Delta T\,\frac{Q(T)}{2}.$$

18 Kimura model (4)

Combining the contributions from the different classes that result in UC, and disregarding terms in $(\Delta T)^2$, we get

$$P_1(T + \Delta T) = [1 - (2\alpha + 4\beta)\Delta T]\,P_1(T) + \alpha\,\Delta T\,[R_1(T) + R_2(T)] + \beta\,\Delta T\,\frac{Q(T)}{2}. \tag{1}$$

Similarly, for the base pair AG we get

$$P_2(T + \Delta T) = [1 - (2\alpha + 4\beta)\Delta T]\,P_2(T) + \alpha\,\Delta T\,[R_3(T) + R_4(T)] + \beta\,\Delta T\,\frac{Q(T)}{2}. \tag{2}$$

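19 Kimura model (5)

Summing equations (1) and (2), noting that $P(T) = 2P_1(T) + 2P_2(T)$, and substituting $R(T) = 1 - P(T) - Q(T)$, we get

$$\frac{\Delta P(T)}{\Delta T} = 2\alpha - 4(\alpha + \beta)\,P(T) - 2(\alpha - \beta)\,Q(T), \tag{3}$$

and, by the same argument applied to the type II pairs,

$$\frac{\Delta Q(T)}{\Delta T} = 4\beta - 8\beta\,Q(T). \tag{4}$$

Converting (3) and (4) to the continuous case, we get

$$\frac{dP(T)}{dT} = 2\alpha - 4(\alpha + \beta)\,P(T) - 2(\alpha - \beta)\,Q(T),$$

$$\frac{dQ(T)}{dT} = 4\beta - 8\beta\,Q(T).$$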

20 Kimura model (5)

The solutions of these equations that satisfy the initial condition $P(0) = Q(0) = 0$, i.e. no base difference exists at $T = 0$, are

$$P(T) = \frac{1}{4} - \frac{1}{2}\,e^{-4(\alpha + \beta)T} + \frac{1}{4}\,e^{-8\beta T},$$

$$Q(T) = \frac{1}{2} - \frac{1}{2}\,e^{-8\beta T}.$$
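One way to gain confidence in the closed-form solution is to integrate the two differential equations numerically and compare. The sketch below uses a plain forward Euler scheme, which is a choice made here for simplicity and not a method taken from the slides; the function names and the values of `alpha`, `beta` and `T` are hypothetical.

```python
import math


def kimura_closed_form(alpha, beta, T):
    """Closed-form P(T), Q(T) with P(0) = Q(0) = 0."""
    P = 0.25 - 0.5 * math.exp(-4.0 * (alpha + beta) * T) + 0.25 * math.exp(-8.0 * beta * T)
    Q = 0.5 - 0.5 * math.exp(-8.0 * beta * T)
    return P, Q


def kimura_euler(alpha, beta, T, steps=20000):
    """Integrate dP/dT and dQ/dT from 0 to T with forward Euler steps."""
    dT = T / steps
    P = Q = 0.0
    for _ in range(steps):
        dP = 2.0 * alpha - 4.0 * (alpha + beta) * P - 2.0 * (alpha - beta) * Q
        dQ = 4.0 * beta - 8.0 * beta * Q
        P += dP * dT
        Q += dQ * dT
    return P, Q


if __name__ == "__main__":
    alpha, beta, T = 0.02, 0.005, 10.0  # illustrative values only
    print("closed form:", kimura_closed_form(alpha, beta, T))
    print("Euler:      ", kimura_euler(alpha, beta, T))
```

With a small enough step the two results agree to several decimal places, which is consistent with the closed form solving the system under the stated initial condition.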

21 Kimura model (6)

It follows that

$$4(\alpha + \beta)T = -\ln\bigl(1 - 2P(T) - Q(T)\bigr) \tag{5}$$

and

$$8\beta T = -\ln\bigl(1 - 2Q(T)\bigr), \tag{6}$$

so that

$$4\alpha T = -\ln\bigl(1 - 2P(T) - Q(T)\bigr) + \frac{1}{2}\ln\bigl(1 - 2Q(T)\bigr). \tag{7}$$

22 Kimura model (7)

Since the evolutionary rate is $k = \alpha + 2\beta$, the total number of substitutions per site between the two diverged sequences is

$$K = 2Tk = 2\alpha T + 4\beta T.$$

Omitting the argument $T$ and using equations (6) and (7), we obtain

$$K = -\frac{1}{2}\ln\left\{(1 - 2P - Q)\sqrt{1 - 2Q}\right\}.$$
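The two-parameter estimator can be sketched in code as follows. The function name `kimura_distance` and the domain check are assumptions added for the example; the check simply rejects inputs for which the logarithm or the square root would be undefined.

```python
import math


def kimura_distance(P, Q):
    """Estimate K from the type I (P) and type II (Q) difference frequencies.

    Implements K = -(1/2) * ln((1 - 2P - Q) * sqrt(1 - 2Q)).
    """
    a = 1.0 - 2.0 * P - Q
    b = 1.0 - 2.0 * Q
    if a <= 0.0 or b <= 0.0:
        raise ValueError("distance undefined: need 1 - 2P - Q > 0 and 1 - 2Q > 0")
    return -0.5 * math.log(a * math.sqrt(b))


if __name__ == "__main__":
    # Hypothetical observed frequencies of type I and type II differences.
    print(kimura_distance(0.10, 0.05))
```

A useful consistency check is to generate $P$ and $Q$ from the closed-form solution for chosen $\alpha$, $\beta$ and $T$, and confirm that the function returns $2(\alpha + 2\beta)T$.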

23 References

[1] T.H. Jukes and C.R. Cantor, Evolution of protein molecules, in: Mammalian Protein Metabolism (H.N. Munro, ed.), Academic Press, New York, 1969.
[2] M. Kimura, A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences, Journal of Molecular Evolution 16 (1980).

EVOLUTIONARY DISTANCE MODEL BASED ON DIFFERENTIAL EQUATION AND MARKOV PROCESS

EVOLUTIONARY DISTANCE MODEL BASED ON DIFFERENTIAL EQUATION AND MARKOV PROCESS August 0 Vol 4 No 005-0 JATIT & LLS All rights reserved ISSN: 99-8645 wwwjatitorg E-ISSN: 87-95 EVOLUTIONAY DISTANCE MODEL BASED ON DIFFEENTIAL EUATION AND MAKOV OCESS XIAOFENG WANG College of Mathematical

More information

Understanding relationship between homologous sequences

Understanding relationship between homologous sequences Molecular Evolution Molecular Evolution How and when were genes and proteins created? How old is a gene? How can we calculate the age of a gene? How did the gene evolve to the present form? What selective

More information

Molecular Population Genetics

Molecular Population Genetics Molecular Population Genetics The 10 th CJK Bioinformatics Training Course in Jeju, Korea May, 2011 Yoshio Tateno National Institute of Genetics/POSTECH Top 10 species in INSDC (as of April, 2011) CONTENTS

More information

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

More information

Computational Biology and Chemistry

Computational Biology and Chemistry Computational Biology and Chemistry 33 (2009) 245 252 Contents lists available at ScienceDirect Computational Biology and Chemistry journal homepage: www.elsevier.com/locate/compbiolchem Research Article

More information

Sequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment

Sequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the log-likelihood ratio of

More information

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral

More information

Evolutionary Change in Nucleotide Sequences. Lecture 3

Evolutionary Change in Nucleotide Sequences. Lecture 3 Evolutionary Change in Nucleotide Sequences Lecture 3 1 So far, we described the evolutionary process as a series of gene substitutions in which new alleles, each arising as a mutation ti in a single individual,

More information

What Is Conservation?

What Is Conservation? What Is Conservation? Lee A. Newberg February 22, 2005 A Central Dogma Junk DNA mutates at a background rate, but functional DNA exhibits conservation. Today s Question What is this conservation? Lee A.

More information

Lecture Notes: Markov chains

Lecture Notes: Markov chains Computational Genomics and Molecular Biology, Fall 5 Lecture Notes: Markov chains Dannie Durand At the beginning of the semester, we introduced two simple scoring functions for pairwise alignments: a similarity

More information

Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)

Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from

More information

Molecular Evolution and Comparative Genomics

Molecular Evolution and Comparative Genomics Molecular Evolution and Comparative Genomics --- the phylogenetic HMM model 10-810, CMB lecture 5---Eric Xing Some important dates in history (billions of years ago) Origin of the universe 15 ±4 Formation

More information

How should we go about modeling this? Model parameters? Time Substitution rate Can we observe time or subst. rate? What can we observe?

How should we go about modeling this? Model parameters? Time Substitution rate Can we observe time or subst. rate? What can we observe? How should we go about modeling this? gorilla GAAGTCCTTGAGAAATAAACTGCACACACTGG orangutan GGACTCCTTGAGAAATAAACTGCACACACTGG Model parameters? Time Substitution rate Can we observe time or subst. rate? What

More information

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological

More information

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University Phylogenetics: Distance Methods COMP 571 - Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distance-based methods Evolutionary Models and Distance Correction

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/26

Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/26 Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 27. Phylogeny methods, part 4 (Models of DNA and

More information

Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22

Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22 Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 24. Phylogeny methods, part 4 (Models of DNA and

More information

Markov Models & DNA Sequence Evolution

Markov Models & DNA Sequence Evolution 7.91 / 7.36 / BE.490 Lecture #5 Mar. 9, 2004 Markov Models & DNA Sequence Evolution Chris Burge Review of Markov & HMM Models for DNA Markov Models for splice sites Hidden Markov Models - looking under

More information

Trade Patterns, Production networks, and Trade and employment in the Asia-US region

Trade Patterns, Production networks, and Trade and employment in the Asia-US region Trade Patterns, Production networks, and Trade and employment in the Asia-U region atoshi Inomata Institute of Developing Economies ETRO Development of cross-national production linkages, 1985-2005 1985

More information

Quantifying sequence similarity

Quantifying sequence similarity Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity

More information

Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics Maximum Likelihood in Phylogenetics June 1, 2009 Smithsonian Workshop on Molecular Evolution Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut, Storrs, CT Copyright 2009

More information

BIOINFORMATICS TRIAL EXAMINATION MASTERS KT-OR

BIOINFORMATICS TRIAL EXAMINATION MASTERS KT-OR BIOINFORMATICS KT Maastricht University Faculty of Humanities and Science Knowledge Engineering Study TRIAL EXAMINATION MASTERS KT-OR Examiner: R.L. Westra Date: March 30, 2007 Time: 13:30 15:30 Place:

More information

Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood

Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood For: Prof. Partensky Group: Jimin zhu Rama Sharma Sravanthi Polsani Xin Gong Shlomit klopman April. 7. 2003 Table of Contents Introduction...3

More information

Lecture 4. Models of DNA and protein change. Likelihood methods

Lecture 4. Models of DNA and protein change. Likelihood methods Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/36

More information

Evolutionary Analysis of Viral Genomes

Evolutionary Analysis of Viral Genomes University of Oxford, Department of Zoology Evolutionary Biology Group Department of Zoology University of Oxford South Parks Road Oxford OX1 3PS, U.K. Fax: +44 1865 271249 Evolutionary Analysis of Viral

More information

Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).

Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). 1 Bioinformatics: In-depth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff

More information

Michael Yaffe Lecture #4 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D

Michael Yaffe Lecture #4 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D 7.91 Lecture #4 Database Searching & Molecular Phylogenetics Michael Yaffe A B C D A B C D (((A,B)C)D) Outline FASTA, Blast searching, Smith-Waterman Psi-Blast Review of enomic DNA structure Substitution

More information

Taming the Beast Workshop

Taming the Beast Workshop Workshop David Rasmussen & arsten Magnus June 27, 2016 1 / 31 Outline of sequence evolution: rate matrices Markov chain model Variable rates amongst different sites: +Γ Implementation in BES2 2 / 31 genotype

More information

Inferring Phylogenetic Trees. Distance Approaches. Representing distances. in rooted and unrooted trees. The distance approach to phylogenies

Inferring Phylogenetic Trees. Distance Approaches. Representing distances. in rooted and unrooted trees. The distance approach to phylogenies Inferring Phylogenetic Trees Distance Approaches Representing distances in rooted and unrooted trees The distance approach to phylogenies given: an n n matrix M where M ij is the distance between taxa

More information

Biology 644: Bioinformatics

Biology 644: Bioinformatics A stochastic (probabilistic) model that assumes the Markov property Markov property is satisfied when the conditional probability distribution of future states of the process (conditional on both past

More information

KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging

KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging Method KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging Zhang Zhang 1,2,3#, Jun Li 2#, Xiao-Qian Zhao 2,3, Jun Wang 1,2,4, Gane Ka-Shu Wong 2,4,5, and Jun Yu 1,2,4 * 1

More information

Counting phylogenetic invariants in some simple cases. Joseph Felsenstein. Department of Genetics SK-50. University of Washington

Counting phylogenetic invariants in some simple cases. Joseph Felsenstein. Department of Genetics SK-50. University of Washington Counting phylogenetic invariants in some simple cases Joseph Felsenstein Department of Genetics SK-50 University of Washington Seattle, Washington 98195 Running Headline: Counting Phylogenetic Invariants

More information

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally

More information

C.DARWIN ( )

C.DARWIN ( ) C.DARWIN (1809-1882) LAMARCK Each evolutionary lineage has evolved, transforming itself, from a ancestor appeared by spontaneous generation DARWIN All organisms are historically interconnected. Their relationships

More information

Crew of25 Men Start Monday On Showboat. Many Permanent Improvements To Be Made;Project Under WPA

Crew of25 Men Start Monday On Showboat. Many Permanent Improvements To Be Made;Project Under WPA U G G G U 2 93 YX Y q 25 3 < : z? 0 (? 8 0 G 936 x z x z? \ 9 7500 00? 5 q 938 27? 60 & 69? 937 q? G x? 937 69 58 } x? 88 G # x 8 > x G 0 G 0 x 8 x 0 U 93 6 ( 2 x : X 7 8 G G G q x U> x 0 > x < x G U 5

More information

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between

More information

that a w r o n g has been committed recognize where the responsibility

that a w r o n g has been committed recognize where the responsibility Z- XX q q J U Y ' G G, w, 142 16, U, j J ' B ' B B k - J, 5 6 5:30 7:00, $125;, -, 10:00 $300;, 4:00, 6:00, B C U 000 2:00, J $125; ' :, q C, k w G x q k w w w w q ' 60,, w q, w w k w w - G w z w w w C

More information

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis

More information

Modeling Noise in Genetic Sequences

Modeling Noise in Genetic Sequences Modeling Noise in Genetic Sequences M. Radavičius 1 and T. Rekašius 2 1 Institute of Mathematics and Informatics, Vilnius, Lithuania 2 Vilnius Gediminas Technical University, Vilnius, Lithuania 1. Introduction:

More information

Phylogenetic invariants versus classical phylogenetics

Phylogenetic invariants versus classical phylogenetics Phylogenetic invariants versus classical phylogenetics Marta Casanellas Rius (joint work with Jesús Fernández-Sánchez) Departament de Matemàtica Aplicada I Universitat Politècnica de Catalunya Algebraic

More information

Using algebraic geometry for phylogenetic reconstruction

Using algebraic geometry for phylogenetic reconstruction Using algebraic geometry for phylogenetic reconstruction Marta Casanellas i Rius (joint work with Jesús Fernández-Sánchez) Departament de Matemàtica Aplicada I Universitat Politècnica de Catalunya IMA

More information

Variances of the Average Numbers of Nucleotide Substitutions Within and Between Populations

Variances of the Average Numbers of Nucleotide Substitutions Within and Between Populations Variances of the Average Numbers of Nucleotide Substitutions Within and Between Populations Masatoshi Nei and Li Jin Center for Demographic and Population Genetics, Graduate School of Biomedical Sciences,

More information

7.36/7.91 recitation CB Lecture #4

7.36/7.91 recitation CB Lecture #4 7.36/7.91 recitation 2-19-2014 CB Lecture #4 1 Announcements / Reminders Homework: - PS#1 due Feb. 20th at noon. - Late policy: ½ credit if received within 24 hrs of due date, otherwise no credit - Answer

More information

Bioinformatics. Scoring Matrices. David Gilbert Bioinformatics Research Centre

Bioinformatics. Scoring Matrices. David Gilbert Bioinformatics Research Centre Bioinformatics Scoring Matrices David Gilbert Bioinformatics Research Centre www.brc.dcs.gla.ac.uk Department of Computing Science, University of Glasgow Learning Objectives To explain the requirement

More information

Phylogene)cs. IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, Joyce Nzioki

Phylogene)cs. IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, Joyce Nzioki Phylogene)cs IMBB 2016 BecA- ILRI Hub, Nairobi May 9 20, 2016 Joyce Nzioki Phylogenetics The study of evolutionary relatedness of organisms. Derived from two Greek words:» Phle/Phylon: Tribe/Race» Genetikos:

More information

Molecular Evolution and Phylogenetic Tree Reconstruction

Molecular Evolution and Phylogenetic Tree Reconstruction 1 4 Molecular Evolution and Phylogenetic Tree Reconstruction 3 2 5 1 4 2 3 5 Orthology, Paralogy, Inparalogs, Outparalogs Phylogenetic Trees Nodes: species Edges: time of independent evolution Edge length

More information

The wonderful world of RNA informatics

The wonderful world of RNA informatics December 9, 2012 Course Goals Familiarize you with the challenges involved in RNA informatics. Introduce commonly used tools, and provide an intuition for how they work. Give you the background and confidence

More information

Phylogenetic Assumptions

Phylogenetic Assumptions Substitution Models and the Phylogenetic Assumptions Vivek Jayaswal Lars S. Jermiin COMMONWEALTH OF AUSTRALIA Copyright htregulation WARNING This material has been reproduced and communicated to you by

More information

Review for Exam 2 Solutions

Review for Exam 2 Solutions Review for Exam 2 Solutions Note: All vector spaces are real vector spaces. Definition 4.4 will be provided on the exam as it appears in the textbook.. Determine if the following sets V together with operations

More information

Maximum Likelihood in Phylogenetics

Maximum Likelihood in Phylogenetics Maximum Likelihood in Phylogenetics 26 January 2011 Workshop on Molecular Evolution Český Krumlov, Česká republika Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut,

More information

Variance and Covariances of the Numbers of Synonymous and Nonsynonymous Substitutions per Site

Variance and Covariances of the Numbers of Synonymous and Nonsynonymous Substitutions per Site Variance and Covariances of the Numbers of Synonymous and Nonsynonymous Substitutions per Site Tatsuya Ota and Masatoshi Nei Institute of Molecular Evolutionary Genetics and Department of Biology, The

More information

Phylogenetics. Andreas Bernauer, March 28, Expected number of substitutions using matrix algebra 2

Phylogenetics. Andreas Bernauer, March 28, Expected number of substitutions using matrix algebra 2 Phylogenetics Andreas Bernauer, andreas@carrot.mcb.uconn.edu March 28, 2004 Contents 1 ts:tr rate ratio vs. ts:tr ratio 1 2 Expected number of substitutions using matrix algebra 2 3 Why the GTR model can

More information

arxiv:q-bio/ v1 [q-bio.pe] 27 May 2005

arxiv:q-bio/ v1 [q-bio.pe] 27 May 2005 Maximum Likelihood Jukes-Cantor Triplets: Analytic Solutions arxiv:q-bio/0505054v1 [q-bio.pe] 27 May 2005 Benny Chor Michael D. Hendy Sagi Snir December 21, 2017 Abstract Complex systems of polynomial

More information

Pithy P o i n t s Picked I ' p and Patljr Put By Our P e r i p a tetic Pencil Pusher VOLUME X X X X. Lee Hi^h School Here Friday Ni^ht

Pithy P o i n t s Picked I ' p and Patljr Put By Our P e r i p a tetic Pencil Pusher VOLUME X X X X. Lee Hi^h School Here Friday Ni^ht G G QQ K K Z z U K z q Z 22 x z - z 97 Z x z j K K 33 G - 72 92 33 3% 98 K 924 4 G G K 2 G x G K 2 z K j x x 2 G Z 22 j K K x q j - K 72 G 43-2 2 G G z G - -G G U q - z q - G x) z q 3 26 7 x Zz - G U-

More information

Initial amounts: mol Amounts at equilibrium: mol (5) Initial amounts: x mol Amounts at equilibrium: x mol

Initial amounts: mol Amounts at equilibrium: mol (5) Initial amounts: x mol Amounts at equilibrium: x mol 4. CHEMICAL EQUILIBRIUM n Equilibrium Constants 4.1. 2A Y + 2Z Initial amounts: 4 0 0 mol Amounts at equilibrium: 1 1.5 3.0 mol Concentrations at equilibrium: 1 5 1.5 5 3.0 5 mol dm 3 K c (1.5/5) (3.0/5)2

More information

Sara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject)

Sara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) Bioinformática Sequence Alignment Pairwise Sequence Alignment Universidade da Beira Interior (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) 1 16/3/29 & 23/3/29 27/4/29 Outline

More information

Molecular evolution 2. Please sit in row K or forward

Molecular evolution 2. Please sit in row K or forward Molecular evolution 2 Please sit in row K or forward RBFD: cat, mouse, parasite Toxoplamsa gondii cyst in a mouse brain http://phenomena.nationalgeographic.com/2013/04/26/mind-bending-parasite-permanently-quells-cat-fear-in-mice/

More information

How Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building

How Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building How Molecules Evolve Guest Lecture: Principles and Methods of Systematic Biology 11 November 2013 Chris Simon Approaching phylogenetics from the point of view of the data Understanding how sequences evolve

More information

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

More information

Week 5: Distance methods, DNA and protein models

Week 5: Distance methods, DNA and protein models Week 5: Distance methods, DNA and protein models Genome 570 February, 2016 Week 5: Distance methods, DNA and protein models p.1/69 A tree and the expected distances it predicts E A 0.08 0.05 0.06 0.03

More information

MULTIPLE SEQUENCE ALIGNMENT FOR CONSTRUCTION OF PHYLOGENETIC TREE

MULTIPLE SEQUENCE ALIGNMENT FOR CONSTRUCTION OF PHYLOGENETIC TREE MULTIPLE SEQUENCE ALIGNMENT FOR CONSTRUCTION OF PHYLOGENETIC TREE Manmeet Kaur 1, Navneet Kaur Bawa 2 1 M-tech research scholar (CSE Dept) ACET, Manawala,Asr 2 Associate Professor (CSE Dept) ACET, Manawala,Asr

More information

Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate.

Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate. OEB 242 Exam Practice Problems Answer Key Q1) Explain how background selection and genetic hitchhiking could explain the positive correlation between genetic diversity and recombination rate. First, recall

More information

Stochastic Errors vs. Modeling Errors in Distance Based Phylogenetic Reconstructions
