Taming the Beast Workshop
1 Workshop
David Rasmussen & Carsten Magnus
June 27
2 Outline
- Models of sequence evolution: rate matrices
- Markov chain model
- Variable rates among different sites: +Γ
- Implementation in BEAST2
4 Genotype (sequence level) and phenotype
- Genotype, sequence level: [Figure: two example RNA sequences]
- Phenotype, e.g. antigenic level: antibody binding to HIV
- Codon: three nucleotides encode one amino acid
- One nucleotide change can already change the phenotype
- Alphabet: 4 nucleotides (DNA: T, C, A, G; RNA: U, C, A, G), 20 amino acids
When comparing two nucleotide sequences, we have to keep in mind that they are the result of mutation during replication (genotypic level) and selection (phenotypic level).
5 Sequence alignment
- An alignment is a way of arranging sequences to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
- To find an alignment, use the concept of positional homology: nucleotides (or amino acids) show positional homology if they exist at equivalent positions in the respective sequences.
- Programs for alignment: MUSCLE and CLUSTAL, which can be called from e.g. AliView, MegAlign, ...
- A BEAST analysis starts with aligned sequences! File formats: .fas, .fasta, .nexus
6 Models for nucleotide substitutions
12 The fundamental problem
[Figure: an alignment of three taxa (taxon 1, taxon 2, taxon 3); successive builds of the slide annotate candidate histories for the observed differences: a single substitution, multiple substitutions, and a convergent substitution]
Problem of phylogenetics: We observe sequences but not their evolutionary history. Thus we have to take all possible evolutionary trajectories into account.
The sequence evolution model appears in the posterior: [Equation: the posterior written as a product of probability terms; the arguments were shown graphically on the slide]
14 A model for nucleotide substitutions
State space of each nucleotide position: S = {T, C, A, G}
Example: Assume the process is at state T. It moves to C at rate a, to A at rate b and to G at rate c, so it leaves T at total rate a+b+c.
Substitution rate matrix (rows and columns ordered T, C, A, G):
$$ Q = \begin{pmatrix} -(a+b+c) & a & b & c \\ d & -(d+e+f) & e & f \\ g & h & -(g+h+i) & i \\ j & k & l & -(j+k+l) \end{pmatrix} $$
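The rule that each diagonal entry is minus the sum of the other entries in its row can be sketched in a few lines of Python. This is a minimal illustration, not part of the workshop material: the state order T, C, A, G, the function name `rate_matrix` and the numeric values chosen for a..l are all assumptions made for the example.

```python
STATES = ("T", "C", "A", "G")  # state order assumed for this sketch


def rate_matrix(off_diagonal):
    """Build Q from {(from_state, to_state): rate}; each row is made to sum to zero."""
    n = len(STATES)
    q = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(STATES):
        for j, y in enumerate(STATES):
            if i != j:
                q[i][j] = float(off_diagonal[(x, y)])
        # The diagonal is still 0.0 here, so this is minus the total rate of leaving x.
        q[i][i] = -sum(q[i])
    return q


# Hypothetical values for the twelve rates a..l, assigned row by row.
labels = "abcdefghijkl"
values = {name: 0.1 * (k + 1) for k, name in enumerate(labels)}
pairs = [(x, y) for x in STATES for y in STATES if x != y]
Q = rate_matrix({pair: values[name] for pair, name in zip(pairs, labels)})

for state, row in zip(STATES, Q):
    print(state, [round(v, 2) for v in row])
```

Keeping the off-diagonal rates in a dictionary keyed by (from, to) pairs makes it hard to mix up rows and columns when the state order changes.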
15 Site models in BEAST2
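16 The easiest substitution model: JC69
- JC69: named after T. H. Jukes and C. R. Cantor: Evolution of protein molecules [Jukes and Cantor, 1969]
- All substitutions have the same rate, λ
Substitution rates (rows and columns ordered T, C, A, G; each diagonal entry, shown as a dot, is minus the sum of the other entries in its row):
$$ Q = \begin{pmatrix} \cdot & \lambda & \lambda & \lambda \\ \lambda & \cdot & \lambda & \lambda \\ \lambda & \lambda & \cdot & \lambda \\ \lambda & \lambda & \lambda & \cdot \end{pmatrix} $$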
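17 Accounting for transition/transversion: K80
- K80: named after M. Kimura: A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences [Kimura, 1980]
- Transitions happen at rate α, transversions at rate β
[Figure: the pyrimidines (one ring) T and C and the purines (two rings) A and G; transitions are changes within a group, transversions are changes between the groups]
Substitution rates (rows and columns ordered T, C, A, G; each diagonal entry, shown as a dot, is minus the sum of the other entries in its row):
$$ Q = \begin{pmatrix} \cdot & \alpha & \beta & \beta \\ \alpha & \cdot & \beta & \beta \\ \beta & \beta & \cdot & \alpha \\ \beta & \beta & \alpha & \cdot \end{pmatrix} $$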
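18 Accounting for transition/transversion: HKY
- HKY: named after [Hasegawa et al., 1984, Hasegawa et al., 1985]
- Accounts for transitions (rate α) and transversions (rate β)
- After a long period of evolution, the equilibrium frequencies π_T, π_C, π_A, π_G are reached
[Figure: the pyrimidines (one ring) T and C and the purines (two rings) A and G; transitions are changes within a group, transversions are changes between the groups]
Substitution rates (rows and columns ordered T, C, A, G; each diagonal entry, shown as a dot, is minus the sum of the other entries in its row):
$$ Q = \begin{pmatrix} \cdot & \alpha\pi_C & \beta\pi_A & \beta\pi_G \\ \alpha\pi_T & \cdot & \beta\pi_A & \beta\pi_G \\ \beta\pi_T & \beta\pi_C & \cdot & \alpha\pi_G \\ \beta\pi_T & \beta\pi_C & \alpha\pi_A & \cdot \end{pmatrix} $$
The off-diagonal entries factorise into a symmetric part and the equilibrium frequencies:
$$ \begin{pmatrix} \cdot & \alpha & \beta & \beta \\ \alpha & \cdot & \beta & \beta \\ \beta & \beta & \cdot & \alpha \\ \beta & \beta & \alpha & \cdot \end{pmatrix} \begin{pmatrix} \pi_T & 0 & 0 & 0 \\ 0 & \pi_C & 0 & 0 \\ 0 & 0 & \pi_A & 0 \\ 0 & 0 & 0 & \pi_G \end{pmatrix} $$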
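As a rough illustration of how the HKY rates are assembled, the sketch below multiplies α or β by the equilibrium frequency of the target nucleotide. The function names, the state order T, C, A, G and the example frequencies are assumptions for this example and are not taken from BEAST2.

```python
STATES = ("T", "C", "A", "G")  # state order assumed for this sketch
PYRIMIDINES = {"T", "C"}       # A and G are the purines


def is_transition(x, y):
    """True if x -> y stays within the pyrimidines or within the purines."""
    return (x in PYRIMIDINES) == (y in PYRIMIDINES)


def hky_rate_matrix(alpha, beta, pi):
    """HKY rates: alpha * pi[target] for transitions, beta * pi[target] for transversions."""
    q = []
    for x in STATES:
        row = [
            0.0 if x == y else (alpha if is_transition(x, y) else beta) * pi[y]
            for y in STATES
        ]
        row[STATES.index(x)] = -sum(row)  # diagonal makes the row sum to zero
        q.append(row)
    return q


# Hypothetical equilibrium frequencies and rates, chosen only for illustration.
example_pi = {"T": 0.1, "C": 0.2, "A": 0.3, "G": 0.4}
Q_hky = hky_rate_matrix(alpha=2.0, beta=0.5, pi=example_pi)
for state, row in zip(STATES, Q_hky):
    print(state, [round(v, 3) for v in row])
```

With all four frequencies equal, the same function gives K80 up to a constant factor, and with `alpha == beta` as well it gives JC69 up to a constant factor.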
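19 Accounting for transition/transversion: TN93
- TN93: named after [Tamura and Nei, 1993]
- Accounts for different transition rates between T and C (rate α_1) as well as between A and G (rate α_2)
- After a long period of evolution, the equilibrium frequencies π_T, π_C, π_A, π_G are reached
[Figure: the pyrimidines (one ring) T and C and the purines (two rings) A and G, with transition rates α_1 and α_2 and transversions between the groups]
Substitution rates (rows and columns ordered T, C, A, G; each diagonal entry, shown as a dot, is minus the sum of the other entries in its row):
$$ Q = \begin{pmatrix} \cdot & \alpha_1\pi_C & \beta\pi_A & \beta\pi_G \\ \alpha_1\pi_T & \cdot & \beta\pi_A & \beta\pi_G \\ \beta\pi_T & \beta\pi_C & \cdot & \alpha_2\pi_G \\ \beta\pi_T & \beta\pi_C & \alpha_2\pi_A & \cdot \end{pmatrix} $$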
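20 A more general substitution model: GTR
- GTR (REV): generalised time-reversible model
- Based on three papers: [Tavaré, 1986, Yang, 1994, Zharkikh, 1994]
Substitution rates (rows and columns ordered T, C, A, G; each diagonal entry, shown as a dot, is minus the sum of the other entries in its row):
$$ Q = \begin{pmatrix} \cdot & a\pi_C & b\pi_A & c\pi_G \\ a\pi_T & \cdot & d\pi_A & e\pi_G \\ b\pi_T & d\pi_C & \cdot & f\pi_G \\ c\pi_T & e\pi_C & f\pi_A & \cdot \end{pmatrix} $$
+ quite flexible
+ time-reversible
- not completely general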
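Time-reversibility can be checked numerically through detailed balance, π_x q_xy = π_y q_yx for every pair of states. The sketch below builds a GTR matrix from six symmetric exchange rates and tests that condition; the function names, the state order T, C, A, G and all numeric values are assumptions made for the example.

```python
from itertools import combinations

STATES = ("T", "C", "A", "G")  # state order assumed for this sketch


def gtr_rate_matrix(exchange, pi):
    """GTR rates: q[x][y] = exchange[{x, y}] * pi[y], with exchange symmetric in x and y."""
    q = []
    for x in STATES:
        row = [0.0 if x == y else exchange[frozenset((x, y))] * pi[y] for y in STATES]
        row[STATES.index(x)] = -sum(row)  # diagonal makes the row sum to zero
        q.append(row)
    return q


def is_time_reversible(q, pi, tol=1e-12):
    """Detailed balance: pi[x] * q[x][y] == pi[y] * q[y][x] for all pairs of states."""
    return all(
        abs(pi[x] * q[i][j] - pi[y] * q[j][i]) <= tol
        for (i, x), (j, y) in combinations(enumerate(STATES), 2)
    )


# Hypothetical values for a..f (pairs TC, TA, TG, CA, CG, AG) and for the frequencies.
example_pi = {"T": 0.1, "C": 0.2, "A": 0.3, "G": 0.4}
pairs = [frozenset(p) for p in combinations(STATES, 2)]
exchange = dict(zip(pairs, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))

Q_gtr = gtr_rate_matrix(exchange, example_pi)
print(is_time_reversible(Q_gtr, example_pi))
```

Changing a single off-diagonal entry of `Q_gtr` without changing its mirror entry breaks detailed balance for that frequency vector, which is the situation the unrestricted model allows.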
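21 The most general substitution model (implemented in BEAST2 but not in BEAUti)
- UNREST: unrestricted model, first described in [Yang, 1994]
- Each substitution has a (different) rate
Substitution rates (rows and columns ordered T, C, A, G; each diagonal entry, shown as a dot, is minus the sum of the other entries in its row):
$$ Q = \begin{pmatrix} \cdot & a & b & c \\ d & \cdot & e & f \\ g & h & \cdot & i \\ j & k & l & \cdot \end{pmatrix} $$
+ most general case
+ all other models are special cases of UNREST
- mathematically very complicated and not handy to use
- not time-reversible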
22 Substitution models in BEAUti
model    parameters  description
JC69     1           all substitutions have the same rate
K80      2           accounts for transitions and transversions, not in BEAUti
HKY      2+3         distinction between transitions and transversions, including equilibrium frequencies
TN93     3+3         different rates for transitions
GTR      6+3         general, but still time-reversible
UNREST   12          most general, not time-reversible, not in BEAUti
The equilibrium frequencies (the "+3" parameters) can be empirically estimated from the alignment or inferred alongside the substitution rates.
24 The fundamental problem - again
[Figure: an alignment of three taxa (taxon 1, taxon 2, taxon 3)]
Problem of phylogenetics: We observe sequences but not their evolutionary history. Thus we have to take all possible evolutionary trajectories into account.
So far we have determined rates of nucleotide substitutions. But we need probabilities.
27 Nucleotide substitutions as continuous-time Markov chains (CTMC)
Definition of a Markov chain (see also [Ross, 1996]):
- It is a stochastic process, i.e. a series of random experiments through time.
- It lives on a state space and jumps between the different states.
- Memorylessness: the probability of jumping to a state depends only on the current state.
[Figure: nucleotide substitutions as a CTMC; a single site changes state along a time axis, with jump probabilities p between consecutive states]
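The three properties above can be made concrete by simulating one site. The sketch below assumes the JC69 model (every change has rate λ, so any state is left at total rate 3λ) and draws exponentially distributed waiting times between jumps; the function name and the parameter values are illustrative only.

```python
import random

STATES = "TCAG"


def simulate_jc69_site(start, lam, total_time, rng):
    """Return [(time, state), ...] for one site evolving under JC69 up to total_time."""
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        # Waiting time to the next jump: exponential with the total rate of leaving a state.
        t += rng.expovariate(3.0 * lam)
        if t > total_time:
            return path
        # Memorylessness: the next state depends only on the current state.
        state = rng.choice([s for s in STATES if s != state])
        path.append((t, state))


rng = random.Random(1)  # fixed seed so the run is repeatable
for time, state in simulate_jc69_site("A", lam=0.5, total_time=5.0, rng=rng):
    print(f"{time:6.3f}  {state}")
```

Nothing in the loop looks at earlier entries of `path`, which is exactly the memorylessness property.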
28 Why CTMCs are a great model for nucleotide substitutions
- Memorylessness: a nucleotide substitution happens independently of the substitution history at this site.
- The substitution rate matrix defines the transition probabilities.
- Applying linear algebra, we can calculate the transition probability matrix according to
$$ P(t) = e^{Qt} = U \, \mathrm{diag}\left(e^{\epsilon_1 t}, e^{\epsilon_2 t}, e^{\epsilon_3 t}, e^{\epsilon_4 t}\right) U^{-1} $$
  where ε_1, ..., ε_4 are the eigenvalues of Q and the columns of U are the corresponding eigenvectors.
- The transition probabilities take into account every possible substitution path (Chapman-Kolmogorov theorem).
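A small numerical check of P(t) = e^{Qt} and of the Chapman-Kolmogorov property P(s+t) = P(s)P(t) is sketched below. To stay within the Python standard library it does not use the eigendecomposition from the slide; it swaps in a truncated Taylor series with scaling and squaring. All function names and the JC69 example rate are assumptions for the illustration.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_exp(m, squarings=10, terms=20):
    """exp(m): Taylor series of m / 2**squarings, then `squarings` repeated squarings."""
    n = len(m)
    scaled = [[v / 2 ** squarings for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity matrix
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]  # scaled**k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def transition_probabilities(q, t):
    """P(t) = exp(Q t) for a rate matrix q."""
    return mat_exp([[v * t for v in row] for row in q])


lam = 0.1  # hypothetical JC69 rate
Q_jc = [[-3 * lam if i == j else lam for j in range(4)] for i in range(4)]
P_1, P_2, P_3 = (transition_probabilities(Q_jc, t) for t in (1.0, 2.0, 3.0))
P_1_then_2 = mat_mul(P_1, P_2)
print(max(abs(P_1_then_2[i][j] - P_3[i][j]) for i in range(4) for j in range(4)))
```

The printed value is the largest difference between P(1)P(2) and P(3); it should be at the level of floating-point rounding.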
30 Example of transition probabilities: JC69
Substitution rates:
$$ Q = \begin{pmatrix} -3\lambda & \lambda & \lambda & \lambda \\ \lambda & -3\lambda & \lambda & \lambda \\ \lambda & \lambda & -3\lambda & \lambda \\ \lambda & \lambda & \lambda & -3\lambda \end{pmatrix} $$
Transition probability matrix, P(t) = e^{Qt}:
$$ P(t) = \begin{pmatrix} p_0(t) & p_1(t) & p_1(t) & p_1(t) \\ p_1(t) & p_0(t) & p_1(t) & p_1(t) \\ p_1(t) & p_1(t) & p_0(t) & p_1(t) \\ p_1(t) & p_1(t) & p_1(t) & p_0(t) \end{pmatrix} $$
with
$$ p_0(t) = \tfrac{1}{4} + \tfrac{3}{4} e^{-4\lambda t} \quad \text{and} \quad p_1(t) = \tfrac{1}{4} - \tfrac{1}{4} e^{-4\lambda t} $$
[Figure: transition probabilities p_0(t) and p_1(t) plotted against time in days, for a rate λ given in substitutions per site per day]
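The closed-form JC69 probabilities are easy to evaluate directly. The sketch below uses the standard JC69 solution, p_0(t) = 1/4 + 3/4 e^{-4λt} and p_1(t) = 1/4 - 1/4 e^{-4λt}; the function name and the example rate are assumptions for the illustration.

```python
import math


def jc69_probabilities(lam, t):
    """Return (p0, p1): probability of the same nucleotide, and of one specific other one."""
    decay = math.exp(-4.0 * lam * t)
    return 0.25 + 0.75 * decay, 0.25 - 0.25 * decay


lam = 0.01  # hypothetical rate, in substitutions per site per unit of time
for t in (0.0, 10.0, 100.0, 1000.0):
    p0, p1 = jc69_probabilities(lam, t)
    # Each row of P(t) holds one p0 and three p1, so p0 + 3 * p1 is always 1.
    print(f"t={t:7.1f}  p0={p0:.4f}  p1={p1:.4f}  row sum={p0 + 3 * p1:.4f}")
```

At t = 0 the site is certain to be unchanged, and for large t both probabilities approach 1/4.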
31 JC69: Stationary distribution
Suppose we have a sequence that evolves with rate λ = 2.2/3 × 10^{-9} substitutions per site per year. We follow the evolution of 4 different sites, with T at site 1, C at site 2, A at site 3 and G at site 4 at time point 0. How likely is it that, after time t has passed, there is a T, C, A or G at the four different positions? To answer this question, we follow the time evolution of the transition probability matrix P(t).
[Figure: entries of P(t) plotted against time in years]
- When t → ∞, the stationary distribution is reached.
- Any long sequence (e.g. ...) at time 0 will be composed of equal amounts of T, C, A, G after time t → ∞.
32 JC69: Time transformation
The times we look at, e.g. in species evolution, are very often very large. Thus, instead of real time, we display an evolutionary time scale in terms of sequence distances. As a substitution happens at rate 3λ in JC69 (keep in mind that in other models the expected time to a substitution is different!), we expect one substitution to happen after time 1/(3λ). This is due to exponentially distributed waiting times for an event happening at a certain rate. In our example, this means that we expect one substitution after 1/(3λ) = 1/(2.2 × 10^{-9}) ≈ 4.5 × 10^8 years.
[Figure: entries of P(t) plotted against time in years, with the expected time to 1 substitution marked]
In JC69, time t and distance d are related by
$$ t = \frac{d}{3\lambda} \quad \Longleftrightarrow \quad d = 3\lambda t $$
Trick from physics, compare units: [t] = years and [d/(3λ)] = (# substitutions) / (# substitutions/year) = years.
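The conversion between calendar time and distance in JC69 is a single multiplication, d = 3λt. The sketch below applies it to the example rate λ = 2.2/3 × 10^{-9} per site per year; the function names are made up for the illustration.

```python
LAMBDA = 2.2 / 3 * 1e-9  # substitutions per site per year, taken from the example


def distance_from_time(t_years, lam=LAMBDA):
    """Expected substitutions per site after t_years under JC69: d = 3 * lam * t."""
    return 3.0 * lam * t_years


def time_from_distance(d, lam=LAMBDA):
    """Years needed to accumulate an expected distance d under JC69: t = d / (3 * lam)."""
    return d / (3.0 * lam)


years_to_one_substitution = time_from_distance(1.0)
print(f"{years_to_one_substitution:.3e} years to one expected substitution per site")
print(distance_from_time(1e8), "expected substitutions per site after 1e8 years")
```

The factor 3λ is specific to JC69; for other models the total rate of leaving a state, and therefore the conversion factor, is different.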
35 Variable rates
So far: all sites in the sequence evolve at the same rate.
But:
- substitution rates might differ over the genome
- mutation rates might differ over sites
- selective pressure might be different on the phenotypic level
We extend the existing models by replacing the constant rates with Γ-distributed random variables (notation: JC69+Γ, HKY+Γ, ...).
37 Example: JC69+Γ
λ → λR: we replace the substitution rate λ by λR, where R is a Γ-distributed random variable with shape parameter α and mean 1.
[Figure: density g(r) plotted against r for several shape parameters, including α = 0.2, α = 1 and α = 2]
In BEAUti: change the Gamma Category Count to allow for rate variation. 4 to 6 categories normally work well.
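To get a feeling for what a Γ-distributed rate multiplier with mean 1 looks like, the sketch below draws values of R with shape α and scale 1/α and summarises them in a few equal-probability categories. This is a Monte Carlo stand-in for illustration only; it does not reproduce how BEAST2 discretises the Gamma distribution, and all names and values are assumptions.

```python
import random


def sample_site_rates(alpha, n_sites, seed=0):
    """Draw rate multipliers R ~ Gamma(shape=alpha, scale=1/alpha), so that E[R] = 1."""
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


def category_rates(alpha, n_categories=4, n_samples=100_000, seed=0):
    """Mean rate within each of n_categories equal-probability bins, estimated by sampling."""
    draws = sorted(sample_site_rates(alpha, n_samples, seed))
    size = n_samples // n_categories
    return [sum(draws[k * size:(k + 1) * size]) / size for k in range(n_categories)]


for alpha in (0.2, 1.0, 2.0):
    rates = category_rates(alpha)
    print(alpha, [round(r, 3) for r in rates])
```

Small α gives most sites a rate close to zero and a few sites a very high rate; large α pulls all categories towards 1, which approaches the equal-rates model.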
39 The codon sun
A codon consists of three nucleotides, translating to one of the 20 amino acids.
[Figure: the codon sun [Sanger, 2015]]
Amino acid                    Three-letter abbreviation  One-letter symbol  Molecular weight
Alanine                       Ala                        A                  89 Da
Arginine                      Arg                        R                  174 Da
Asparagine                    Asn                        N                  132 Da
Aspartic acid                 Asp                        D                  133 Da
Asparagine or aspartic acid   Asx                        B                  133 Da
Cysteine                      Cys                        C                  121 Da
Glutamine                     Gln                        Q                  146 Da
Glutamic acid                 Glu                        E                  147 Da
Glutamine or glutamic acid    Glx                        Z                  147 Da
Glycine                       Gly                        G                  75 Da
Histidine                     His                        H                  155 Da
Isoleucine                    Ile                        I                  131 Da
Leucine                       Leu                        L                  131 Da
Lysine                        Lys                        K                  146 Da
Methionine                    Met                        M                  149 Da
Phenylalanine                 Phe                        F                  165 Da
Proline                       Pro                        P                  115 Da
Serine                        Ser                        S                  105 Da
Threonine                     Thr                        T                  119 Da
Tryptophan                    Trp                        W                  204 Da
Tyrosine                      Tyr                        Y                  181 Da
Valine                        Val                        V                  117 Da
Table from [Promega, 2015].
40 Example: Codon CTA
Overview of substitution rates to the same codon; the thickness of the arrows represents different rates.
[Figure: substitution rates to the codon CTA (Leu) from its nine single-nucleotide neighbours: ATA (Ile), GTA (Val), TTA (Leu), CGA (Arg), CAA (Gln), CCA (Pro), CTT (Leu), CTC (Leu), CTG (Leu); adapted from [Yang, 2014]]
- Synonymous substitutions: do not change the amino acid
- Nonsynonymous substitutions: do change the amino acid
- Bigger arrows: transition
- Smaller arrows: transversion
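The two distinctions in the list above (synonymous versus nonsynonymous, transition versus transversion) can be computed from the standard genetic code. The sketch below is an illustration under that assumption; the helper names `classify` and `neighbours` are made up for the example, and the codon table is the standard code written in T, C, A, G order.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated with T, C, A, G at each position; "*" marks stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PYRIMIDINES = {"T", "C"}


def classify(source, target):
    """Classify a single-nucleotide change between two codons."""
    diffs = [(x, y) for x, y in zip(source, target) if x != y]
    if len(diffs) != 1:
        raise ValueError("codons must differ at exactly one position")
    x, y = diffs[0]
    same_amino_acid = CODON_TABLE[source] == CODON_TABLE[target]
    effect = "synonymous" if same_amino_acid else "nonsynonymous"
    kind = "transition" if (x in PYRIMIDINES) == (y in PYRIMIDINES) else "transversion"
    return effect, kind


def neighbours(codon):
    """All nine codons that differ from `codon` at exactly one position."""
    return [codon[:p] + b + codon[p + 1:] for p in range(3) for b in BASES if b != codon[p]]


for source in neighbours("CTA"):
    print(source, CODON_TABLE[source], "-> CTA", CODON_TABLE["CTA"], classify(source, "CTA"))
```

For CTA this lists four synonymous neighbours (all Leu) and five nonsynonymous ones, which is why a codon model can assign very different rates to changes that look alike at the nucleotide level.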
42 Varying substitution rates amongst the codon positions
[Bofkin and Goldman, 2007] have shown that in protein-coding regions
- second codon positions evolve more slowly than first codon positions
- third codon positions evolve faster than first codon positions
Different codon positions can have different evolutionary rates. BEAST2 allows for estimating these rates separately.
File: BEAST2.4.x/examples/nexus/primate-mtDNA.nex
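Estimating separate rates per codon position requires splitting the alignment into three partitions first. In BEAUti this is done through the interface; the sketch below only illustrates the underlying idea with Python slicing, assuming an in-frame coding alignment. The function name and the two toy sequences are hypothetical.

```python
def split_codon_positions(alignment):
    """Split an in-frame coding alignment {name: sequence} into three position-wise alignments."""
    return [
        {name: seq[offset::3] for name, seq in alignment.items()}
        for offset in range(3)
    ]


# Toy alignment of two codons per taxon (hypothetical data).
toy_alignment = {"taxon1": "ATGCTA", "taxon2": "ATGTTA"}
first, second, third = split_codon_positions(toy_alignment)
print("1st positions:", first)
print("2nd positions:", second)
print("3rd positions:", third)
```

Each of the three resulting alignments can then be given its own site model or relative rate.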
43 Including the choice of substitution rate model in your BEAST analysis
46 Rate models in BEAST2
BEAST2 allows for including different site models in your analysis ("Site Model" tab in BEAUti).
Which site model is the best for your data?
- Package bModelTest: Bayesian site model selection for nucleotide data
- Package SubstBMA: modelling across-site variation in the nucleotide
47 References
- Bofkin, L. and Goldman, N. (2007). Variation in Evolutionary Processes at Different Codon Positions. Molecular Biology and Evolution, 24(2).
- Hasegawa, M., Kishino, H., and Yano, T. (1985). Dating of the Human-Ape Splitting by a Molecular Clock of Mitochondrial DNA. Journal of Molecular Evolution, 22(2).
- Hasegawa, M., Yano, T., and Kishino, H. (1984). A New Molecular Clock of Mitochondrial DNA and the Evolution of Hominoids. Proceedings of the Japan Academy Series B-Physical and Biological Sciences, 60(4).
- Jukes, T. and Cantor, C. (1969). Evolution of protein molecules. Mammalian Protein Metabolism.
- Kimura, M. (1980). A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. Journal of Molecular Evolution, 16(2).
- Promega (2015). The amino acids: /media/files/resources/technical references/amino acid abbreviations and molecular weights.pdf.
- Ross, S. M. (1996). Stochastic Processes. Second edition. Wiley.
- Sanger (2015). The codon sun: ftp://ftp.sanger.ac.uk/pub/yourgenome/downloads/activities/kras-cancer-mutation/krascodonwheel.pdf.
- Tamura, K. and Nei, M. (1993). Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and Evolution, 10(3).
- Tavaré, S. (1986). Some probabilistic and statistical problems in the analysis of DNA sequences. In Some mathematical questions in biology: DNA sequence analysis (New York, 1984). Amer. Math. Soc., Providence, RI.
- Yang, Z. (1994). Estimating the pattern of nucleotide substitution. Journal of Molecular Evolution, 39(1).
- Yang, Z. (2014). Molecular Evolution: A Statistical Approach. Oxford University Press.
- Zharkikh, A. (1994). Estimation of evolutionary distances between nucleotide sequences. Journal of Molecular Evolution, 39(3).
More informationPhylogenetic Inference using RevBayes
Phylogenetic Inference using RevBayes Model section using Bayes factors Sebastian Höhna 1 Overview This tutorial demonstrates some general principles of Bayesian model comparison, which is based on estimating
More informationSequence comparison: Score matrices
Sequence comparison: Score matrices http://facultywashingtonedu/jht/gs559_2013/ Genome 559: Introduction to Statistical and omputational Genomics Prof James H Thomas FYI - informal inductive proof of best
More informationLecture 14 - Cells. Astronomy Winter Lecture 14 Cells: The Building Blocks of Life
Lecture 14 Cells: The Building Blocks of Life Astronomy 141 Winter 2012 This lecture describes Cells, the basic structural units of all life on Earth. Basic components of cells: carbohydrates, lipids,
More informationBiochemistry Quiz Review 1I. 1. Of the 20 standard amino acids, only is not optically active. The reason is that its side chain.
Biochemistry Quiz Review 1I A general note: Short answer questions are just that, short. Writing a paragraph filled with every term you can remember from class won t improve your answer just answer clearly,
More informationOrganic Chemistry Option II: Chemical Biology
Organic Chemistry Option II: Chemical Biology Recommended books: Dr Stuart Conway Department of Chemistry, Chemistry Research Laboratory, University of Oxford email: stuart.conway@chem.ox.ac.uk Teaching
More informationMaximum Likelihood in Phylogenetics
Maximum Likelihood in Phylogenetics June 1, 2009 Smithsonian Workshop on Molecular Evolution Paul O. Lewis Department of Ecology & Evolutionary Biology University of Connecticut, Storrs, CT Copyright 2009
More information8 Grundlagen der Bioinformatik, SoSe 11, D. Huson, April 18, 2011
8 Grundlagen der Bioinformatik, SoSe 11, D. Huson, April 18, 2011 2 Pairwise alignment We will discuss: 1. Strings 2. Dot matrix method for comparing sequences 3. Edit distance and alignment 4. The number
More informationA phylogenetic view on RNA structure evolution
3 2 9 4 7 3 24 23 22 8 phylogenetic view on RN structure evolution 9 26 6 52 7 5 6 37 57 45 5 84 63 86 77 65 3 74 7 79 8 33 9 97 96 89 47 87 62 32 34 42 73 43 44 4 76 58 75 78 93 39 54 82 99 28 95 52 46
More informationUnderstanding relationship between homologous sequences
Molecular Evolution Molecular Evolution How and when were genes and proteins created? How old is a gene? How can we calculate the age of a gene? How did the gene evolve to the present form? What selective
More informationMolecular Selective Binding of Basic Amino Acids by a Water-soluble Pillar[5]arene
Electronic supplementary information Molecular Selective Binding of Basic Amino Acids y a Water-solule Pillar[5]arene Chunju Li,* a, Junwei Ma, a Liu Zhao, a Yanyan Zhang, c Yihua Yu, c Xiaoyan Shu, a
More informationMolecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences
Molecular Evolution & Phylogenetics Traits, phylogenies, evolutionary models and divergence time between sequences Basic Bioinformatics Workshop, ILRI Addis Ababa, 12 December 2017 1 Learning Objectives
More informationMODELING EVOLUTION AT THE PROTEIN LEVEL USING AN ADJUSTABLE AMINO ACID FITNESS MODEL
MODELING EVOLUTION AT THE PROTEIN LEVEL USING AN ADJUSTABLE AMINO ACID FITNESS MODEL MATTHEW W. DIMMIC*, DAVID P. MINDELL RICHARD A. GOLDSTEIN* * Biophysics Research Division Department of Biology and
More informationLS1a Midterm Exam 1 Review Session Problems
LS1a Midterm Exam 1 Review Session Problems 1. n aqueous mixture of a weak acid and its conjugate base is often used in the laboratory to prepare solutions referred to as buffers. ne commonly used acid
More informationThe Phylo- HMM approach to problems in comparative genomics, with examples.
The Phylo- HMM approach to problems in comparative genomics, with examples. Keith Bettinger Introduction The theory of evolution explains the diversity of organisms on Earth by positing that earlier species
More informationSolutions In each case, the chirality center has the R configuration
CAPTER 25 669 Solutions 25.1. In each case, the chirality center has the R configuration. C C 2 2 C 3 C(C 3 ) 2 D-Alanine D-Valine 25.2. 2 2 S 2 d) 2 25.3. Pro,, Trp, Tyr, and is, Trp, Tyr, and is Arg,
More informationCHEMISTRY ATAR COURSE DATA BOOKLET
CHEMISTRY ATAR COURSE DATA BOOKLET 2018 2018/2457 Chemistry ATAR Course Data Booklet 2018 Table of contents Periodic table of the elements...3 Formulae...4 Units...4 Constants...4 Solubility rules for
More informationProtein Structure Bioinformatics Introduction
1 Swiss Institute of Bioinformatics Protein Structure Bioinformatics Introduction Basel, 27. September 2004 Torsten Schwede Biozentrum - Universität Basel Swiss Institute of Bioinformatics Klingelbergstr
More informationAdvanced Topics in RNA and DNA. DNA Microarrays Aptamers
Quiz 1 Advanced Topics in RNA and DNA DNA Microarrays Aptamers 2 Quantifying mrna levels to asses protein expression 3 The DNA Microarray Experiment 4 Application of DNA Microarrays 5 Some applications
More informationBasic Principles of Protein Structures
Basic Principles of Protein Structures Proteins Proteins: The Molecule of Life Proteins: Building Blocks Proteins: Secondary Structures Proteins: Tertiary and Quartenary Structure Proteins: Geometry Proteins
More informationUsing phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression)
Using phylogenetics to estimate species divergence times... More accurately... Basics and basic issues for Bayesian inference of divergence times (plus some digression) "A comparison of the structures
More information7.05 Spring 2004 February 27, Recitation #2
Recitation #2 Contact Information TA: Victor Sai Recitation: Friday, 3-4pm, 2-132 E-mail: sai@mit.edu ffice ours: Friday, 4-5pm, 2-132 Unit 1 Schedule Recitation/Exam Date Lectures covered Recitation #2
More informationProtein Secondary Structure Prediction
part of Bioinformatik von RNA- und Proteinstrukturen Computational EvoDevo University Leipzig Leipzig, SS 2011 the goal is the prediction of the secondary structure conformation which is local each amino
More informationLetter to the Editor. Department of Biology, Arizona State University
Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona
More informationModeling Noise in Genetic Sequences
Modeling Noise in Genetic Sequences M. Radavičius 1 and T. Rekašius 2 1 Institute of Mathematics and Informatics, Vilnius, Lithuania 2 Vilnius Gediminas Technical University, Vilnius, Lithuania 1. Introduction:
More informationModels of Molecular Evolution
Models of Molecular Evolution Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison September 15, 2007 Genetics 875 (Fall 2009) Molecular Evolution September 15, 2009 1 /
More informationMolecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço
Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço jcarrico@fm.ul.pt Charles Darwin (1809-1882) Charles Darwin s tree of life in Notebook B, 1837-1838 Ernst Haeckel (1934-1919)
More informationCollision Cross Section: Ideal elastic hard sphere collision:
Collision Cross Section: Ideal elastic hard sphere collision: ( r r 1 ) Where is the collision cross-section r 1 r ) ( 1 Where is the collision distance r 1 r These equations negate potential interactions
More informationProtein Struktur. Biologen und Chemiker dürfen mit Handys spielen (leise) go home, go to sleep. wake up at slide 39
Protein Struktur Biologen und Chemiker dürfen mit Handys spielen (leise) go home, go to sleep wake up at slide 39 Andrew Torda, Wintersemester 2016/ 2017 Andrew Torda 17.10.2016 [ 1 ] Proteins - who cares?
More informationSection Week 3. Junaid Malek, M.D.
Section Week 3 Junaid Malek, M.D. Biological Polymers DA 4 monomers (building blocks), limited structure (double-helix) RA 4 monomers, greater flexibility, multiple structures Proteins 20 Amino Acids,
More informationAdditive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.
Additive distances Let T be a tree on leaf set S and let w : E R + be an edge-weighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then
More informationPractice Midterm Exam 200 points total 75 minutes Multiple Choice (3 pts each 30 pts total) Mark your answers in the space to the left:
MITES ame Practice Midterm Exam 200 points total 75 minutes Multiple hoice (3 pts each 30 pts total) Mark your answers in the space to the left: 1. Amphipathic molecules have regions that are: a) polar
More informationGenetic distances and nucleotide substitution models
4 Genetic distances and nucleotide substitution models THEORY Korbinian Strimmer and Arndt von Haeseler 4.1 Introduction One of the first steps in the analysis of aligned nucleotide or amino acid sequences
More informationProtein Fragment Search Program ver Overview: Contents:
Protein Fragment Search Program ver 1.1.1 Developed by: BioPhysics Laboratory, Faculty of Life and Environmental Science, Shimane University 1060 Nishikawatsu-cho, Matsue-shi, Shimane, 690-8504, Japan
More informationPart 2: Chemical Evolution
Part 2: Chemical Evolution The figure is a cartoon representation of the protein Myosin. Myosin and actin are the two proteins responsible for muscle contraction. www.lifesorigin.com 87 Chapter 5:Information
More informationA Plausible Model Correlates Prebiotic Peptide Synthesis with. Primordial Genetic Code
Electronic Supplementary Material (ESI) for ChemComm. This journal is The Royal Society of Chemistry 2018 A Plausible Model Correlates Prebiotic Peptide Synthesis with Primordial Genetic Code Jianxi Ying,
More informationCounting labeled transitions in continuous-time Markov models of evolution
Journal of Mathematical Biology manuscript No. (will be inserted by the editor) Counting labeled transitions in continuous-time Markov models of evolution Vladimir N. Minin Marc A. Suchard Received: date
More informationA study of matrix energy during peptide formation through chemical graphs
International Journal of Mathematics and Soft Computing Vol.3, No.2 (2013), 11-15. ISSN Print : 2249-3328 ISSN Online: 2319-5215 A study of matrix energy during peptide formation through chemical graphs
More information