A phylogenetic view on RNA structure evolution
|
|
- Annabel Neal
- 5 years ago
- Views:
Transcription
1 phylogenetic view on RN structure evolution Tanja esell, Bioinformatics Institute, Heinrich-Heine University Düsseldorf, ermany - February 06, Bled, Slovenia,
2 Modeling sequence evolution Sequence T T T d Sequence 2 T T
3 Modeling sequence evolution Sequence T T T d Sequence 2 T T Hamming distance H=3
4 Modeling sequence evolution Sequence T T Sequence 2 T T each sequence site does not evolve independently of the others
5 Example for site-specific interactions 2D-structure 3D-structure of RNs, e.g. trn, mrn... of proteins p codon positions
6 SIMULTION ESTIMTION Seq: UUUUUUUUU Seq2: UUUUUUUU Seq3: UUUUUU Seq: UUUUUUUUU Seq2: UUUUUUUU Seq3: UUUUUU
7 Model-based approaches T stationary and time homogeneous Markov model the probability that sequence x evolves to sequence y P xy (t) = exp(qt)
8 4x4 instantaneous rate matrix enerally Reversible (REV) 0 Q = (aπ + bπ + cπ T ) aπ bπ cπ T aπ (aπ + dπ + eπ T ) dπ eπ T bπ dπ (bπ + dπ + f π T ) f π T cπ eπ f π (cπ + eπ + f π ) T T T J69-Modell K-Modell HKY-Modell α α α β α β βπ απ βπ T α α α β β α βπ βπ απ T α α α α β β απ βπ βπ T T α α α β α β βπ απ βπ TN93-Modell F8-Modell TR-Modell βπ α π βπ T π π π T aπ bπ cπ T βπ βπ α 2 π T π π π T aπ dπ eπ T α π βπ βπ T π π π T bπ dπ f π T T βπ α 2 π βπ π π π cπ eπ f π
9 ompensatory mutation U U U U U Nucleotides in stem regions evolve in strong correlation with their pairing counterpart.
10 ompensatory mutation Schöniger and von Haeseler, 994 U U U U U U UU π π π U π π π U π π π U - π π π U - - π π π U - - π π π U - U π π π π U π U π UU π π π π U π π U π - - π π π U - π π U π - π π π U - - π π U - U π U π π π π U π UU π π π π π U π U π π - - π π π U - π U π π - π π π U - - π U - U π U π U π π π π UU U π π π π U π U π UU U - π π π - - π U π U π UU U - - π π π - π U π U π UU UU π U π U π U π U π U π U
11 How to simulate more complex interactions among nucleotide and other character based sequences? model, that represents a universal description of arbitrary complex dependencies among sites.
12 Neighbourhood system k =,, l sites in a (nucleotide) sequence x = (x,..., x l ) Sequence T T Sequence 2 T T Neighbourhood system N = (N k ) k=,2,,l :. N k {,..., l}, k / N k for each k 2. If i N k then k N i for each i, k. n k denotes the cardinality of N k.
13 Example: N (Pseudoknot) N = {} N 2 = {6} N 3 = {5} N 4 = {4} N 5 = {} N 6 = {} N 7 = {} N 8 = {} N 9 = {27} N = {26} N = {25} N 2 = {24} N 3 = {23} N 4 = {4} N 5 = {3} N 6 = {2} N 7 = {} N 8 = {} N 9 = {} N = {} N 2 = {} N 22 = {} N 23 = {3} N 24 = {2} N 25 = {} N 26 = {} N 27 = {9}
14 Example: N (Stem including stacking) N 8 = {9}, N 9 = {8, 27,, 26} N = {9, 27, 26, 25, } N = {, 26, 25, 24, 2} N 2 = {, 25, 24, 23, 3} N 3 = {2, 23, 24, 4} N 4 = {3} N 22 = {23} N 23 = {22, 3, 2, 24} N 24 = {23, 3, 2,, 25} N 25 = {24,,, 2, 26} N 26 = {25, 9,,, 27} N 27 = {26,, 9,, 28} N 28 = {27}...
15 Example: Ribozyme domain
16 Basic idea: Different substitution matrix for each site x l nk Q k x x x x 64 4 x 4 q(x) =... q() q(53) q(223) q() q(l) Only one mutation is allowed at the current site
17 U U U U U U UU π π π U π π π U π π π U - π π π U - - π π π U - - π π π U - U π π π π U π U π UU π π π π U π π U π - - π π π U - π π U π - π π π U - - π π U - U π U π π π π U π UU π π π π π U π U π π - - π π π U - π U π π - π π π U - - π U - U π U π U π π π π UU U π π π π U π U π UU U - π π π - - π U π U π UU U - - π π π - π U π U π UU UU π U π U π U π U π U π U
18 U U U U U U UU π π π U π π π U π π π U - U π U π U π UU π π π U π π π U π π π U - U π U π U π UU π π π U π π π U π π π U - U π U π U π UU U π π π U - π π π U - - π π π UU π U π U π U - - -
19 (k, i) U U U U U U UU π π π U π π π U π π π U U π π π π π π U π π π U π π π U U π π π π π π U π π π U π π π U U π π π U π U π U π UU U π U π U π UU U π U π U π UU UU π U π U π U
20 (k, i) U U U U U U UU π π π U π π π U π π π U U π π π π π π U π π π U π π π U U π π π π π π U π π π U π π π U U π π π U π U π U π UU U π U π U π UU U π U π U π UU UU π U π U π U
21 U π π π U π π π U π π π U U π π π U π π π U π π π U π π π U U π π π U π π π U π π π U π π π U U π π π U U U U U U π U π U π UU U π U π U π UU U π U π U π UU U U π U π U π U
22 y,..., y nk y,..., y nk y,..., y nk U y,..., y nk y,..., y nk π y,..., y nk π y,..., y nk π U y,..., y nk y,..., y nk π y,..., y nk π y,..., y nk π U y,..., y nk y,..., y nk π y,..., y nk π y,..., y nk π U y,..., y nk U y,..., y nk π y,..., y nk π y,..., y nk π y,..., y nk
23 Q = {Q k k =,..., l} π k (y) if H(s k, y) = and x k y 0 Q k (s k, z) if H(s k, y) = 0 Q k (s k, y) = z n k + z s k 0 otherwise with s k = (x k, x i,..., x ink ) n k +, where {i,..., i nk } = N k y = (y 0, y... y nk ) n k + Normalisation: d k = z n k + π k (z) Q k (z, z) =.
24 The total instantaneous substitution rate for x: q(x) = l Q k (s k, s k ) k= Relative mutability at site k: Probability to replace x k by y 0 : P(k) = Q k(s k, s k ) q(x) P(x k y 0 ) = Q k (s k, y) Q k (s k, s k )
25 N changes N27 N N23 N24 N25 N N9 N N2 N N N N8 0.0 T T2 T3 T4 T5 T6 T7 T8 T9 T T T2 T3 T4 T5 SISSI: SImulating Sequence Evolution with Site-Specific Interactions (esell and von Haeseler, Bioinformatics in press, Epub. 05 Dec. 6) + + substitution model... }{{} 5 T T3 T2 T4 T5 T6 T7 T8 T9 T T T2 T3 UUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUUUUU UUUUUUUUUUUUUUUUUUUUUUU
26 Example: Bacillus subtilis RNase P database (Brown, 999) U U U U UU U U U U U U U U U U U U U U U U U P UU U P5 P5 U U U U U U U U U U U U U U P5. U UUUU U U U U U U U U UUUU U U U P3 UUU U U P4 U UU U U U U U U U U P2 U U U UUUUUU 5 P U 3 UU UU UU U P. P9 P8 P P7 P5. P2 P8 P9 Sequence M375, image created by Brown
27 Example: ounted frequencies from a RNase P sequence of Bacillus subtilis taken from the RNase P database: f nk = f nk =0 second nucleotide in doublet first nucleotide U U
28 Relationship between number of substitutions per site d and number of observed differences per site h: h: number of differences per site Fel8 Doublet d: number of substitutions per site
29 pilot study of SISSI Does phylogeny matter? phylogenetic view on some existing structure prediction methods.
30 Influence of the tree topology T3 T T5 0.0 T T T32 T2 T0 T T2 T34 T38 T28 T92 T6 T75 T T84 T67 T T T69 T76 T T59 T99 T96 T62 T2 T22 T23 T77 T5 T56 T87 T94 T T8 0.0 T6.0 T8 T79 T T4 T83 T36 T95 T T7 T T78 T3 T82 T4 T24 T T72 T53 T9 T8 T73 T5 T68 T T33 T T9 T7 0.0 T47 T6 T42 T55T89 T7 T85 T63 T7 T65 T25 T39 T45 T T4 T54 T93 T46 T9 T98 T27 T6 T64 T52 T44 T74 Examples with 5 bifurcating trees, with the same topology, but different mean branch length. + T58 T37 T48 T88 T49 T86 + substitutionmodel T 0.03, T 0.075, T 0., T 0.3, T 0.5
31 onstruct onstruction of RN consensus structures (Lück et al. 999), (Wilm,. & Steger,. 06, submitted) ombination of Sequence lignment, Thermodynamics and Mutual Information ontent. onsensus Structure Thermodynamic onsensus Dotplot: onsensus Dotplot using RNfold: Hofacker et al. (994) Mutual Information ontent: MI: (hiu & Kolodziejczak, 99, utell et al. 992) Prediction of Tertiary Interaction Maximum Weighted Matching: Tabaska et al. (998)
32 Mean branch length onsensus Structures based on Mutual Information ontent N
33 Mean branch length Thermodynamic onsensus Structures N
34 Prediction of tertiary interactions Using Maximum Weighted Matching: Tabaska et al. (998) N
35 Prediction of tertiary interactions Using Maximum Weighted Matching and a threshold N
36 Tree topology Fulltree: 0 sequences Subtree: sequences Is maximisation of evolutionary divergence useful for structure predictions?
37 Evolve along the fulltree and the subtree onsensus structures based on Mutual Information ontent Fulltree N Subtree
38 Evolve along the fulltree and the subtree, threshold onsensus structures based on Mutual Information ontent Fulltree N Subtree
39 Reducing the alignment onsensus structures based on Mutual Information ontent Fullalignment N Subalignment
40 Reducing the alignment onsensus structures based on Mutual Information ontent Fullalignment N Subalignment
41 Reducing the alignment Using Maximum Weighted Matching and a threshold Fullalignment N Subalignment
42 Influence of the tree topology Mutual Information: Long branches many true positives correlations Short branches few true positives correlations Many false positives omparative analysis ignores the phylogenetic information in the sequences, it tends to overestimate the amount of covariation between two positions.
43 ncestral correlations How long the branches need to be to avoid ancestral correlations? U U ancestral correlation UU
44 onclusion Including phylogenetic information in comparative analysis is potential useful Phylogeny can help to choose sequences for structure prediction methods: Statistical study is necessary. How looks the optimal tree for comparative structure prediction methods, with and without thermodynamics? Method for Reconstructing Dependencies with Phylogenetic Trees Self-consistent method, where no threshold is needed.
45 Extension of SISSI SISSI is not limited to F8 types of rate matrices E.g. inclusion of a transition-transversion parameter Inclusion of codon position-specific heterogeneity Studies with tertiary interactions Inclusion of mixture models Inclusion of energy values Inclusion of indels
46 cknowledgement rndt von Haeseler (enter for Integrative Bioinformatics Vienna) Thomas Schlegel (Bioinformatics Institute, HHU Düsseldorf) Minh Bui Quang (enter for Integrative Bioinformatics Vienna) Steffen Kläre (enter for Integrative Bioinformatics Vienna) erhard Steger (Institut für Physikalische Biologie, HHU Düsseldorf) ndreas Wilm (Institut für Physikalische Biologie, HHU Düsseldorf) Financial support from the DF grant SFB-TR is gratefully acknowledged.
47 Special thanks to the big communities of Phylogenetics and RN structures!
Conserved RNA Structures. Ivo L. Hofacker. Institut for Theoretical Chemistry, University Vienna.
onserved RN Structures Ivo L. Hofacker Institut for Theoretical hemistry, University Vienna http://www.tbi.univie.ac.at/~ivo/ Bled, January 2002 Energy Directed Folding Predict structures from sequence
More informationWhy IQ-TREE? Bui Quang Minh Australian National University. Workshop on Molecular Evolution Woods Hole, 24 July 2018
Why IQ-R? ui Quang Minh ustralian National University Workshop on Molecular volution Woods Hole, 24 July 2018 hanks to plenty of users for feedback and bug reports! ypical phylogenetic analysis under maximum
More informationRNAscClust clustering RNAs using structure conservation and graph-based motifs
RNsclust clustering RNs using structure conservation and graph-based motifs Milad Miladi 1 lexander Junge 2 miladim@informatikuni-freiburgde ajunge@rthdk 1 Bioinformatics roup, niversity of Freiburg 2
More informationSubstitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A
GAGATC 3:G A 6:C T Common Ancestor ACGATC 1:A G 2:C A Substitution = Mutation followed 5:T C by Fixation GAAATT 4:A C 1:G A AAAATT GAAATT GAGCTC ACGACC Chimp Human Gorilla Gibbon AAAATT GAAATT GAGCTC ACGACC
More informationSome of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!
Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis
More informationTaming the Beast Workshop
Workshop David Rasmussen & arsten Magnus June 27, 2016 1 / 31 Outline of sequence evolution: rate matrices Markov chain model Variable rates amongst different sites: +Γ Implementation in BES2 2 / 31 genotype
More informationBioinformatics. Scoring Matrices. David Gilbert Bioinformatics Research Centre
Bioinformatics Scoring Matrices David Gilbert Bioinformatics Research Centre www.brc.dcs.gla.ac.uk Department of Computing Science, University of Glasgow Learning Objectives To explain the requirement
More informationPhylogenetic Assumptions
Substitution Models and the Phylogenetic Assumptions Vivek Jayaswal Lars S. Jermiin COMMONWEALTH OF AUSTRALIA Copyright htregulation WARNING This material has been reproduced and communicated to you by
More informationMutation models I: basic nucleotide sequence mutation models
Mutation models I: basic nucleotide sequence mutation models Peter Beerli September 3, 009 Mutations are irreversible changes in the DNA. This changes may be introduced by chance, by chemical agents, or
More informationA Method for Aligning RNA Secondary Structures
Method for ligning RN Secondary Structures Jason T. L. Wang New Jersey Institute of Technology J Liu, JTL Wang, J Hu and B Tian, BM Bioinformatics, 2005 1 Outline Introduction Structural alignment of RN
More informationBMI/CS 776 Lecture 4. Colin Dewey
BMI/CS 776 Lecture 4 Colin Dewey 2007.02.01 Outline Common nucleotide substitution models Directed graphical models Ancestral sequence inference Poisson process continuous Markov process X t0 X t1 X t2
More informationMaximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018
Maximum Likelihood Tree Estimation Carrie Tribble IB 200 9 Feb 2018 Outline 1. Tree building process under maximum likelihood 2. Key differences between maximum likelihood and parsimony 3. Some fancy extras
More informationPhylogeny and evolution of structure
Phylogeny and evolution of structure Tanja Gesell and Peter Schuster Abstract Darwin s conviction that all living beings on Earth are related and the graph of relatedness is tree-shaped has been essentially
More information"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky
MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION - theory that groups of organisms change over time so that descendeants differ structurally
More informationRNA Abstract Shape Analysis
ourse: iegerich RN bstract nalysis omplete shape iegerich enter of Biotechnology Bielefeld niversity robert@techfak.ni-bielefeld.de ourse on omputational RN Biology, Tübingen, March 2006 iegerich ourse:
More informationRNA-Strukturvorhersage Strukturelle Bioinformatik WS16/17
RNA-Strukturvorhersage Strukturelle Bioinformatik WS16/17 Dr. Stefan Simm, 01.11.2016 simm@bio.uni-frankfurt.de RNA secondary structures a. hairpin loop b. stem c. bulge loop d. interior loop e. multi
More informationDiscrete & continuous characters: The threshold model
Discrete & continuous characters: The threshold model Discrete & continuous characters: the threshold model So far we have discussed continuous & discrete character models separately for estimating ancestral
More informationQuantifying sequence similarity
Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity
More informationPrediction of Conserved and Consensus RNA Structures
Prediction of onserved and onsensus RN Structures DISSERTTION zur Erlangung des akademischen rades Doctor rerum naturalium Vorgelegt der Fakultät für Naturwissenschaften und Mathematik der niversität Wien
More informationReconstruire le passé biologique modèles, méthodes, performances, limites
Reconstruire le passé biologique modèles, méthodes, performances, limites Olivier Gascuel Centre de Bioinformatique, Biostatistique et Biologie Intégrative C3BI USR 3756 Institut Pasteur & CNRS Reconstruire
More informationSequence alignment methods. Pairwise alignment. The universe of biological sequence analysis
he universe of biological sequence analysis Word/pattern recognition- Identification of restriction enzyme cleavage sites Sequence alignment methods PstI he universe of biological sequence analysis - prediction
More informationIntegrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley. Parsimony & Likelihood [draft]
Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley K.W. Will Parsimony & Likelihood [draft] 1. Hennig and Parsimony: Hennig was not concerned with parsimony
More informationLecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).
1 Bioinformatics: In-depth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff
More informationCladistics and Bioinformatics Questions 2013
AP Biology Name Cladistics and Bioinformatics Questions 2013 1. The following table shows the percentage similarity in sequences of nucleotides from a homologous gene derived from five different species
More informationPhylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz
Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels
More informationLecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)
Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from
More informationA (short) introduction to phylogenetics
A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field
More informationPhylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University
Phylogenetics: Distance Methods COMP 571 - Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distance-based methods Evolutionary Models and Distance Correction
More informationEvolutionary Analysis of Viral Genomes
University of Oxford, Department of Zoology Evolutionary Biology Group Department of Zoology University of Oxford South Parks Road Oxford OX1 3PS, U.K. Fax: +44 1865 271249 Evolutionary Analysis of Viral
More informationMolecular Evolution, course # Final Exam, May 3, 2006
Molecular Evolution, course #27615 Final Exam, May 3, 2006 This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least
More informationRNA Search and! Motif Discovery" Genome 541! Intro to Computational! Molecular Biology"
RNA Search and! Motif Discovery" Genome 541! Intro to Computational! Molecular Biology" Day 1" Many biologically interesting roles for RNA" RNA secondary structure prediction" 3 4 Approaches to Structure
More informationEvolutionary Tree Analysis. Overview
CSI/BINF 5330 Evolutionary Tree Analysis Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Backgrounds Distance-Based Evolutionary Tree Reconstruction Character-Based
More informationGenetic distances and nucleotide substitution models
4 Genetic distances and nucleotide substitution models THEORY Korbinian Strimmer and Arndt von Haeseler 4.1 Introduction One of the first steps in the analysis of aligned nucleotide or amino acid sequences
More informationInDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9
Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic
More informationEvolutionary Models. Evolutionary Models
Edit Operators In standard pairwise alignment, what are the allowed edit operators that transform one sequence into the other? Describe how each of these edit operations are represented on a sequence alignment
More informationLecture 3: Markov chains.
1 BIOINFORMATIK II PROBABILITY & STATISTICS Summer semester 2008 The University of Zürich and ETH Zürich Lecture 3: Markov chains. Prof. Andrew Barbour Dr. Nicolas Pétrélis Adapted from a course by Dr.
More informationSequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment
Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the log-likelihood ratio of
More informationPhylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center
Phylogenetic Analysis Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center Outline Basic Concepts Tree Construction Methods Distance-based methods
More informationPhylogeny. Information. ARB-Workshop 14/ CEH Oxford. Molecular Markers. Phylogeny The Backbone of Biology. Why? Zuckerkandl and Pauling 1965
Frank Oliver löckner Information Phylogeny Who are we: Dr. Frank Oliver löckner Dr. Jörg Peplies Max Planck Institute for Marine Microbiology Microbial enomics roup Bremen, ermany ontact: arb@mpibremen.de
More informationPage 1. Evolutionary Trees. Why build evolutionary tree? Outline
Page Evolutionary Trees Russ. ltman MI S 7 Outline. Why build evolutionary trees?. istance-based vs. character-based methods. istance-based: Ultrametric Trees dditive Trees. haracter-based: Perfect phylogeny
More informationThe Phylo- HMM approach to problems in comparative genomics, with examples.
The Phylo- HMM approach to problems in comparative genomics, with examples. Keith Bettinger Introduction The theory of evolution explains the diversity of organisms on Earth by positing that earlier species
More informationConstructing Evolutionary/Phylogenetic Trees
Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istance-based methods Ultrametric Additive: UPGMA Transformed istance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood
More informationAlgorithms in Bioinformatics
Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri RNA Structure Prediction Secondary
More informationLecture Notes: Markov chains
Computational Genomics and Molecular Biology, Fall 5 Lecture Notes: Markov chains Dannie Durand At the beginning of the semester, we introduced two simple scoring functions for pairwise alignments: a similarity
More informationPhylogenetic Tree Reconstruction
I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven
More informationPHYLOGENY ESTIMATION AND HYPOTHESIS TESTING USING MAXIMUM LIKELIHOOD
Annu. Rev. Ecol. Syst. 1997. 28:437 66 Copyright c 1997 by Annual Reviews Inc. All rights reserved PHYLOGENY ESTIMATION AND HYPOTHESIS TESTING USING MAXIMUM LIKELIHOOD John P. Huelsenbeck Department of
More informationConstructing Evolutionary/Phylogenetic Trees
Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distance-based methods Ultrametric Additive: UPGMA Transformed Distance Neighbor-Joining Character-based Maximum Parsimony Maximum Likelihood
More informationAdditive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.
Additive distances Let T be a tree on leaf set S and let w : E R + be an edge-weighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then
More information3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT
3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT.03.239 25.09.2012 SEQUENCE ANALYSIS IS IMPORTANT FOR... Prediction of function Gene finding the process of identifying the regions of genomic DNA that encode
More informationEVOLUTIONARY DISTANCES
EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:
More informationPhylogeny Tree Algorithms
Phylogeny Tree lgorithms Jianlin heng, PhD School of Electrical Engineering and omputer Science University of entral Florida 2006 Free for academic use. opyright @ Jianlin heng & original sources for some
More informationEVALUATION OF RNA SECONDARY STRUCTURE MOTIFS USING REGRESSION ANALYSIS
EVLTION OF RN SEONDRY STRTRE MOTIFS SIN RERESSION NLYSIS Mohammad nwar School of Information Technology and Engineering, niversity of Ottawa e-mail: manwar@site.uottawa.ca bstract Recent experimental evidences
More informationLecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22
Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 24. Phylogeny methods, part 4 (Models of DNA and
More information98 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, December 6, 2006
98 Algorithms in Bioinformatics I, WS 06, ZBIT, D. Huson, December 6, 2006 8.3.1 Simple energy minimization Maximizing the number of base pairs as described above does not lead to good structure predictions.
More informationSecondary Structure Prediction. for Aligned RNA Sequences
Secondary Structure Prediction for ligned RN Sequences Ivo L. Hofacker, Martin Fekete, and Peter F. Stadler,, Institut für Theoretische hemie, niversität Wien, Währingerstraße 17, -1090 Wien, ustria The
More informationAmira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut
Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic analysis Phylogenetic Basics: Biological
More informationLet S be a set of n species. A phylogeny is a rooted tree with n leaves, each of which is uniquely
JOURNAL OF COMPUTATIONAL BIOLOGY Volume 8, Number 1, 2001 Mary Ann Liebert, Inc. Pp. 69 78 Perfect Phylogenetic Networks with Recombination LUSHENG WANG, 1 KAIZHONG ZHANG, 2 and LOUXIN ZHANG 3 ABSTRACT
More informationTHEORY. Based on sequence Length According to the length of sequence being compared it is of following two types
Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between
More informationLecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/26
Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 27. Phylogeny methods, part 4 (Models of DNA and
More informationInferring Complex DNA Substitution Processes on Phylogenies Using Uniformization and Data Augmentation
Syst Biol 55(2):259 269, 2006 Copyright c Society of Systematic Biologists ISSN: 1063-5157 print / 1076-836X online DOI: 101080/10635150500541599 Inferring Complex DNA Substitution Processes on Phylogenies
More informationDr. Amira A. AL-Hosary
Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological
More informationMichael Yaffe Lecture #5 (((A,B)C)D) Database Searching & Molecular Phylogenetics A B C D B C D
7.91 Lecture #5 Database Searching & Molecular Phylogenetics Michael Yaffe B C D B C D (((,B)C)D) Outline Distance Matrix Methods Neighbor-Joining Method and Related Neighbor Methods Maximum Likelihood
More informationPhylogeny. Properties of Trees. Properties of Trees. Trees represent the order of branching only. Phylogeny: Taxon: a unit of classification
Multiple sequence alignment global local Evolutionary tree reconstruction Pairwise sequence alignment (global and local) Substitution matrices Gene Finding Protein structure prediction N structure prediction
More informationDNA/RNA Structure Prediction
C E N T R E F O R I N T E G R A T I V E B I O I N F O R M A T I C S V U Master Course DNA/Protein Structurefunction Analysis and Prediction Lecture 12 DNA/RNA Structure Prediction Epigenectics Epigenomics:
More information9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)
I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by
More informationOutline. Sequence-comparison methods. Buzzzzzzzz. Why compare sequences? Gerard Kleywegt Uppsala University
MB330 - January, 2006 Sequence-comparison methods erard Kleywegt Uppsala University Outline! Why compare sequences?! Dotplots! airwise sequence alignments &! Multiple sequence alignments! rofile methods!
More informationCS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1. Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003
CS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1 Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003 Lecturer: Wing-Kin Sung Scribe: Ning K., Shan T., Xiang
More informationPhylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline
Phylogenetics Todd Vision iology 522 March 26, 2007 pplications of phylogenetics Studying organismal or biogeographic history Systematics ating events in the fossil record onservation biology Studying
More informationInferring Molecular Phylogeny
Dr. Walter Salzburger he tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches I!! B D long branch attraction
More informationLikelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution
Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution Ziheng Yang Department of Biology, University College, London An excess of nonsynonymous substitutions
More informationPhylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.
Five Sami Khuri Department of Computer Science San José State University San José, California, USA sami.khuri@sjsu.edu v Distance Methods v Character Methods v Molecular Clock v UPGMA v Maximum Parsimony
More informationUsing phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression)
Using phylogenetics to estimate species divergence times... More accurately... Basics and basic issues for Bayesian inference of divergence times (plus some digression) "A comparison of the structures
More informationSequence Alignment: A General Overview. COMP Fall 2010 Luay Nakhleh, Rice University
Sequence Alignment: A General Overview COMP 571 - Fall 2010 Luay Nakhleh, Rice University Life through Evolution All living organisms are related to each other through evolution This means: any pair of
More informationBackground: comparative genomics. Sequence similarity. Homologs. Similarity vs homology (2) Similarity vs homology. Sequence Alignment (chapter 6)
Sequence lignment (chapter ) he biological problem lobal alignment Local alignment Multiple alignment Background: comparative genomics Basic question in biology: what properties are shared among organisms?
More informationSpecial features of phangorn (Version 2.3.1)
Special features of phangorn (Version 2.3.1) Klaus P. Schliep November 1, 2017 Introduction This document illustrates some of the phangorn [4] specialised features which are useful but maybe not as well-known
More informationImpact Of The Energy Model On The Complexity Of RNA Folding With Pseudoknots
Impact Of The Energy Model On The omplexity Of RN Folding With Pseudoknots Saad Sheikh, Rolf Backofen Yann Ponty, niversity of Florida, ainesville, S lbert Ludwigs niversity, Freiburg, ermany LIX, NRS/Ecole
More informationThe Causes and Consequences of Variation in. Evolutionary Processes Acting on DNA Sequences
The Causes and Consequences of Variation in Evolutionary Processes Acting on DNA Sequences This dissertation is submitted for the degree of Doctor of Philosophy at the University of Cambridge Lee Nathan
More informationπ b = a π a P a,b = Q a,b δ + o(δ) = 1 + Q a,a δ + o(δ) = I 4 + Qδ + o(δ),
ABC estimation of the scaled effective population size. Geoff Nicholls, DTC 07/05/08 Refer to http://www.stats.ox.ac.uk/~nicholls/dtc/tt08/ for material. We will begin with a practical on ABC estimation
More informationBioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics
Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Klausur Mo. 30.01.2011 Zeit: 15:30 17:00 Raum: HS14 Anmeldung Kusss Contents Methods and Bootstrapping of Maximum Methods Methods
More informationarxiv: v1 [q-bio.pe] 4 Sep 2013
Version dated: September 5, 2013 Predicting ancestral states in a tree arxiv:1309.0926v1 [q-bio.pe] 4 Sep 2013 Predicting the ancestral character changes in a tree is typically easier than predicting the
More informationSymmetric Tree, ClustalW. Divergence x 0.5 Divergence x 1 Divergence x 2. Alignment length
ONLINE APPENDIX Talavera, G., and Castresana, J. (). Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Systematic Biology, -. Symmetric
More informationSequence Alignments. Dynamic programming approaches, scoring, and significance. Lucy Skrabanek ICB, WMC January 31, 2013
Sequence Alignments Dynamic programming approaches, scoring, and significance Lucy Skrabanek ICB, WMC January 31, 213 Sequence alignment Compare two (or more) sequences to: Find regions of conservation
More informationMolecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016
Molecular phylogeny - Using molecular sequences to infer evolutionary relationships Tore Samuelsson Feb 2016 Molecular phylogeny is being used in the identification and characterization of new pathogens,
More informationUsing algebraic geometry for phylogenetic reconstruction
Using algebraic geometry for phylogenetic reconstruction Marta Casanellas i Rius (joint work with Jesús Fernández-Sánchez) Departament de Matemàtica Aplicada I Universitat Politècnica de Catalunya IMA
More informationStructure Local Multiple Alignment of RNA
Structure Local Multiple lignment of RN Wolfgang Otto 1 and Sebastian Will 2 and Rolf ackofen 2 1 ioinformatics, niversity Leipzig, D-04107 Leipzig wolfgang@bioinf.uni-leipzig.de 2 ioinformatics, lbert-ludwigs-niversity
More informationExpanded View Figures
Molecular Systems iology Evolutionary divergence of mouse P Mei-Sheng Xiao et al Expanded View Figures Nucleotide percentage (%) 0 20 40 60 80 100 T G Nucleotide percentage (%) 0 20 40 60 80 100 T G Frequency
More informationHMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM
I529: Machine Learning in Bioinformatics (Spring 2017) HMM for modeling aligned multiple sequences: phylo-hmm & multivariate HMM Yuzhen Ye School of Informatics and Computing Indiana University, Bloomington
More informationPredicting RNA Secondary Structure Using Profile Stochastic Context-Free Grammars and Phylogenic Analysis
Fang XY, Luo ZG, Wang ZH. Predicting RNA secondary structure using profile stochastic context-free grammars and phylogenic analysis. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 23(4): 582 589 July 2008
More informationproteins are the basic building blocks and active players in the cell, and
12 RN Secondary Structure Sources for this lecture: R. Durbin, S. Eddy,. Krogh und. Mitchison, Biological sequence analysis, ambridge, 1998 J. Setubal & J. Meidanis, Introduction to computational molecular
More information进化树构建方法的概率方法 第 4 章 : 进化树构建的概率方法 问题介绍. 部分 lid 修改自 i i f l 的 ih l i
第 4 章 : 进化树构建的概率方法 问题介绍 进化树构建方法的概率方法 部分 lid 修改自 i i f l 的 ih l i 部分 Slides 修改自 University of Basel 的 Michael Springmann 课程 CS302 Seminar Life Science Informatics 的讲义 Phylogenetic Tree branch internal node
More informationCONCEPT OF SEQUENCE COMPARISON. Natapol Pornputtapong 18 January 2018
CONCEPT OF SEQUENCE COMPARISON Natapol Pornputtapong 18 January 2018 SEQUENCE ANALYSIS - A ROSETTA STONE OF LIFE Sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of
More informationMolecular phylogeny How to infer phylogenetic trees using molecular sequences
Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues
More informationTheDisk-Covering MethodforTree Reconstruction
TheDisk-Covering MethodforTree Reconstruction Daniel Huson PACM, Princeton University Bonn, 1998 1 Copyright (c) 2008 Daniel Huson. Permission is granted to copy, distribute and/or modify this document
More informationModeling Noise in Genetic Sequences
Modeling Noise in Genetic Sequences M. Radavičius 1 and T. Rekašius 2 1 Institute of Mathematics and Informatics, Vilnius, Lithuania 2 Vilnius Gediminas Technical University, Vilnius, Lithuania 1. Introduction:
More informationMolecular phylogeny How to infer phylogenetic trees using molecular sequences
Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues
More informationBioinformatics tools for phylogeny and visualization. Yanbin Yin
Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and
More informationLecture 4. Models of DNA and protein change. Likelihood methods
Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/36
More informationLoss of information at deeper times
Loss of information at deeper times (and the origin of proteins?) David Penny Brisbane July 2014 The mathematicos caused the problem!!! Now they should solve it! And the origins of protein synthesis Okay,
More informationMETHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.
Chapter 12 (Strikberger) Molecular Phylogenies and Evolution METHODS FOR DETERMINING PHYLOGENY In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Modern
More informationMaximum Likelihood Estimation on Large Phylogenies and Analysis of Adaptive Evolution in Human Influenza Virus A
J Mol Evol (2000) 51:423 432 DOI: 10.1007/s002390010105 Springer-Verlag New York Inc. 2000 Maximum Likelihood Estimation on Large Phylogenies and Analysis of Adaptive Evolution in Human Influenza Virus
More information