Chapter 3: Phylogenetics


 Susanna Nelson
 2 years ago
3. Computing Phylogeny
Prof. Yechiam Yemini (YY), Computer Science Department, Columbia University

Overview
- Computing trees
- Distance-based techniques
- Maximal Parsimony (MP) techniques
- Maximum likelihood techniques

This chapter is based on Durbin, Chapter 7. Also recommended: The Phylogenetic Handbook, Salemi and Vandamme.
Can We Tell Evolution From Homology?
[Figure: example trees illustrating duplication, speciation, and a partial sample]

Phylogeny: how do we tell the right tree?

Phylogeny: Computing Trees
- INPUT: a set of sequences, one per taxon
- OUTPUT: a tree relating the taxa
[Figure: example input sequences and output tree]
Brute Force Approach
- Enumerate all trees
- Compute some measure of evolutionary likelihood
- Select the best tree

How many rooted trees are there with n leaves?
- n = 2 leaves => 1 tree
- n = 3 leaves => attach the 3rd leaf to one of 3 edges => 3 trees
- Let $T(n)$ = number of rooted trees with n leaves and $E(n)$ = number of edges: $T(2) = 1$, $E(2) = 3$; $T(3) = 3$, $E(3) = 5$
- Adding a leaf creates two new edges => $E(n) = E(n-1) + 2$ => $E(n) = 2n - 1$
- $T(n) = T(n-1) \cdot E(n-1) = T(n-1) \cdot (2n-3)$ => $T(n) = 1 \cdot 3 \cdot 5 \cdots (2n-3)$
- For n = 20 leaves this is roughly $10^{21}$ trees

Approaches
- Distance based: the tree should best model an evolutionary distance metric among taxa
- Character based [Maximal Parsimony (MP)]: the tree should minimize changes
- Maximum likelihood (ML): the tree should maximize the likelihood of changes
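The recurrence above is easy to check numerically. The following is a minimal sketch; the function name `count_rooted_trees` is a hypothetical helper, not something from the slides.

```python
def count_rooted_trees(n: int) -> int:
    """Count rooted binary trees on n labeled leaves: 1 * 3 * 5 * ... * (2n - 3)."""
    if n < 2:
        raise ValueError("need at least two leaves")
    count = 1  # T(2) = 1
    for leaves in range(3, n + 1):
        # A tree on (leaves - 1) leaves offers 2 * (leaves - 1) - 1 = 2 * leaves - 3
        # edges where the next leaf can be attached.
        count *= 2 * leaves - 3
    return count


if __name__ == "__main__":
    for n in (2, 3, 4, 10, 20):
        print(n, count_rooted_trees(n))
```

The count grows so quickly that exhaustive enumeration is only practical for a handful of taxa.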
I. Distance Based Techniques

Key idea:
- Compute an evolutionary distance metric D among the taxa in a set S
- Compute a tree on S that best fits the distances

Formally:
- Given: an n x n distance matrix D
- Compute: a weighted tree T on n leaves that best fits D

How to establish evolutionary distance measures?
- Distance ~ changes
- Next chapter: evaluating distance using Markovian evolution models
Is There a Tree That Perfectly Fits D?
- Not every distance metric can be modeled by a tree
- How can we recognize the distance metrics that do model a tree?

The Four-Point Condition
- A distance matrix D corresponding to a tree is called additive
- THEOREM: D is additive if and only if, for every four indices i, j, k, l, the maximum and the median of the three pairwise sums are identical: $D_{ij} + D_{kl} \le D_{ik} + D_{jl} = D_{il} + D_{jk}$
- The condition suggests how to connect the points into a tree that fits D
[Figure: quartet tree on i, j, k, l illustrating the three pairwise sums]
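The four-point condition can be tested directly on a distance matrix. Below is a minimal sketch, assuming D is a symmetric list of lists with a zero diagonal; the function name `is_additive` and the tolerance parameter are illustrative choices.

```python
from itertools import combinations


def is_additive(D, tol=1e-9):
    """Check the four-point condition on a symmetric distance matrix D.

    For every quadruple of indices, the two largest of the three pairwise
    sums must be equal (up to the tolerance tol).
    """
    n = len(D)
    for i, j, k, l in combinations(range(n), 4):
        sums = sorted([
            D[i][j] + D[k][l],
            D[i][k] + D[j][l],
            D[i][l] + D[j][k],
        ])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


if __name__ == "__main__":
    # Distances on a quartet tree: pendant edges of length 1, internal edge of length 2.
    tree_metric = [
        [0, 2, 4, 4],
        [2, 0, 4, 4],
        [4, 4, 0, 2],
        [4, 4, 2, 0],
    ]
    print(is_additive(tree_metric))
```

The check examines every quadruple, so it costs O(n^4) time; that is fine for small matrices but not for large ones.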
How Do We Handle Non-Additive D?
- Additive metrics are very useful: they provide a perfect fit with a tree model, and the tree is easily computed from D
- But evolutionary distance metrics are often non-additive
- How do we handle a non-additive metric?

Fitch & Margoliash: find a tree T that minimizes the least-squares fit
$E(T) = \sum_{i,j} \left( d_{ij}(T) - D_{ij} \right)^2$
- This problem is NP-hard, so heuristics are needed
- Fitch & Margoliash (1968): exhaustive search

Closest-Pair Clustering
- Idea: use D to guide closest-pair clustering
- Extend D to clusters by UPGMA/WPGMA averaging
UPGMA Algorithm

Initialization
- Initialize n clusters $C_i = \{S_i\}$
- Initialize T with a leaf for each cluster $C_i$

Iteration
- Find $C_i$, $C_j$ with the smallest distance $D_{ij}$
- Create a new cluster $C_k = C_i \cup C_j$
- Add a new node to T for $C_k$, and connect it to $C_i$, $C_j$
- If all nodes are connected into a tree, exit; otherwise, assign $D_{ki} = D_{kj} = D_{ij}/2$ and compute the distances $D_{kl}$ to all other clusters $C_l$:
  $D_{kl} = \dfrac{D_{il}\,|C_i| + D_{jl}\,|C_j|}{|C_i| + |C_j|}$
- Repeat the iteration

UPGMA: Molecular Clock Property
- Uniform distance from the root to the leaves
- Distance to the root ~ evolutionary clock
- Species are assumed to take identical time to evolve
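The UPGMA iteration can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the slides' own code: the tree is returned as nested tuples of leaf names, each merged node is placed at height D_ij / 2, and ties are broken by dictionary order.

```python
def upgma(D, names):
    """Cluster with UPGMA; return (tree, root_height).

    D is a symmetric matrix (list of lists), names the leaf labels.
    The tree is a nested tuple of names; each merge sits at height D_ij / 2.
    """
    # cluster id -> (subtree, number of leaves, height)
    clusters = {i: (names[i], 1, 0.0) for i in range(len(names))}
    # pairwise distances keyed by (smaller id, larger id)
    dist = {(i, j): D[i][j]
            for i in range(len(names)) for j in range(i + 1, len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)          # closest pair of clusters
        d_ij = dist.pop((i, j))
        tree_i, size_i, _ = clusters.pop(i)
        tree_j, size_j, _ = clusters.pop(j)
        k = next_id
        next_id += 1
        for l in clusters:
            d_il = dist.pop((min(i, l), max(i, l)))
            d_jl = dist.pop((min(j, l), max(j, l)))
            # size-weighted average of the distances to the two merged clusters
            dist[(l, k)] = (d_il * size_i + d_jl * size_j) / (size_i + size_j)
        clusters[k] = ((tree_i, tree_j), size_i + size_j, d_ij / 2)
    (tree, _, height), = clusters.values()
    return tree, height


if __name__ == "__main__":
    D = [[0, 2, 6],
         [2, 0, 6],
         [6, 6, 0]]
    print(upgma(D, ["A", "B", "C"]))
```

On an ultrametric matrix such as the one in the usage example, the returned heights are consistent with a molecular clock: every leaf is at the same distance from the root.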
Notes
- Complexity is $O(n^2)$
- Averaging redistributes distances to overcome non-additivity
- Clustering can lead to substantial errors and is very sensitive
- This limits the applications of clustering
- How do we overcome the sensitivity of UPGMA?
[Figure: real tree compared with the UPGMA tree]

UPGMA Improvements Through Bootstrapping
- Bootstrapping: a statistical technique to increase robustness
- Scenario: given a sample $S(\omega)$ and a result $R(S)$ computed from S
- Bootstrapping:
  o Resample S to get $S'(\omega)$
  o Evaluate $R(S'(\omega))$
  o Evaluate the match of $R(S)$ with the values $R(S'(\omega))$
- Here S = the columns of the sequences (n columns) and $R(S)$ = the tree
- $S'(\omega)$ = a sample of n random columns of S, with possible repetitions
- Compute the phylogenetic tree $R(S'(\omega))$
- Use $\{R(S'(\omega))\}$ to compute the consensus/likelihood of the branches of $R(S)$
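The column resampling described above might look like the following sketch. The names `bootstrap_alignment` and `bootstrap_support` are hypothetical, and `compute` stands in for whatever tree-building procedure R is used. For simplicity the sketch compares whole results for equality, whereas a real analysis would compare individual branches.

```python
import random


def bootstrap_alignment(seqs, rng):
    """Resample alignment columns with replacement, keeping the alignment length."""
    n = len(seqs[0])
    cols = [rng.randrange(n) for _ in range(n)]
    return ["".join(s[c] for c in cols) for s in seqs]


def bootstrap_support(seqs, compute, replicates=100, seed=0):
    """Fraction of bootstrap replicates whose result equals compute(seqs)."""
    rng = random.Random(seed)
    reference = compute(seqs)
    matches = sum(
        compute(bootstrap_alignment(seqs, rng)) == reference
        for _ in range(replicates)
    )
    return matches / replicates


if __name__ == "__main__":
    alignment = ["ACGT", "AGGT", "TCGA"]
    print(bootstrap_alignment(alignment, random.Random(1)))
```

Because whole columns are drawn, the relationship between taxa at each sampled position is preserved; only the weighting of positions changes between replicates.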
Bootstrapping Example
[Figure: bootstrapping example]

Closest Pair vs. Evolutionary Neighbors
- Additivity: $D_{ij} + D_{kl} \le D_{ik} + D_{jl} = D_{il} + D_{jk}$
- UPGMA overcomes non-additivity by averaging distances
- But the closest pair may not be evolutionary neighbors
- The evolutionary tree distances may diverge greatly; averaging distorts the neighborhood
[Figure: quartet tree on i, j, k, l]
Neighbor Joining [Saitou & Nei 87; Studier & Keppler 88]

Neighbor-joining heuristic: join the closest clusters that are also far from the rest.
- Define $R_k = \sum_{i \ne k} D_{ik}$, the divergence of k
- Cluster the nodes k, m that minimize $N_{km} = D_{km} - (R_k + R_m)/(n-2)$
- Equivalently, define $r_k = R_k/(n-2)$ and consider $D_{km} - r_k - r_m$

Neighbor Joining Algorithm
Initialization (same as UPGMA): initialize n clusters $C_i = \{S_i\}$
Iteration:
1. Compute $r_k = \sum_{i \ne k} D_{ik}/(n-2)$ for each cluster k
2. Find (k, m) minimizing $D_{km} - r_k - r_m$
3. Define a new node i and set $D_{is} = 0.5\,(D_{ks} + D_{ms} - D_{km})$ for all s
4. Join node i to k and m with edges of respective lengths $D_{ki} = 0.5\,(D_{km} + r_k - r_m)$ and $D_{mi} = 0.5\,(D_{km} + r_m - r_k)$
5. Repeat until all nodes are connected
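A single neighbor-joining iteration (steps 1 to 4) can be sketched as follows. The function name `nj_join` and the six-taxon matrix in the usage example are assumptions made for illustration, and ties are broken by taking the first minimal pair in index order.

```python
def nj_join(D, names, new_name):
    """Perform one neighbor-joining iteration on distance matrix D (needs n >= 3).

    Returns ((name_k, length_k), (name_m, length_m), new_D, new_names), where the
    lengths are the edge lengths from k and m to the newly created node.
    """
    n = len(names)
    # Step 1: scaled divergences r_k (the diagonal of D is zero)
    r = [sum(D[k]) / (n - 2) for k in range(n)]
    # Step 2: pair minimizing D_km - r_k - r_m
    best = None
    for k in range(n):
        for m in range(k + 1, n):
            score = D[k][m] - r[k] - r[m]
            if best is None or score < best[0]:
                best = (score, k, m)
    _, k, m = best
    # Step 4: edge lengths from k and m to the new node
    len_k = 0.5 * (D[k][m] + r[k] - r[m])
    len_m = 0.5 * (D[k][m] + r[m] - r[k])
    # Step 3: distances from the new node to every remaining node s
    keep = [s for s in range(n) if s not in (k, m)]
    new_row = [0.5 * (D[k][s] + D[m][s] - D[k][m]) for s in keep]
    new_D = [[D[a][b] for b in keep] + [new_row[x]] for x, a in enumerate(keep)]
    new_D.append(new_row + [0.0])
    new_names = [names[s] for s in keep] + [new_name]
    return (names[k], len_k), (names[m], len_m), new_D, new_names


if __name__ == "__main__":
    # Illustrative six-taxon matrix (an assumption, not taken from the slides).
    names = ["A", "B", "C", "D", "E", "F"]
    D = [
        [0, 5, 4, 7, 6, 8],
        [5, 0, 7, 10, 9, 11],
        [4, 7, 0, 7, 6, 8],
        [7, 10, 7, 0, 5, 9],
        [6, 9, 6, 5, 0, 8],
        [8, 11, 8, 9, 8, 0],
    ]
    print(nj_join(D, names, "U"))
```

Calling the function repeatedly on the returned matrix, until three nodes remain, yields the full unrooted tree. Note that the closest pair in this matrix is (A, C), yet the criterion selects (A, B).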
Example (from The Phylogenetic Handbook, Salemi and Vandamme): six taxa A to F
[Figure: distance matrix D with the divergences r, shown for each step]

Step 1: Compute divergences
- Compute $r_k = \sum_{i \ne k} D_{ik}/(n-2)$: sum the columns, then divide by $6 - 2 = 4$

Step 2: Find the neighboring pair
- Evaluate the neighboring distance matrix $N_{km} = D_{km} - (r_k + r_m)$ (subtract the r column and row)
- Find (k, m) minimizing $N_{km}$; here the pair is (A, B)
- Create a new node U and attach it to k and m
- UPGMA would instead connect the closest pair

Step 3: Join neighbors and compute the branch lengths
- $D_{AU} = 0.5\,(D_{AB} + r_A - r_B) = 0.5\,(5 - 3) = 1$
- $D_{BU} = 0.5\,(D_{AB} + r_B - r_A) = 0.5\,(5 + 3) = 4$

Step 4: Update the distance matrix, $D_{UX} = 0.5\,(D_{AX} + D_{BX} - D_{AB})$
- $D_{UC} = 0.5\,(4 + 7 - 5) = 3$; $D_{UD} = 0.5\,(7 + 10 - 5) = 6$
- $D_{UE} = 0.5\,(6 + 9 - 5) = 5$; $D_{UF} = 0.5\,(8 + 11 - 5) = 7$

Repeat Steps 1/2/3/4
- Step 1: compute $r_k = \sum_{i \ne k} D_{ik}/(n-2)$
- Step 2: compute the neighboring pair, $\min\{N_{XY} = D_{XY} - r_X - r_Y\}$ => (U, C) or (D, E)
- Step 3: join the neighbors U and C at a new node V; branch lengths $D_{UV} = 0.5\,(D_{UC} + r_U - r_C)$ and $D_{CV} = 0.5\,(D_{UC} + r_C - r_U)$
- Step 4: recompute distances, $D_{VX} = 0.5\,(D_{UX} + D_{CX} - D_{UC})$

Repeat
- Step 1: compute $r_k = \sum_{i \ne k} D_{ik}/(n-2)$
- Step 2: compute the neighboring pair, $\min\{N_{XY} = D_{XY} - r_X - r_Y\}$ => (D, E)
- Step 3: join the neighbors at a new node W; the first branch length is $D_{DW} = 0.5\,(D_{DE} + r_D - r_E) = 3$
- Step 4: recompute distances, $D_{WX} = 0.5\,(D_{DX} + D_{EX} - D_{DE})$

Repeat
- Step 1: compute $r_k = \sum_{i \ne k} D_{ik}/(n-2)$
- Step 2: compute the neighboring pair, $\min\{N_{XY} = D_{XY} - r_X - r_Y\}$
- Step 3: join the neighbors at a new node Z; for a joined pair (X, Y), $D_{XZ} = 0.5\,(D_{XY} + r_X - r_Y)$
- Step 4: recompute the distances to Z

Complete
[Figure: the completed tree, with the last node Z connecting the remaining subtrees]

Notes On Neighbor Joining
- Complexity is polynomial in n ($O(n^3)$ for the standard algorithm)
- Does not depend on the molecular clock assumption
- Heavily used in practice [e.g., ClustalW]
- But can be sensitive to non-additivity
II. Maximal Parsimony (character based phylogeny)

Key Idea: Minimize Changes
- Reconsider the problem: find the best tree to explain the evolution of the sequences
- Motivation: focus on the evolution of positions; distance loses information on evolutionary changes
- Key idea: find the tree with minimal changes that explains the data
[Figure: example sequences and two candidate trees with different change counts C, one of them C = 3]
More Generally
- Taxa are considered as sets of attributes: characters
- character = DNA position, gene order, morphological feature
- character state = a value assumed by a character
- Characters evolve through state changes
- An evolutionary tree represents changes in character states
- The MP tree seeks to minimize state changes

MP Example
[Figure: table of taxa against characters with binary states, and a tree annotated with state changes]
MP Example
[Figure: two candidate trees, one with 7 state changes and one with 6 state changes]

Example: Evolution of a Gene
- Character = position
- State = nucleotide
[Figure: taxa and their aligned gene sequences]

Example: MP rearrangements of a chromosome (Pevzner 2003, Genome Research)
The Max Parsimony (MP) Problem

Big MP
- Input: a set of n aligned sequences of length k
- Output: a phylogenetic tree T such that
  o T has n leaves labeled with the input sequences (taxa)
  o T has internal nodes labeled with sequences of length k (states)
  o T minimizes the total Hamming distance between the labels of adjacent nodes
- This is a Steiner-tree type problem
- It can be shown to be NP-hard [Gusfield, Foulds]
- But often the number of sequences considered is small

Small MP
- Input: a tree with sequence-labeled leaves
- Output: a labeling of the internal nodes (states) that achieves maximum parsimony

MP Basics
- Consider five aligned sequences of length 3
- The first column admits several arrangements and identifies the likely mutation: the MP arrangement needs 1 mutation, the alternatives need more
- The second column does not provide clues on likely mutations: it is a noninformative position (an informative position needs at least 2 characters)
[Figure: the five sequences and the candidate trees for the first and second columns]
MP Basics
- Merge the MP trees of columns 1 & 3: two MP trees
[Figure: MP trees for the individual columns and the two merged MP trees]

Example (N. Friedman)
- Aardvark: CAGGTA
- Bison: CAGACA
- Chimp: CGGGTA
- Dog: TGCACT
- Elephant: TGCGTA
[Figure: MP tree over Aardvark, Bison, Chimp, Dog, Elephant with inferred sequences at the internal nodes]
Example: Evolution of Protein Domains
[Figure: parsimony tree of protein domain evolution with its total cost]
- C. Chothia et al., Evolution of the Protein Repertoire, Science, vol. 300, 13 June 2003
- T. Przytycka et al., Graph Theoretical Insights..., RECOMB 2005, LNBI 3500, pp. 311-325, 2005

Single Site MP: The Fitch Algorithm

Problem
- Input: a tree T with labeled leaves
- Output: labels of the internal nodes of the MP tree + cost C

Step 1: Assign to each node x a set of labels $S(x)$, traversing T in postorder (leaves to root), starting with $C \leftarrow 0$
- If x is a leaf then $S(x) = \{\text{label of } x\}$
- If x has children y, z: if $S(y) \cap S(z) \ne \emptyset$ then $S(x) = S(y) \cap S(z)$; else $S(x) = S(y) \cup S(z)$ and $C \leftarrow C + 1$

Step 2: Assign to each node x a character value $v(x)$, traversing T in preorder (root to leaves)
- If y is the parent of x and $v(y) \in S(x)$ then $v(x) \leftarrow v(y)$; else $v(x)$ = any label from $S(x)$
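The two passes of the Fitch algorithm can be sketched as below. The tree encoding is an assumption made for this example: a leaf is its state string and an internal node is a pair of subtrees. Where Step 2 allows "any label", the sketch takes the alphabetically smallest one so that the output is deterministic.

```python
def fitch(tree):
    """Single-site Fitch parsimony.

    tree: a leaf is a state string (e.g. "A"); an internal node is (left, right).
    Returns (labeled_tree, cost), where a labeled internal node is
    (value, labeled_left, labeled_right) and a labeled leaf is its value.
    """
    cost = 0

    def up(node):
        """Postorder pass: compute the candidate set S(x) and count changes."""
        nonlocal cost
        if isinstance(node, str):
            return (frozenset([node]), None, None)
        left, right = up(node[0]), up(node[1])
        common = left[0] & right[0]
        if common:
            states = common
        else:
            states = left[0] | right[0]
            cost += 1          # an empty intersection forces one state change
        return (states, left, right)

    def down(annotated, parent_value):
        """Preorder pass: keep the parent's value when it is a candidate."""
        states, left, right = annotated
        value = parent_value if parent_value in states else min(states)
        if left is None:
            return value
        return (value, down(left, value), down(right, value))

    annotated = up(tree)
    return down(annotated, None), cost


if __name__ == "__main__":
    print(fitch((("A", "G"), ("A", "G"))))
```

For several characters, the function would be run once per alignment column and the costs summed, since each position is treated independently.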
Step 1: Computing Candidate Labels
[Figure: postorder pass on an example tree whose leaves are labeled A or G; each internal node shows its candidate set ({A} or {A, G}) and the running cost C]

Step 2: Selecting MP Labels
[Figure: preorder pass on the same tree, selecting one label from each candidate set]
Notes
- The algorithm is fast: O(nk), where n = number of nodes and k = number of character values
- It selects a particular MP tree (there may be others)
- Run it separately for each character, then merge the results
- It may be generalized to weighted parsimony (Sankoff's generalization: different costs for different changes)
[Figure: alternative MP labelings of the same tree]

Heuristic MP Algorithms
- Use Steiner-tree heuristic algorithms
- Branch-and-bound search
  o Represent the search space as a tree (nodes at the kth level represent phylogenetic trees for the first k species)
  o Find the best scoring search node and use it as a bound
  o Branch to the children of this search node
- Nearest neighbor interchange (NNI): switch subtrees
- Simulated annealing
III. Maximum Likelihood Approaches (based on N. Friedman's slides)

Key idea: compute the maximum likelihood tree
- Many models of changes (trees) can yield the observed data
- Compute the tree that maximizes the likelihood

Problem 1: given T, compute the probability $P(S \mid T)$
- $S = \{X_1, \dots, X_n\}$ are the observed sequences
- We need a probability model of the changes generated by T:
  o Background probabilities: $q(a)$
  o Mutation probabilities: $P(a \mid b, t)$

Problem 2: compute the T that maximizes $P(S \mid T)$
- This is the complex part
[Figure: tree with leaves $x_1, x_2, x_3$, internal nodes, and branch lengths $t_1, \dots, t_4$]
Tree Likelihood Computation
- Define $P(L_k \mid a)$ = probability of the subtree below node k given $x_k = a$
- Init: for all leaves k, $P(L_k \mid a) = 1$ if $x_k = a$, and 0 otherwise
- Iteration: if k is a node with children i and j, then
  $P(L_k \mid a) = \sum_{b,c} P(b \mid a, t_i)\, P(L_i \mid b)\, P(c \mid a, t_j)\, P(L_j \mid c)$
- Termination: the likelihood is
  $P(x_1, \dots, x_3 \mid T, t) = \sum_a P(L_{root} \mid a)\, q(a)$
[Figure: tree with leaves $x_1, x_2, x_3$ and branch lengths $t_1, \dots, t_4$]

Maximum Likelihood (ML)
- Score each tree by
  $P(X_1, \dots, X_n \mid T, t) = \prod_m P(x_1[m], \dots, x_n[m] \mid T, t)$
- This assumes independent positions
- Find the highest scoring tree:
  o Exhaustive search
  o Sampling methods (Metropolis)
  o Approximation (consider only a subset of trees)
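The recursion above can be sketched as follows. Several choices here are assumptions for illustration only: the Jukes-Cantor model supplies $P(b \mid a, t)$, the background probabilities default to uniform, and the tree is encoded with a leaf as its sequence string and an internal node as two (child, branch length) pairs.

```python
import math

ALPHABET = "ACGT"


def jc69(b, a, t):
    """P(b | a, t) under the Jukes-Cantor model (an assumed substitution model)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e


def subtree_likelihood(node, site):
    """Return {a: P(L_k | a)} for the subtree rooted at node, at one alignment site.

    A leaf is a sequence string; an internal node is ((child_i, t_i), (child_j, t_j)).
    """
    if isinstance(node, str):
        return {a: 1.0 if node[site] == a else 0.0 for a in ALPHABET}
    (child_i, t_i), (child_j, t_j) = node
    L_i = subtree_likelihood(child_i, site)
    L_j = subtree_likelihood(child_j, site)
    # The double sum over (b, c) factorizes into a product of two single sums.
    return {
        a: sum(jc69(b, a, t_i) * L_i[b] for b in ALPHABET)
        * sum(jc69(c, a, t_j) * L_j[c] for c in ALPHABET)
        for a in ALPHABET
    }


def tree_likelihood(tree, n_sites, q=None):
    """Multiply the per-site likelihoods, assuming independent positions."""
    q = q or {a: 0.25 for a in ALPHABET}
    total = 1.0
    for m in range(n_sites):
        L_root = subtree_likelihood(tree, m)
        total *= sum(L_root[a] * q[a] for a in ALPHABET)
    return total


if __name__ == "__main__":
    tree = ((("ACGT", 0.1), ("ACGA", 0.1)), 0.05), ("TCGA", 0.2)
    print(tree_likelihood(tree, n_sites=4))
```

For long alignments the product of per-site likelihoods underflows, so practical implementations sum log-likelihoods instead.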
Comparison (Tony Weisstein)

| Neighbor-joining | Maximum parsimony | Maximum likelihood |
|---|---|---|
| Uses only pairwise distances | Uses only shared derived characters | Uses all data |
| Minimizes distance between nearest neighbors | Minimizes total distance | Maximizes tree likelihood given specific parameter values |
| Very fast | Slow | Very slow |
| Easily trapped in local optima | Assumptions fail when evolution is rapid | Highly dependent on assumed evolution model |
| Good for generating a tentative tree, or choosing among multiple trees | Best option when tractable (<30 taxa) | Good for very small data sets and for testing trees built using other methods |

Conclusions
- Computing phylogeny is an area of active research
- Hundreds of algorithms
- New models: phylogenetic networks (generalize trees)
- New challenges: whole genome phylogeny
  o Account for multisite changes: replication, transpositions
  o New algorithms
- Applications
  o Epidemiology
  o Cancer diagnosis
Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and
More informationDNA Phylogeny. Signals and Systems in Biology Kushal EE, IIT Delhi
DNA Phylogeny Signals and Systems in Biology Kushal Shah @ EE, IIT Delhi Phylogenetics Grouping and Division of organisms Keeps changing with time Splitting, hybridization and termination Cladistics :
More informationWhat is Phylogenetics
What is Phylogenetics Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)
More informationPhylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz
Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels
More informationPhylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University
Phylogenetics: Building Phylogenetic Trees COMP 571  Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary
More informationLetter to the Editor. Department of Biology, Arizona State University
Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona
More informationIsolating  A New Resampling Method for Gene Order Data
Isolating  A New Resampling Method for Gene Order Data Jian Shi, William Arndt, Fei Hu and Jijun Tang Abstract The purpose of using resampling methods on phylogenetic data is to estimate the confidence
More informationClustering. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein. Some slides adapted from Jacques van Helden
Clustering Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Some slides adapted from Jacques van Helden Small vs. large parsimony A quick review Fitch s algorithm:
More informationCopyright 2000 N. AYDIN. All rights reserved. 1
Introduction to Bioinformatics Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr Multiple Sequence Alignment Outline Multiple sequence alignment introduction to msa methods of msa progressive global alignment
More informationSeuqence Analysis '17lecture 10. Trees types of trees Newick notation UPGMA Fitch Margoliash Distance vs Parsimony
Seuqence nalysis '17lecture 10 Trees types of trees Newick notation UPGM Fitch Margoliash istance vs Parsimony Phyogenetic trees What is a phylogenetic tree? model of evolutionary relationships  common
More informationReconstruction of certain phylogenetic networks from their treeaverage distances
Reconstruction of certain phylogenetic networks from their treeaverage distances Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu October 10,
More informationCopyright notice. Molecular Phylogeny and Evolution. Goals of the lecture. Introduction. Introduction. December 15, 2008
opyright notice Molecular Phylogeny and volution ecember 5, 008 ioinformatics J. Pevsner pevsner@kennedykrieger.org Many of the images in this powerpoint presentation are from ioinformatics and Functional
More informationOrganisatorische Details
Organisatorische Details Vorlesung: Di 1314, Do 1012 in DI 205 Übungen: Do 16:1518:00 Laborraum Schanzenstrasse Vorwiegend Programmieren in Matlab/Octave Teilnahme freiwillig. Übungsblätter jeweils
More informationTools and Algorithms in Bioinformatics
Tools and Algorithms in Bioinformatics GCBA815, Fall 2015 Week4 BLAST Algorithm Continued Multiple Sequence Alignment Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and
More informationWho Has Heard of This Problem? Courtesy: Jeremy Kun
P vs. NP 02201 Who Has Heard of This Problem? Courtesy: Jeremy Kun Runtime Analysis Last time, we saw that there is no solution to the Halting Problem. Halting Problem: Determine if a program will halt.
More information"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley
"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2009 University of California, Berkeley B.D. Mishler Jan. 22, 2009. Trees I. Summary of previous lecture: Hennigian
More informationPerfect Phylogenetic Networks with Recombination Λ
Perfect Phylogenetic Networks with Recombination Λ Lusheng Wang Dept. of Computer Sci. City Univ. of Hong Kong 83 Tat Chee Avenue Hong Kong lwang@cs.cityu.edu.hk Kaizhong Zhang Dept. of Computer Sci. Univ.
More informationMETHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.
Chapter 12 (Strikberger) Molecular Phylogenies and Evolution METHODS FOR DETERMINING PHYLOGENY In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Modern
More informationIntroduction to Bioinformatics Introduction to Bioinformatics
Dr. rer. nat. Gong Jing Cancer Research Center Medicine School of Shandong University 2012.11.09 1 Chapter 4 Phylogenetic Tree 2 Phylogeny Evidence from morphological ( 形态学的 ), biochemical, and gene sequence
More informationReconstructing Trees from Subtree Weights
Reconstructing Trees from Subtree Weights Lior Pachter David E Speyer October 7, 2003 Abstract The treemetric theorem provides a necessary and sufficient condition for a dissimilarity matrix to be a tree
More informationTHE THREESTATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2SAT
COMMUNICATIONS IN INFORMATION AND SYSTEMS c 2009 International Press Vol. 9, No. 4, pp. 295302, 2009 001 THE THREESTATE PERFECT PHYLOGENY PROBLEM REDUCES TO 2SAT DAN GUSFIELD AND YUFENG WU Abstract.
More informationUsing Phylogenomics to Predict Novel Fungal Pathogenicity Genes
Using Phylogenomics to Predict Novel Fungal Pathogenicity Genes David DeCaprio, Ying Li, Hung Nguyen (sequenced Ascomycetes genomes courtesy of the Broad Institute) Phylogenomics Combining whole genome
More informationInferring Molecular Phylogeny
r. Walter Salzburger The tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 2 1. Molecular Markers Inferring Molecular Phylogeny 3 Immunological comparisons! Nuttall
More informationMath 239: Discrete Mathematics for the Life Sciences Spring Lecture 14 March 11. Scribe/ Editor: Maria Angelica Cueto/ C.E.
Math 239: Discrete Mathematics for the Life Sciences Spring 2008 Lecture 14 March 11 Lecturer: Lior Pachter Scribe/ Editor: Maria Angelica Cueto/ C.E. Csar 14.1 Introduction The goal of today s lecture
More informationPhylogene)cs. IMBB 2016 BecA ILRI Hub, Nairobi May 9 20, Joyce Nzioki
Phylogene)cs IMBB 2016 BecA ILRI Hub, Nairobi May 9 20, 2016 Joyce Nzioki Phylogenetics The study of evolutionary relatedness of organisms. Derived from two Greek words:» Phle/Phylon: Tribe/Race» Genetikos:
More information