Lecture 10: Phylogeny
1 Computational Genomics
Prof. Ron Shamir & Prof. Roded Sharan, School of Computer Science, Tel Aviv University
Lecture 10: Phylogeny, 25, 27/12/12
2 Phylogeny
Slides: Adi Akavia; Nir Friedman's slides at HUJI (based on ALGMB 98); Anders Gorm Pedersen, Technical University of Denmark
Sources: Joe Felsenstein, Inferring Phylogenies (2004)
3 Phylogeny
Phylogeny: the ancestral relationships among a set of species, represented by a phylogenetic tree.
Leaves: contemporary species
Internal nodes: ancestral species
Branch length: distance between sequences
[Figure: tree with a leaf, a branch and an internal node marked]
8 Phylogeny schools: Classical vs. Modern
Classical: morphological characters
Modern: molecular sequences
9 Trees and Models
rooted / unrooted
topology / distance
binary / general
molecular clock
10 To root or not to root?
Unrooted tree: phylogeny without direction.
11 Rooting an Unrooted Tree
We can estimate the position of the root by introducing an outgroup: a species that is definitely most distant from all the species of interest.
[Figure: tree of Falcon, Aardvark, Bison, Chimp, Dog and Elephant with the proposed root marked]
12 A Scally et al. Nature 483, (2012) doi: /nature10842
13 HOW DO WE FIGURE OUT THESE TREES? TIMES?
14 Dangers of Paralogs
Sequence homology is caused by:
Orthologs: speciation
Paralogs: duplication
Xenologs: horizontal transfer (e.g., by virus)
Correct species tree: (1,(2,3))
[Figure: gene tree with a gene duplication followed by speciation events; leaves 1A, 2A, 3A, 3B, 2B, 1B]
15 Dangers of Paralogs
Correct species tree: (1,(2,3))
If we only consider 1A, 2B and 3A, we infer ((1,3),2)
[Figure: same gene tree; leaves 1A, 2A, 3A, 3B, 2B, 1B]
16 Type of Data
Distance-based
Input: matrix of distances between species
Distance can be the fraction of residues on which they disagree, the alignment score between them, ...
Character-based
Examine each character (e.g., residue) separately
17 Distance Based Methods
18 Tree based distances
d(i,j) = sum of arc lengths on the path between i and j
Given d, can we find an exactly matching tree? An approximately matching tree?
19 The Problem
Input: matrix d of distances between species
Goal: find a tree with leaves = species and edge lengths, matching d best.
Quality measure: sum of squares
$$SSQ(T) = \sum_{i} \sum_{j \ne i} w_{ij}\,(d_{ij} - t_{ij})^2$$
t_ij: distance in the tree
w_ij: pair weighting. Options: (1) 1 (2) 1/d_ij (3) 1/d_ij^2
NP-hard (Day 86). We'll describe common heuristics.
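A minimal sketch of the sum-of-squares score, assuming the input distances and the tree path lengths are both given as symmetric matrices (lists of lists). The function name `ssq` and the `power` argument are illustrative choices, not part of the lecture: `power` = 0, 1, 2 selects the three weighting options.

```python
def ssq(d, t, power=0):
    """Sum over ordered pairs i != j of w_ij * (d_ij - t_ij)^2,
    with w_ij = 1 / d_ij**power (power 0, 1 or 2)."""
    n = len(d)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                weight = 1.0 / d[i][j] ** power
                total += weight * (d[i][j] - t[i][j]) ** 2
    return total


# Toy example: two species whose input distance is 2 but whose tree path has length 3.
observed = [[0, 2], [2, 0]]
tree_dist = [[0, 3], [3, 0]]
print(ssq(observed, tree_dist, power=0))
print(ssq(observed, tree_dist, power=2))
```

Down-weighting by 1/d_ij or 1/d_ij^2 makes errors on large distances count less, which is one reason several weighting options are offered.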
20 UPGMA Clustering (Sokal & Michener 1958)
(Unweighted pair-group method with arithmetic mean)
Approach: form a tree; species that are closer according to the input distances should be closer in the tree
Build the tree bottom up, each time merging two smaller trees
All leaves are at the same distance from the root
21 Efficiency lemma
Approach: gradually form clusters (sets of species). Repeatedly identify two clusters and merge them.
For clusters C_i, C_j, define the distance between them to be the average distance between their members:
$$d(C_i, C_j) = \frac{1}{|C_i|\,|C_j|} \sum_{p \in C_i} \sum_{q \in C_j} d(p,q)$$
Lemma: if C_k is formed by merging C_i and C_j, then for every other cluster C_l
$$d(C_k, C_l) = \frac{|C_i|\, d(C_i, C_l) + |C_j|\, d(C_j, C_l)}{|C_i| + |C_j|}$$
Can update distances between clusters in time proportional to the number of clusters.
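The lemma follows directly from the definition of the average distance. A short derivation, assuming only that C_k is the disjoint union of C_i and C_j:

```latex
\begin{align*}
d(C_k, C_l)
  &= \frac{1}{|C_k|\,|C_l|} \sum_{p \in C_k} \sum_{q \in C_l} d(p,q) \\
  &= \frac{1}{(|C_i| + |C_j|)\,|C_l|}
     \Big( \sum_{p \in C_i} \sum_{q \in C_l} d(p,q)
         + \sum_{p \in C_j} \sum_{q \in C_l} d(p,q) \Big) \\
  &= \frac{|C_i|\,|C_l|\, d(C_i, C_l) + |C_j|\,|C_l|\, d(C_j, C_l)}
          {(|C_i| + |C_j|)\,|C_l|} \\
  &= \frac{|C_i|\, d(C_i, C_l) + |C_j|\, d(C_j, C_l)}{|C_i| + |C_j|}
\end{align*}
```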
22 UPGMA algorithm
Initialize:
each node is a cluster C_i = {i}
d(C_i, C_j) = d(i,j)
set height(i) = 0 for every i
Iterate:
Find C_i, C_j with smallest d(C_i, C_j)
Introduce a new cluster node C_k that replaces C_i and C_j
// C_k represents all the leaves in clusters C_i and C_j
Introduce a new tree node A_ij with height(A_ij) = d(C_i, C_j)/2
// d(C_i, C_j) is the average distance between leaves of C_i and leaves of C_j
Connect the corresponding tree nodes C_i, C_j to A_ij with
length(C_i, A_ij) = height(A_ij) - height(C_i)
length(C_j, A_ij) = height(A_ij) - height(C_j)
For all other C_l: d(C_k, C_l) = (|C_i| d(C_i, C_l) + |C_j| d(C_j, C_l)) / (|C_i| + |C_j|)
// distance to any old cluster is the average distance between its leaves and the leaves in C_i, C_j
Time: naive O(n^3); can show O(n^2 log n) (ex.) and O(n^2) (harder ex.)
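The steps above can be sketched in Python as follows. This is the naive O(n^3) variant, not the faster versions mentioned on the slide. The nested-tuple tree layout and the name `upgma` are assumptions made for this illustration only.

```python
def upgma(d, names):
    """Naive UPGMA. d: symmetric distance matrix (list of lists); names: leaf labels.
    Returns (tree, root_height); an internal node is a pair of
    (subtree, branch_length) tuples and a leaf is its name."""
    n = len(names)
    # cluster id -> (subtree, number of leaves, height of its tree node)
    clusters = {i: (names[i], 1, 0.0) for i in range(n)}
    # distances between current clusters, keyed by (smaller id, larger id)
    dist = {(i, j): d[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        (i, j), dij = min(dist.items(), key=lambda item: item[1])
        tree_i, size_i, height_i = clusters.pop(i)
        tree_j, size_j, height_j = clusters.pop(j)
        height = dij / 2.0
        merged = ((tree_i, height - height_i), (tree_j, height - height_j))
        # Efficiency lemma: size-weighted average of the two old distances.
        for l in clusters:
            d_il = dist.pop((min(i, l), max(i, l)))
            d_jl = dist.pop((min(j, l), max(j, l)))
            dist[(l, next_id)] = (size_i * d_il + size_j * d_jl) / (size_i + size_j)
        del dist[(i, j)]
        clusters[next_id] = (merged, size_i + size_j, height)
        next_id += 1
    (tree, _, root_height), = clusters.values()
    return tree, root_height


# Ultrametric toy input: A and B are at distance 2, C is at distance 6 from both.
tree, root_height = upgma([[0, 2, 6], [2, 0, 6], [6, 6, 0]], ["A", "B", "C"])
print(tree, root_height)
```

Every leaf-to-root path in the returned tree has the same total length, which is the molecular-clock assumption built into the method.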
23 UPGMA alg (2)
Orange nodes represent the groups of nodes that they replaced, and maintain the average distance of the set from the other leaf nodes/clusters.
[Figure: clusters C_12, C_45, C_123 with tree nodes A_12, A_123 at heights h_12, h_123]
24-28 UPGMA Algorithm
[Figures: step-by-step example]
29 Molecular Clock
UPGMA implicitly assumes the tree is ultrametric: equal leaf-root distances => common uniform clock
Works reasonably well for nearby species
30 Additivity
A weaker additivity assumption: distances between species are the sum of distances between intermediate nodes (even if the tree is not ultrametric).
For leaves i, j, k attached to a common internal node m by branches of lengths a, b, c:
$$d(i,j) = a + b, \qquad d(i,k) = a + c, \qquad d(j,k) = b + c$$
31 Consequences of Additivity
Suppose input distances are additive. For any three leaves i, j, k with branch point m:
$$d(i,j) = a + b, \qquad d(i,k) = a + c, \qquad d(j,k) = b + c$$
Thus
$$d(k,m) = \tfrac{1}{2}\,\big(d(i,k) + d(j,k) - d(i,j)\big) = c$$
32 Consequences of Additivity II
Suppose input distances are additive. For any four leaves i, j, k, l, where an internal edge of length b separates i, j from k, l and the induced tree has total weight W:
$$d(i,j) + d(k,l) = W - b$$
$$d(i,k) + d(j,l) = W + b$$
$$d(i,l) + d(j,k) = W + b$$
Largest sum must appear twice.
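The four-leaf consequence (the largest of the three pairwise sums appears twice) gives a direct test for additivity. A minimal sketch, assuming a symmetric distance matrix as a list of lists; the name `satisfies_four_point` and the tolerance `tol` are illustrative. The example matrix is an assumption, built from a tree with pendant branches A = 13, B = 4, C = 4, D = 10 and an internal branch of 4.

```python
from itertools import combinations


def satisfies_four_point(d, tol=1e-9):
    """Check, for every quadruple of leaves, that the two largest of the
    three pairwise sums are equal (up to tol)."""
    for i, j, k, l in combinations(range(len(d)), 4):
        sums = sorted([d[i][j] + d[k][l],
                       d[i][k] + d[j][l],
                       d[i][l] + d[j][k]])
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Distances read off the assumed tree (order A, B, C, D).
additive = [[0, 17, 21, 27],
            [17, 0, 12, 18],
            [21, 12, 0, 14],
            [27, 18, 14, 0]]
print(satisfies_four_point(additive))
```

Here the sums are 31, 39 and 39: the smallest sum identifies the neighbor pairs (A, B) and (C, D), and the gap of 8 equals twice the internal branch length.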
33 Consequences of Additivity III
If we can identify neighbor leaves, then we can use pairwise distances to reconstruct the tree:
Remove neighbors i, j from the leaf set
Add k (the node joining i and j)
Set $d_{km} = (d_{im} + d_{jm} - d_{ij})/2$ for every remaining leaf m
But can we find neighbor leaves?
34 Closest pairs may not be neighbors!
Closest pair: k, j, even though they are not neighbors
[Figure: four-leaf tree on i, j, k, l; two branch lengths of 4 are legible]
35 Neighbor Joining (Saitou-Nei 87)
Let
$$D(i,j) = d(i,j) - (r_i + r_j), \qquad \text{where } r_i = \frac{1}{|L| - 2} \sum_{k \in L} d(i,k)$$
r_i: corrected average distance of i from all other nodes
Theorem: if D(i,j) is minimum among all pairs of leaves, then i and j are neighbors in the tree
36 Neighbor Joining
Set L to contain all leaves
Iteration:
Choose i, j such that D(i,j) is minimum
Create new node k, and set
$$d(i,k) = \big(d(i,j) + r_i - r_j\big)/2$$
$$d(j,k) = \big(d(i,j) + r_j - r_i\big)/2$$
$$d(k,m) = \big(d(i,m) + d(j,m) - d(i,j)\big)/2$$
Remove i, j from L, and add k
Update r, D
Termination: when |L| = 2, connect the two nodes
Thm: optimal tree guaranteed if distances match a tree
Does not assume a clock
Time: O(n^3)
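The iteration can be sketched in Python as below, using the definitions of r_i and D(i,j) from the slides. The function name, the edge-list output and the internal node names `node0`, `node1`, ... are assumptions for illustration. The four-taxon matrix is also an assumption: it is chosen to be consistent with the branch lengths that appear in the worked example that follows (13, 4, 4, 10 and an internal branch of 4).

```python
def neighbor_joining(d, names):
    """Neighbor joining on a symmetric distance matrix d (list of lists).
    Returns a list of edges (node, node, length)."""
    dist = {a: {b: d[i][j] for j, b in enumerate(names) if b != a}
            for i, a in enumerate(names)}
    edges = []
    internal_count = 0
    while len(dist) > 2:
        leaves = list(dist)
        # r_i: corrected average distance of i from all other nodes in L
        r = {i: sum(dist[i].values()) / (len(leaves) - 2) for i in leaves}
        pairs = ((a, b) for idx, a in enumerate(leaves) for b in leaves[idx + 1:])
        i, j = min(pairs, key=lambda p: dist[p[0]][p[1]] - r[p[0]] - r[p[1]])
        k = f"node{internal_count}"  # hypothetical name for the new internal node
        internal_count += 1
        dij = dist[i][j]
        edges.append((i, k, (dij + r[i] - r[j]) / 2))
        edges.append((j, k, (dij + r[j] - r[i]) / 2))
        # distances from the new node k to every remaining node m
        new_row = {m: (dist[i][m] + dist[j][m] - dij) / 2
                   for m in leaves if m not in (i, j)}
        del dist[i], dist[j]
        for m, dkm in new_row.items():
            del dist[m][i], dist[m][j]
            dist[m][k] = dkm
        dist[k] = new_row
    a, b = dist  # termination: connect the two remaining nodes
    edges.append((a, b, dist[a][b]))
    return edges


example = [[0, 17, 21, 27],
           [17, 0, 12, 18],
           [21, 12, 0, 14],
           [27, 18, 14, 0]]
for edge in neighbor_joining(example, ["A", "B", "C", "D"]):
    print(edge)
```

With this input the corrected distances tie between the pairs (A, B) and (C, D); this sketch takes the first minimum it encounters, and either choice leads to the same tree.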
37-55 Neighbor Joining Algorithm: worked example (Center for Biological Sequence Analysis slides; here u_i plays the role of r_i)
Input distances D_ij:
     A    B    C    D
A    -   17   21   27
B         -   12   18
C              -   14
D                   -
Step 1: u_i = (sum of distances from i) / (n - 2)
u_A = (17+21+27)/2 = 32.5
u_B = (17+12+18)/2 = 23.5
u_C = (21+12+14)/2 = 23.5
u_D = (27+18+14)/2 = 29.5
Corrected distances D_ij - u_i - u_j:
A-B: -39, A-C: -35, A-D: -35, B-C: -35, B-D: -35, C-D: -39
Join C and D (a minimum entry) at a new node X:
v_C = 0.5 x 14 + 0.5 x (23.5 - 29.5) = 4
v_D = 0.5 x 14 + 0.5 x (29.5 - 23.5) = 10
Distances from X:
D_XA = (D_CA + D_DA - D_CD)/2 = (21 + 27 - 14)/2 = 17
D_XB = (D_CB + D_DB - D_CD)/2 = (12 + 18 - 14)/2 = 8
Step 2: reduced matrix
     A    B    X
A    -   17   17
B         -    8
X              -
u_A = (17+17)/1 = 34
u_B = (17+8)/1 = 25
u_X = (17+8)/1 = 25
Corrected distances D_ij - u_i - u_j:
A-B: -42, A-X: -42, B-X: -42
Join A and B at a new node Y:
v_A = 0.5 x 17 + 0.5 x (34 - 25) = 13
v_B = 0.5 x 17 + 0.5 x (25 - 34) = 4
D_YX = (D_AX + D_BX - D_AB)/2 = (17 + 8 - 17)/2 = 4
Step 3: only X and Y remain; connect them with a branch of length 4.
Result: branches A-Y = 13, B-Y = 4, Y-X = 4, X-C = 4, X-D = 10. The path lengths in this tree reproduce the input matrix.
56 Naruya Saitou
Division of Population Genetics, National Institute of Genetics & Department of Genetics, Graduate University for Advanced Studies (Sokendai), Mishima, Japan
57 Probabilistic approaches
58 Likelihood of a Tree
Given:
n aligned sequences M = X_1, ..., X_n
A tree T, leaves labeled with X_1, ..., X_n
Reconstruction t: labeling of internal nodes, branch lengths
Goal: find an optimal reconstruction t*, one maximizing the likelihood P(M | T, t*)
59 Likelihood (2)
We need a model for computing P(M | T, t*)
Assumptions:
Each character is independent
The branching is a Markov process: the probability that a node x has a specific label is only a function of the parent node y and the branch length t between them.
The probabilities P(x | y, t) are known
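60 Example
[Figure: tree on five nodes x_1, ..., x_5 with labeled branch lengths]
The likelihood is the root prior times one transition term per branch (four branches here):
$$P(x_1, \dots, x_5 \mid T, t^*) = P(x_{\text{root}}) \prod_{\text{branches } u \to v} P(x_v \mid x_u, t_{uv})$$
61 Calculating the Likelihood
Assume that the branch lengths t_uv are known.
$$P(M \mid T, t^*) = \prod_{\text{character } j} P(M_j \mid T, t^*) \qquad \text{(independence of sites)}$$
$$= \prod_{\text{character } j} \; \sum_{\text{reconstruction } R} P(M_j, R \mid T, t^*)$$
$$= \prod_{\text{character } j} \; \sum_{\text{reconstruction } R} \Big[ P(\text{root}) \prod_{\text{edge } u \to v} p_{u \to v}(t_{uv}) \Big] \qquad \text{(Markov property, independence of each branch)}$$
62 Additional Assumed Properties
Additivity:
$$\sum_b P(a \mid b, t)\, P(b \mid c, s) = P(a \mid c, s + t)$$
Reversibility (symmetry):
$$q_b\, P(a \mid b, t) = q_a\, P(b \mid a, t)$$
Allows one to freely move the root
Provable under broad and reasonable assumptions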
63 Jukes-Cantor Model (J-C 69)
Assumptions:
Each base in the sequence has an equal chance of changing
A base changes to each of the other 3 bases with equal probability
Characteristics:
Each base appears with equal frequency in DNA
The quantity r is the rate of change
NOTE: r is not the rate of mutation; it is the rate of substitution of a nucleotide throughout a population.
During each infinitesimal time interval Δt, a substitution occurs with probability 3rΔt
64 [Figure; legible label: a = r]
65 Jukes-Cantor Model
P_A(t): probability of A in the character at time t
Discrete case: the probability of a change A → B in one time unit is r
$$P_A(t+1) = (1 - 3r)\, P_A(t) + r\,(1 - P_A(t))$$
$$P_A(t+1) - P_A(t) = -3r\, P_A(t) + r\,(1 - P_A(t))$$
Continuous case:
$$P_A'(t) = -3r\, P_A(t) + r\,(1 - P_A(t)) = -4r\, P_A(t) + r$$
Solution to the differential equation (starting from P_A(0) = 1):
$$P_A(t) = \tfrac{1}{4}\,\big(1 + 3 e^{-4rt}\big)$$
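As a check, the stated solution can be substituted back into the continuous-case equation. The initial condition P_A(0) = 1 (the site starts as A) is the assumption under which the constant 3/4 arises:

```latex
\begin{align*}
P_A(t) &= \tfrac{1}{4}\big(1 + 3e^{-4rt}\big), \qquad P_A(0) = \tfrac{1}{4}(1 + 3) = 1, \\
P_A'(t) &= \tfrac{1}{4} \cdot 3 \cdot (-4r)\, e^{-4rt} = -3r\, e^{-4rt}, \\
-4r\, P_A(t) + r &= -r\big(1 + 3e^{-4rt}\big) + r = -3r\, e^{-4rt} = P_A'(t).
\end{align*}
```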
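66 Probability that the nucleotide remains unchanged over t time units:
$$P_{\text{same}} = \tfrac{1}{4} + \tfrac{3}{4}\, e^{-4rt}$$
Probability of a specific change:
$$P_{A \to B} = \tfrac{1}{4} - \tfrac{1}{4}\, e^{-4rt}$$
Probability of change:
$$P_{\text{change}} = \tfrac{3}{4} - \tfrac{3}{4}\, e^{-4rt}$$
Note: for t → ∞, P_change → 3/4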
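These closed forms are easy to evaluate numerically. A small sketch, with function names (`jc_prob`, `jc_matrix`) chosen for illustration; the values of r and t in the example are arbitrary.

```python
import math

BASES = "ACGT"


def jc_prob(x, y, t, r):
    """Jukes-Cantor probability that base x is replaced by base y after time t
    (substitution rate r towards each of the three other bases)."""
    decay = math.exp(-4.0 * r * t)
    if x == y:
        return 0.25 + 0.75 * decay   # P_same
    return 0.25 - 0.25 * decay       # P_{A -> B} for a specific B != A


def jc_matrix(t, r):
    """Full 4 x 4 transition table as a dict of dicts."""
    return {x: {y: jc_prob(x, y, t, r) for y in BASES} for x in BASES}


P = jc_matrix(t=0.5, r=0.1)
print(sum(P["A"].values()))   # each row is a probability distribution
print(1.0 - P["A"]["A"])      # P_change
```

The same table also satisfies the additivity property assumed earlier in the lecture: composing the matrices for times t and s gives the matrix for time s + t.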
67 Other Models
Kimura's 2-parameter model:
A, G: purines; C, T: pyrimidines
Two different rates:
purine-purine or pyrimidine-pyrimidine (transitions)
purine-pyrimidine or pyrimidine-purine (transversions)
Felsenstein 84 and Hasegawa, Kishino & Yano 85 extend the Kimura model to asymmetric base frequencies.
[Figure: substitution diagram over C, T, A, G]
68 Charles Cantor
Boston University: Professor, Biomedical Engineering; Professor of Pharmacology, School of Medicine; Center for Advanced Biotechnology
Ph.D., Biophysical Chemistry, University of California, Berkeley
Dr. Cantor is currently Chief Scientific Officer at Sequenom, Inc. in San Diego, California.
69 Efficient Likelihood Calculation (Felsenstein 73)
Use dynamic programming
Define S_j(a,v) = Pr(subtree rooted in v | v_j = a)
Initialization: for each leaf v, set S_j(a,v) = 1 if v is labeled by a, else S_j(a,v) = 0
Recursion: traverse the tree in postorder; for each node v with children u and w, and for each state x:
$$S_j(x,v) = \Big[ \sum_y S_j(y,u)\, p_{x \to y}(t_{vu}) \Big] \Big[ \sum_y S_j(y,w)\, p_{x \to y}(t_{vw}) \Big]$$
Final solution:
$$L = \prod_j \Big[ \sum_x S_j(x, \text{root})\, P(x) \Big]$$
Complexity: O(nmk^2) for n species, m chars, k states
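A minimal sketch of this dynamic program, under assumptions that are not part of the slide: the Jukes-Cantor model supplies the transition probabilities, the root prior is uniform (1/4 per base), and the tree is a nested tuple `(left, t_left, right, t_right)` whose leaves are sequence names. All identifiers are illustrative.

```python
import math

STATES = "ACGT"


def jc_prob(x, y, t, r=1.0):
    """Jukes-Cantor transition probability p_{x->y}(t)."""
    decay = math.exp(-4.0 * r * t)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay


def conditional(node, column, r=1.0):
    """Post-order pass for one character: returns {x: S(x, node)}."""
    if isinstance(node, str):  # leaf: indicator of the observed state
        return {x: 1.0 if column[node] == x else 0.0 for x in STATES}
    left, t_left, right, t_right = node
    s_left = conditional(left, column, r)
    s_right = conditional(right, column, r)
    return {x: sum(s_left[y] * jc_prob(x, y, t_left, r) for y in STATES)
               * sum(s_right[y] * jc_prob(x, y, t_right, r) for y in STATES)
            for x in STATES}


def likelihood(tree, alignment, r=1.0):
    """Product over characters of sum_x S_j(x, root) * P(x), with P(x) = 1/4."""
    n_chars = len(next(iter(alignment.values())))
    total = 1.0
    for j in range(n_chars):
        column = {name: seq[j] for name, seq in alignment.items()}
        s_root = conditional(tree, column, r)
        total *= sum(0.25 * s_root[x] for x in STATES)
    return total


# Three species; branch lengths are arbitrary illustration values.
tree = (("human", 0.1, "chimp", 0.1), 0.2, "mouse", 0.5)
alignment = {"human": "ACGT", "chimp": "ACGA", "mouse": "ATGA"}
print(likelihood(tree, alignment))
```

For long alignments the product underflows, so practical implementations usually accumulate log-likelihoods per character instead; the sketch keeps the plain product to mirror the formula.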
71 Finding Optimal Branch Lengths
Optimizing the length of a single root-child branch r-v can be done using standard optimization techniques:
$$\log L = \sum_{j=1}^{m} \log \sum_{x,y} p(x)\, S''_j(x,r)\, p_{x \to y}(t_{rv})\, S_j(y,v)$$
Under the symmetry assumption, each node can be made (temporarily) the root
To heuristically optimize all the branch lengths: repeatedly optimize one branch at a time
No guaranteed convergence, but often works
72 Character Based Methods
73 Inferring a Phylogenetic Tree
Generic problem: Optimal Phylogenetic Tree
Input: n species, a set of characters and, for each species, the state of each of the characters (parameters)
Goal: find a fully-labeled phylogenetic tree that best explains the data (maximizes a target function)
Assumptions:
characters are mutually independent
species evolve independently
Example input:
A: CAGGTA
B: CAGACA
C: CGGGTA
D: TGCACT
E: TGCGTA
74 A Simple Example
Five species, three have C and two T at a specified position.
A minimal tree has one evolutionary change.
[Figure: alternative trees over leaves labeled C, C, C, T, T]
75 Inferring a Phylogenetic Tree
Naive solution: enumeration
No. of non-isomorphic, labeled, binary, unrooted trees containing n leaves:
$$(2n-5)!! = \prod_{i=3}^{n} (2i - 5)$$
Rooted: multiply by (2n-3); for n = 20 this is about 10^21
A tree with n leaves has n-2 internal nodes and 2n-3 edges
Proof: induction on n. Each new species adds 2 new edges, and it can be attached on any of the existing edges.
[Figure: adding a 3rd species c to the tree on a, b]
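The counts grow quickly enough that enumeration is hopeless beyond small n. A short sketch that evaluates the product formula with exact integer arithmetic (function names are illustrative):

```python
def num_unrooted_trees(n):
    """(2n-5)!! = product over i = 3..n of (2i - 5): labeled binary unrooted trees."""
    if n < 3:
        raise ValueError("the formula applies for n >= 3 leaves")
    count = 1
    for i in range(3, n + 1):
        count *= 2 * i - 5
    return count


def num_rooted_trees(n):
    """A root can be placed on any of the 2n - 3 edges of an unrooted tree."""
    return num_unrooted_trees(n) * (2 * n - 3)


for n in (4, 5, 10, 20):
    print(n, num_unrooted_trees(n), num_rooted_trees(n))
```

For n = 20 the unrooted count is on the order of 10^20 and the rooted count on the order of 10^21 to 10^22.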
76 Parsimony
Goal: explain the data with the minimum number of evolutionary changes (mutations, or mismatches)
Parsimony score:
$$S(T) \equiv \sum_{\{v,u\} \in E(T)} \big|\{\, j : v_j \ne u_j \,\}\big|$$
Small parsimony problem:
Input: leaf sequences + a leaf-labeled tree T
Goal: find ancestral sequences implying the minimum number of changes (most parsimonious)
Large parsimony problem:
Input: leaf sequences
Goal: find a most parsimonious tree (topology, leaf labeling and internal sequences)
77 Algorithm for the Small Parsimony Problem (Fitch '71)
Initialization: scan T in postorder, assign:
leaf vertex m: S_m = {state at node m}
internal node m with children l, r:
S_m = S_l ∪ S_r if S_l ∩ S_r = ∅ (i)
S_m = S_l ∩ S_r otherwise (ii)
Solution construction: scan the tree in preorder, choose:
for the root, any x ∈ S_root
at node m with child k: give k the same state as m if possible; otherwise pick arbitrarily from S_k
cost = no. of case (i) events
time: O(nk) (k = number of states)
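Both passes can be sketched for a single character as follows. The tree encoding (nested pairs with leaf names as strings) and the function names are assumptions for illustration; ties in the top-down pass are broken with `min` only to make the output deterministic.

```python
def fitch_sets(node, leaf_state, sets):
    """Post-order pass: stores the candidate set of every node in `sets`
    (keyed by the node itself) and returns the number of case (i) events."""
    if isinstance(node, str):
        sets[node] = {leaf_state[node]}
        return 0
    left, right = node
    cost = fitch_sets(left, leaf_state, sets) + fitch_sets(right, leaf_state, sets)
    common = sets[left] & sets[right]
    if common:                      # case (ii): intersection, no change needed
        sets[node] = common
    else:                           # case (i): union, one change
        sets[node] = sets[left] | sets[right]
        cost += 1
    return cost


def fitch_assign(node, sets, parent_state, labels):
    """Pre-order pass: keep the parent's state when it is a candidate."""
    if parent_state in sets[node]:
        state = parent_state
    else:
        state = min(sets[node])     # arbitrary choice, made deterministic
    labels[node] = state
    if not isinstance(node, str):
        for child in node:
            fitch_assign(child, sets, state, labels)


# Five species, three with C and two with T at the position of interest.
leaf_state = {"a": "C", "b": "C", "c": "C", "d": "T", "e": "T"}
tree = ((("a", "b"), "c"), ("d", "e"))
sets, labels = {}, {}
cost = fitch_sets(tree, leaf_state, sets)
fitch_assign(tree, sets, None, labels)
print(cost, labels[tree])
```

On this topology the three C leaves group together, so a single change suffices; a topology that interleaves C and T leaves, such as (("a", "d"), (("b", "e"), "c")), needs two.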
78 Fitch's Alg
[Figure: worked example with candidate sets {T,C} and {A,T} at internal nodes and changes T→C, T→A]
79 Walter Fitch
May 21, March 10, 2011
One of the most influential evolutionary biologists in the world, who established a new scientific field: molecular phylogenetics. He was a member of the National Academy of Sciences, the American Academy of Arts and Sciences and the American Philosophical Society. He co-founded and was the first president of the Society for Molecular Biology and Evolution, which established the annually awarded Fitch Prize. Additionally, he was a founding editor of Molecular Biology and Evolution. Fitch was at the University of California, Irvine, until his death, preceded by three years at the University of Southern California and 24 years at the University of Wisconsin-Madison.
80 Weighted Parsimony
k states S_1, ..., S_k
C_ij = cost of changing from state i to j
Algorithm (Sankoff-Cedergren 88):
Need: S_k(x), the best cost for the subtree rooted at x if the state at x is k
For a leaf x: S_k(x) = 0 if the state of x is k, ∞ otherwise
Scan T in postorder. At node a with children l, r:
$$S_k(a) = \min_m \big(S_m(l) + C_{mk}\big) + \min_m \big(S_m(r) + C_{mk}\big)$$
Opt = min_m S_m(root)
time: O(nk^2)
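The recursion translates almost line by line into code. A minimal single-character sketch, assuming a nested-pair tree with leaf names as strings and a cost table given as a dict of dicts; all names are illustrative.

```python
import math


def sankoff(node, leaf_state, alphabet, cost):
    """Post-order pass: returns {k: best cost of the subtree at `node`
    when the state at `node` is k}."""
    if isinstance(node, str):
        return {k: 0.0 if leaf_state[node] == k else math.inf for k in alphabet}
    left = sankoff(node[0], leaf_state, alphabet, cost)
    right = sankoff(node[1], leaf_state, alphabet, cost)
    return {k: min(left[m] + cost[m][k] for m in alphabet)
               + min(right[m] + cost[m][k] for m in alphabet)
            for k in alphabet}


def weighted_parsimony(tree, leaf_state, alphabet, cost):
    """Opt = minimum over states of the root's table."""
    return min(sankoff(tree, leaf_state, alphabet, cost).values())


alphabet = "ACGT"
# Unit costs reduce weighted parsimony to ordinary (Fitch) parsimony.
unit = {a: {b: 0.0 if a == b else 1.0 for b in alphabet} for a in alphabet}
leaf_state = {"a": "C", "b": "C", "c": "C", "d": "T", "e": "T"}
tree = ((("a", "b"), "c"), ("d", "e"))
print(weighted_parsimony(tree, leaf_state, alphabet, unit))
```

Each internal node fills k entries, each requiring two minimizations over k states, which gives the O(nk^2) bound on the slide.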
82 David Sankoff
Over the past 30 years, Sankoff formulated and contributed to many of the fundamental problems in computational biology. In sequence comparison, he introduced the quadratic version of the Needleman-Wunsch algorithm, developed the first statistical test for alignments, initiated the study of the limit behavior of random sequences with Vaclav Chvatal and described the multiple alignment problem, based on minimum evolution over a phylogenetic tree. In the study of RNA secondary structure, he developed algorithms based on general energy functions for multiple loops and for simultaneous folding and alignment, and performed the earliest studies of parametric folding and automated phylogenetic filtering. Sankoff and Robert Cedergren collaborated on the first studies of the evolution of the genetic code based on tRNA sequences. His contributions to phylogenetics include early models for horizontal transfer, a general approach for optimizing the nodes of a given tree, a method for rapid bootstrap calculations, a generalization of the nearest neighbor interchange heuristic, various constraint, consensus and supertree problems, the computational complexity of several phylogeny problems with William Day, and a general technique for phylogenetic invariants with Vincent Ferretti. Over the last fifteen years he has focused on the evolution of genomes as the result of chromosomal rearrangement processes. Here he introduced the computational analysis of genomic edit distances, including parametric versions, the distribution of gene numbers in conserved segments in a random model with Joseph Nadeau, phylogeny based on gene order with Mathieu Blanchette and David Bryant, generalizations to include multigene families, including algorithms for analyzing genome duplication and hybridization with Nadia El-Mabrouk, and the statistical analysis of gene clusters with Dannie Durand.
Sankoff is also well known in linguistics for his methods of studying grammatical variation and change in speech communities, the quantification of discourse analysis and production models of bilingual speech.
83 Large Parsimony Problem
Input: n x m matrix M
M_ij = state of the j-th character of species i
M_i = label of i (all labels are distinct)
Goal: construct a phylogenetic tree T with n leaves and a label for each node, such that
there is a 1-1 correspondence between leaves and labels
the cost of the tree is minimum
NP-hard
84 Branch & Bound (Hendy-Penny 89)
Enumerate all unrooted trees with an increasing no. of leaves
Note: cost of tree with all leaves ≥ cost of subtree with some leaves pruned (and same labeling)
=> If cost of subtree ≥ best cost for full tree so far, then we can prune (ignore) all refinements of the subtree.
Enumeration & pruning can be done in O(1) time per visited subtree.
85 Branch swapping
Each internal edge defines 4 sub-trees; we can swap two such non-adjacent sub-trees.
[Figure: subtrees A, B, C, D around an internal edge, before and after a swap]
86 Nearest Neighbor Interchanges
Handles n-labeled trees
T and T' are neighbors if one can get T' by the following operation on T:
[Figure: subtrees S, R, U, V around an internal edge; the three possible groupings are SR|UV, SU|RV and SV|RU]
Use the neighborhood structure on the set of solutions (all trees) via hill climbing, annealing, other heuristics...
87 NN interchange graph
[Figure]
88 [Figure from Wilson & Cann, Sci Amer 92]
89 FIN
More informationLecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/26
Lecture 27. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 27. Phylogeny methods, part 4 (Models of DNA and
More informationEstimating Phylogenies (Evolutionary Trees) II. Biol4230 Thurs, March 2, 2017 Bill Pearson Jordan 6-057
Estimating Phylogenies (Evolutionary Trees) II Biol4230 Thurs, March 2, 2017 Bill Pearson wrp@virginia.edu 4-2818 Jordan 6-057 Tree estimation strategies: Parsimony?no model, simply count minimum number
More informationPhylogenetic Networks, Trees, and Clusters
Phylogenetic Networks, Trees, and Clusters Luay Nakhleh 1 and Li-San Wang 2 1 Department of Computer Science Rice University Houston, TX 77005, USA nakhleh@cs.rice.edu 2 Department of Biology University
More informationSubstitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A
GAGATC 3:G A 6:C T Common Ancestor ACGATC 1:A G 2:C A Substitution = Mutation followed 5:T C by Fixation GAAATT 4:A C 1:G A AAAATT GAAATT GAGCTC ACGACC Chimp Human Gorilla Gibbon AAAATT GAAATT GAGCTC ACGACC
More informationEffects of Gap Open and Gap Extension Penalties
Brigham Young University BYU ScholarsArchive All Faculty Publications 200-10-01 Effects of Gap Open and Gap Extension Penalties Hyrum Carroll hyrumcarroll@gmail.com Mark J. Clement clement@cs.byu.edu See
More informationTools and Algorithms in Bioinformatics
Tools and Algorithms in Bioinformatics GCBA815, Fall 2015 Week-4 BLAST Algorithm Continued Multiple Sequence Alignment Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and
More informationDNA Phylogeny. Signals and Systems in Biology Kushal EE, IIT Delhi
DNA Phylogeny Signals and Systems in Biology Kushal Shah @ EE, IIT Delhi Phylogenetics Grouping and Division of organisms Keeps changing with time Splitting, hybridization and termination Cladistics :
More informationPhylogenies Scores for Exhaustive Maximum Likelihood and Parsimony Scores Searches
Int. J. Bioinformatics Research and Applications, Vol. x, No. x, xxxx Phylogenies Scores for Exhaustive Maximum Likelihood and s Searches Hyrum D. Carroll, Perry G. Ridge, Mark J. Clement, Quinn O. Snell
More informationPage 1. Evolutionary Trees. Why build evolutionary tree? Outline
Page Evolutionary Trees Russ. ltman MI S 7 Outline. Why build evolutionary trees?. istance-based vs. character-based methods. istance-based: Ultrametric Trees dditive Trees. haracter-based: Perfect phylogeny
More informationMOLECULAR EVOLUTION AND PHYLOGENETICS SERGEI L KOSAKOVSKY POND CSE/BIMM/BENG 181 MAY 27, 2011
MOLECULAR EVOLUTION AND PHYLOGENETICS If we could observe evolution: speciation, mutation, natural selection and fixation, we might see something like this: AGTAGC GGTGAC AGTAGA CGTAGA AGTAGA A G G C AGTAGA
More informationLetter to the Editor. Department of Biology, Arizona State University
Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona
More informationLecture 4. Models of DNA and protein change. Likelihood methods
Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/36
More informationMolecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016
Molecular phylogeny - Using molecular sequences to infer evolutionary relationships Tore Samuelsson Feb 2016 Molecular phylogeny is being used in the identification and characterization of new pathogens,
More informationConsistency Index (CI)
Consistency Index (CI) minimum number of changes divided by the number required on the tree. CI=1 if there is no homoplasy negatively correlated with the number of species sampled Retention Index (RI)
More informationTheory of Evolution. Charles Darwin
Theory of Evolution harles arwin 858-59: Origin of Species 5 year voyage of H.M.S. eagle (8-6) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties
More informationReconstruire le passé biologique modèles, méthodes, performances, limites
Reconstruire le passé biologique modèles, méthodes, performances, limites Olivier Gascuel Centre de Bioinformatique, Biostatistique et Biologie Intégrative C3BI USR 3756 Institut Pasteur & CNRS Reconstruire
More informationSequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment
Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the log-likelihood ratio of
More informationAN EXACT SOLVER FOR THE DCJ MEDIAN PROBLEM
AN EXACT SOLVER FOR THE DCJ MEDIAN PROBLEM MENG ZHANG College of Computer Science and Technology, Jilin University, China Email: zhangmeng@jlueducn WILLIAM ARNDT AND JIJUN TANG Dept of Computer Science
More informationLecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).
1 Bioinformatics: In-depth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff
More informationReading for Lecture 13 Release v10
Reading for Lecture 13 Release v10 Christopher Lee November 15, 2011 Contents 1 Evolutionary Trees i 1.1 Evolution as a Markov Process...................................... ii 1.2 Rooted vs. Unrooted Trees........................................
More informationEarly History up to Schedule. Proteins DNA & RNA Schwann and Schleiden Cell Theory Charles Darwin publishes Origin of Species
Schedule Bioinformatics and Computational Biology: History and Biological Background (JH) 0.0 he Parsimony criterion GKN.0 Stochastic Models of Sequence Evolution GKN 7.0 he Likelihood criterion GKN 0.0
More informationPhylogenetics: Parsimony and Likelihood. COMP Spring 2016 Luay Nakhleh, Rice University
Phylogenetics: Parsimony and Likelihood COMP 571 - Spring 2016 Luay Nakhleh, Rice University The Problem Input: Multiple alignment of a set S of sequences Output: Tree T leaf-labeled with S Assumptions
More informationSara C. Madeira. Universidade da Beira Interior. (Thanks to Ana Teresa Freitas, IST for useful resources on this subject)
Bioinformática Sequence Alignment Pairwise Sequence Alignment Universidade da Beira Interior (Thanks to Ana Teresa Freitas, IST for useful resources on this subject) 1 16/3/29 & 23/3/29 27/4/29 Outline
More informationmolecular evolution and phylogenetics
molecular evolution and phylogenetics Charlotte Darby Computational Genomics: Applied Comparative Genomics 2.13.18 https://www.thinglink.com/scene/762084640000311296 Internal node Root TIME Branch Leaves
More informationThe Generalized Neighbor Joining method
The Generalized Neighbor Joining method Ruriko Yoshida Dept. of Mathematics Duke University Joint work with Dan Levy and Lior Pachter www.math.duke.edu/ ruriko data mining 1 Challenge We would like to
More informationA Phylogenetic Network Construction due to Constrained Recombination
A Phylogenetic Network Construction due to Constrained Recombination Mohd. Abdul Hai Zahid Research Scholar Research Supervisors: Dr. R.C. Joshi Dr. Ankush Mittal Department of Electronics and Computer
More informationHow to read and make phylogenetic trees Zuzana Starostová
How to read and make phylogenetic trees Zuzana Starostová How to make phylogenetic trees? Workflow: obtain DNA sequence quality check sequence alignment calculating genetic distances phylogeny estimation
More informationAlgorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004,
Tracing the Evolution of Numerical Phylogenetics: History, Philosophy, and Significance Adam W. Ferguson Phylogenetic Systematics 26 January 2009 Inferring Phylogenies Historical endeavor Darwin- 1837
More informationTHEORY. Based on sequence Length According to the length of sequence being compared it is of following two types
Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between
More informationA Minimum Spanning Tree Framework for Inferring Phylogenies
A Minimum Spanning Tree Framework for Inferring Phylogenies Daniel Giannico Adkins Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-010-157
More informationRecent Advances in Phylogeny Reconstruction
Recent Advances in Phylogeny Reconstruction from Gene-Order Data Bernard M.E. Moret Department of Computer Science University of New Mexico Albuquerque, NM 87131 Department Colloqium p.1/41 Collaborators
More informationInferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT
Inferring phylogeny Constructing phylogenetic trees Tõnu Margus Contents What is phylogeny? How/why it is possible to infer it? Representing evolutionary relationships on trees What type questions questions
More information(Stevens 1991) 1. morphological characters should be assumed to be quantitative unless demonstrated otherwise
Bot 421/521 PHYLOGENETIC ANALYSIS I. Origins A. Hennig 1950 (German edition) Phylogenetic Systematics 1966 B. Zimmerman (Germany, 1930 s) C. Wagner (Michigan, 1920-2000) II. Characters and character states
More information3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT
3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT.03.239 25.09.2012 SEQUENCE ANALYSIS IS IMPORTANT FOR... Prediction of function Gene finding the process of identifying the regions of genomic DNA that encode
More information66 Bioinformatics I, WS 09-10, D. Huson, December 1, Evolutionary tree of organisms, Ernst Haeckel, 1866
66 Bioinformatics I, WS 09-10, D. Huson, December 1, 2009 5 Phylogeny Evolutionary tree of organisms, Ernst Haeckel, 1866 5.1 References J. Felsenstein, Inferring Phylogenies, Sinauer, 2004. C. Semple
More informationX X (2) X Pr(X = x θ) (3)
Notes for 848 lecture 6: A ML basis for compatibility and parsimony Notation θ Θ (1) Θ is the space of all possible trees (and model parameters) θ is a point in the parameter space = a particular tree
More informationarxiv: v1 [q-bio.pe] 1 Jun 2014
THE MOST PARSIMONIOUS TREE FOR RANDOM DATA MAREIKE FISCHER, MICHELLE GALLA, LINA HERBST AND MIKE STEEL arxiv:46.27v [q-bio.pe] Jun 24 Abstract. Applying a method to reconstruct a phylogenetic tree from
More informationCS5263 Bioinformatics. Guest Lecture Part II Phylogenetics
CS5263 Bioinformatics Guest Lecture Part II Phylogenetics Up to now we have focused on finding similarities, now we start focusing on differences (dissimilarities leading to distance measures). Identifying
More informationCopyright 2000 N. AYDIN. All rights reserved. 1
Introduction to Bioinformatics Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr Multiple Sequence Alignment Outline Multiple sequence alignment introduction to msa methods of msa progressive global alignment
More informationProperties of normal phylogenetic networks
Properties of normal phylogenetic networks Stephen J. Willson Department of Mathematics Iowa State University Ames, IA 50011 USA swillson@iastate.edu August 13, 2009 Abstract. A phylogenetic network is
More informationPhylogenetics: Likelihood
1 Phylogenetics: Likelihood COMP 571 Luay Nakhleh, Rice University The Problem 2 Input: Multiple alignment of a set S of sequences Output: Tree T leaf-labeled with S Assumptions 3 Characters are mutually
More informationPhylogenetic analyses. Kirsi Kostamo
Phylogenetic analyses Kirsi Kostamo The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among different groups (individuals, populations, species,
More informationBioinformatics tools for phylogeny and visualization. Yanbin Yin
Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and
More informationNJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees
NJMerge: A generic technique for scaling phylogeny estimation methods and its application to species trees Erin Molloy and Tandy Warnow {emolloy2, warnow}@illinois.edu University of Illinois at Urbana
More informationA PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS
A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS CRYSTAL L. KAHN and BENJAMIN J. RAPHAEL Box 1910, Brown University Department of Computer Science & Center for Computational Molecular Biology
More informationC3020 Molecular Evolution. Exercises #3: Phylogenetics
C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from
More informationarxiv: v5 [q-bio.pe] 24 Oct 2016
On the Quirks of Maximum Parsimony and Likelihood on Phylogenetic Networks Christopher Bryant a, Mareike Fischer b, Simone Linz c, Charles Semple d arxiv:1505.06898v5 [q-bio.pe] 24 Oct 2016 a Statistics
More informationBioinformatics 1 -- lecture 9. Phylogenetic trees Distance-based tree building Parsimony
ioinformatics -- lecture 9 Phylogenetic trees istance-based tree building Parsimony (,(,(,))) rees can be represented in "parenthesis notation". Each set of parentheses represents a branch-point (bifurcation),
More informationThanks to Paul Lewis and Joe Felsenstein for the use of slides
Thanks to Paul Lewis and Joe Felsenstein for the use of slides Review Hennigian logic reconstructs the tree if we know polarity of characters and there is no homoplasy UPGMA infers a tree from a distance
More informationProbabilistic modeling and molecular phylogeny
Probabilistic modeling and molecular phylogeny Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU) What is a model? Mathematical
More information