Lecture 6 Phylogenetic Inference
From Darwin's notebook in 1837
Charles Darwin: from The Origin in 1859. Willi Hennig: Cladistics.
Phylogenetic inference
Willi Hennig, Cladistics
1. Clade, monophyletic group, natural group
   a. All individuals in the clade are derived from a single ancestor
   b. This ancestor's descendants are all in the clade
Monophyletic groups
[Figure: tree from a common ancestor to coelacanths, lungfishes, and tetrapods; group labels: Sarcopterygians, Fishes]
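The two conditions in the definition of a clade can be checked mechanically on a small tree. The following is a minimal Python sketch, assuming the commonly drawn topology in which lungfishes are the sister group of tetrapods and coelacanths branch off first. The nested-tuple encoding and the function names are illustrative and are not part of the lecture.

```python
# Assumed topology (not spelled out on the slide):
# (Coelacanths, (Lungfishes, Tetrapods))
TREE = ("Coelacanths", ("Lungfishes", "Tetrapods"))

def leaves(node):
    """Return the set of taxa descending from a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))

def clades(node):
    """Yield the leaf set of every node in the tree (every clade)."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)

def is_monophyletic(tree, group):
    """A group is monophyletic if it is exactly the leaf set of some node."""
    return frozenset(group) in set(clades(tree))

print(is_monophyletic(TREE, {"Lungfishes", "Tetrapods"}))    # True
print(is_monophyletic(TREE, {"Coelacanths", "Lungfishes"}))  # False
```

Under this assumed tree, a "fishes" group made of coelacanths and lungfishes fails the test because it leaves out the tetrapods, which descend from the same ancestor.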
Phylogenetic inference
Definitions:
2. Ancestral vs. derived characters
[Figure: tree of taxa A, B, C, D]
Phylogenetic inference
Definitions:
Apomorphy: derived character
3. Synapomorphy: shared derived character
[Figure: tree of taxa A, B, C, D with an apomorphy and a synapomorphy marked]
Phylogenetic inference
Definitions:
4. Evolutionary reversal
Phylogenetic inference
5. Homoplasy, convergent evolution
[Photos: Fossa, Madagascar, Mongoose; Mountain Lion, California, Cat; Thylacine, Tasmania, Marsupial]
Phylogenetic inference
6. Parallel evolution
Phylogenetic Inference
Phylogenetic trees are built from characters.
Characters can be morphological, behavioral, physiological, or molecular.
There are two important assumptions about the characters used to build trees:
1. They are independent.
2. They are homologous.
What is a homologous character?
A homologous character is shared by two species because it was inherited from a common ancestor.
A character that is possessed by two species but was not present in their most recent common ancestor is said to exhibit homoplasy.
Types of homoplasy:
1. Convergent evolution. Example: evolution of eyes, flight.
Examples of convergent evolution
Convergent evolution between placental and marsupial mammals
Types of homoplasy:
1. Convergent evolution. Example: evolution of eyes, flight.
2. Parallel evolution. Example: lactose tolerance in humans.
What is the difference between convergent and parallel evolution?

                     Convergent                 Parallel
Species compared     distantly related          closely related
Trait produced by    different genes/           same genes/
                     developmental pathways     developmental pathways
Types of homoplasy:
1. Convergent evolution. Example: evolution of eyes, flight.
2. Parallel evolution. Example: lactose tolerance in human adults.
3. Evolutionary reversals. Example: back mutations at the DNA sequence level (C -> A -> C).
Phylogenetic reconstructions
1. Phenetics (Neighbor-Joining)
2. Cladistics (Maximum Parsimony)
3. Statistics (Maximum Likelihood)
Phylogenetic reconstructions
Phenetics (Distance Methods)

Sequences:
A   ATGTTGCCA
B   AAGTTGCCA
C   ATCAACCCA
D   CTCAACTTA

Pairwise differences:
     A   B   C
B    1
C    4   5
D    7   8   4
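Phylogenetic reconstructions
Phenetics (Distance Methods)

     A   B   C
B    1
C    4   5
D    7   8   4

(A,B) = 1
(A,B)C = (4+5)/2 = 4.5
(A,B)D = (7+8)/2 = 7.5
(A,B,C)D = (7+8+4)/3 = 6.3

[Figure: tree ((A,B),C),D with branch lengths 0.5 (to A and to B), 1.75 (from the (A,B) node to the (A,B,C) node), 2.25 (to C), 0.9 (from the (A,B,C) node to the root), 3.15 (to D)]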
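The averaging arithmetic above can be reproduced in a few lines. This is a minimal Python sketch that takes the pairwise differences exactly as given in the matrix; the helper names `average_distance` and `node_height` are illustrative assumptions, and the sketch covers only the averaging step, not a full tree-building program.

```python
from itertools import product

# Pairwise differences copied from the distance matrix on the slide.
DIST = {
    frozenset("AB"): 1,
    frozenset("AC"): 4, frozenset("BC"): 5,
    frozenset("AD"): 7, frozenset("BD"): 8, frozenset("CD"): 4,
}

def average_distance(group1, group2, dist=DIST):
    """Mean of all pairwise distances between members of two groups."""
    pairs = list(product(group1, group2))
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)

def node_height(group1, group2, dist=DIST):
    """Height of the node joining two groups: half their average distance."""
    return average_distance(group1, group2, dist) / 2

print(average_distance("AB", "C"))   # 4.5
print(average_distance("AB", "D"))   # 7.5
print(average_distance("ABC", "D"))  # 6.333..., rounded to 6.3 on the slide
print(node_height("A", "B"))         # 0.5
print(node_height("AB", "C"))        # 2.25
```

Halving each average gives the depth of the corresponding node, which is where the branch lengths 0.5 and 2.25 in the figure come from.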
Phylogenetic reconstructions
Cladistics: Maximum Parsimony Method
[Figure: alternative trees for taxa A, B, C, D with character states G and A mapped onto the tips; tree lengths shown: 1 step, 3 steps, 3 steps]
Phylogenetic reconstructions
Cladistics: Maximum Parsimony
Number of possible trees

Number of taxa   Number of rooted trees   Number of unrooted trees
4                15                       3
7                10,395                   945
10               34,459,425               2,027,025
How do we select the best tree?

No. of taxa   No. of possible trees
4             3
5             15
6             105
7             945
10            2 x 10^6
11            34 x 10^6
50            3 x 10^74
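The counts in these tables follow the standard formulas for fully bifurcating trees: (2n - 5)!! unrooted trees and (2n - 3)!! rooted trees for n taxa, where !! is the double factorial. A minimal Python sketch (the function names are illustrative assumptions):

```python
def double_factorial(k):
    """Product k * (k - 2) * (k - 4) * ... down to 1; returns 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def unrooted_trees(n_taxa):
    """Number of unrooted bifurcating trees, (2n - 5)!!, for n >= 3."""
    return double_factorial(2 * n_taxa - 5)

def rooted_trees(n_taxa):
    """Number of rooted bifurcating trees, (2n - 3)!!, for n >= 2."""
    return double_factorial(2 * n_taxa - 3)

for n in (4, 5, 6, 7, 10, 11):
    print(n, unrooted_trees(n), rooted_trees(n))
```

Adding one taxon to an unrooted tree multiplies the count by the number of branches the new taxon can attach to, which is why the totals grow so quickly and why an exhaustive search is impractical for more than a handful of taxa.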
Independent gain of camera eye requires two changes
Evolution and loss of camera eye requires six changes
Phylogenetic reconstructions
Phenetics (Distance Methods)
What principles do pheneticists use to construct phylogenies?
1. The tree should reflect overall degree of similarity.
2. The tree should be based on as many characters as possible.
3. The tree should minimize the distance between taxa.
Phylogenetic reconstructions
Cladistics
1. The tree should reflect the true phylogeny.
2. The phylogeny should be based on characters that are shared (by more than one taxon) and derived (from some known ancestral state).
3. The ancestral states of characters are inferred from an outgroup that roots the tree.
   - An outgroup is ideally picked from fossil evidence, i.e., it belongs to a genus or family that existed prior to the taxa forming the ingroup.
Each subspecies of seaside sparrow has a restricted range.
[Map: Atlantic coast: maritima, macgillivraii, nigrescens; Gulf coast: juncicola, fisheri, peninsulae]
The subspecies separate into two groups when DNA sequences are compared.
Atlantic coast: maritima, macgillivraii, nigrescens
Gulf coast: peninsulae, juncicola, fisheri
How do distance trees differ from cladograms?

                  Distance trees         Cladograms
Characters used   as many as possible    synapomorphies only
Monophyly         not required           absolute requirement
Emphasis          branch lengths         branch-splitting
Outgroup          not required           absolute requirement
Phylogenetic reconstructions 3. Statistics (Maximum Likelihood) The only method based on a mutation model!
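Phylogenetic reconstructions
3. Maximum Likelihood
Jukes-Cantor Model
[Diagram: nucleotides A, G, C, T; every substitution between two nucleotides occurs at rate α]
pan = 3α
Phylogenetic reconstructions
3. Maximum Likelihood
[Diagram: Jukes-Cantor Model (all substitutions at rate α) beside Kimura 2-parameter Model (transitions A<->G and C<->T at rate α, transversions at rate β)]
Phylogenetic reconstructions
3. Maximum Likelihood
Kimura 2-parameter Model
[Diagram: nucleotides A, G, C, T; transitions A<->G and C<->T at rate α, transversions at rate β]
pan = α + 2β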
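The two totals, 3α and α + 2β, can be checked by summing the rates of all substitutions away from one nucleotide. A minimal Python sketch, assuming the usual convention that A<->G and C<->T are transitions (rate α) and all other changes are transversions (rate β); the Jukes-Cantor model is treated here as the special case β = α. Function names are illustrative.

```python
BASES = "AGCT"
PURINES = {"A", "G"}  # C and T are pyrimidines

def is_transition(x, y):
    """Purine<->purine or pyrimidine<->pyrimidine change."""
    return (x in PURINES) == (y in PURINES)

def rate(x, y, alpha, beta):
    """Kimura 2-parameter rate for the substitution x -> y (x != y)."""
    return alpha if is_transition(x, y) else beta

def total_rate(base, alpha, beta):
    """Summed rate of change from one nucleotide to any of the other three."""
    return sum(rate(base, other, alpha, beta) for other in BASES if other != base)

# Jukes-Cantor: every rate is alpha, so the total is 3 * alpha.
print(total_rate("A", alpha=2, beta=2))  # 6
# Kimura 2-parameter: one transition plus two transversions, alpha + 2 * beta.
print(total_rate("A", alpha=5, beta=1))  # 7
```

Each nucleotide has exactly one transition partner and two transversion partners, so the total is the same for all four bases.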
Infer relationships among three species: Outgroup:
Markov chain Monte Carlo
1. Start at an arbitrary point
2. Make a small random move
3. Calculate height ratio (r) of new state to old state:
   1. r > 1 -> new state accepted
   2. r < 1 -> new state accepted with probability r. If new state not accepted, stay in the old state
4. Go to step 2
[Figure: from point 1, an uphill move to 2a (always accept) and a downhill move to 2b (accept sometimes); posterior density with areas 20 %, 48 %, 32 % over tree 1, tree 2, tree 3]
The proportion of time the MCMC procedure samples from a particular parameter region is an estimate of that region's posterior probability density.
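The four steps can be tried out on a toy problem with three discrete states. This is a minimal Python sketch, not a phylogenetics program: it assumes the 20 %, 48 %, 32 % values in the figure belong to tree 1, tree 2, and tree 3 in that order, and it replaces the "small random move" with a jump to one of the other two trees chosen with equal probability. The function name and parameters are illustrative.

```python
import random
from collections import Counter

# Assumed target "heights" (posterior probabilities) for the three trees.
POSTERIOR = {"tree 1": 0.20, "tree 2": 0.48, "tree 3": 0.32}

def metropolis(posterior, n_steps=200_000, seed=0):
    """Sample states in proportion to their posterior height."""
    rng = random.Random(seed)
    states = list(posterior)
    current = states[0]                  # 1. start at an arbitrary point
    visits = Counter()
    for _ in range(n_steps):
        # 2. propose a move to one of the other states (symmetric proposal)
        proposal = rng.choice([s for s in states if s != current])
        # 3. height ratio of the new state to the old state
        r = posterior[proposal] / posterior[current]
        if r >= 1 or rng.random() < r:   # uphill: always; downhill: with prob. r
            current = proposal
        visits[current] += 1             # a rejected move counts the old state again
        # 4. go to step 2
    return {s: visits[s] / n_steps for s in states}

for tree, freq in metropolis(POSTERIOR).items():
    print(tree, round(freq, 3))
```

The printed frequencies should land close to the target values, which illustrates the point that the share of time the chain spends in a region estimates that region's posterior probability.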
Phylogenetic Inference
Two points to keep in mind:
1. Phylogenetic trees are hypotheses.
2. Gene trees are not the same as species trees.
   A species tree depicts the evolutionary history of a group of species.
   A gene tree depicts the evolutionary history of a specific locus.
Conflict between gene trees and species trees
How do we select the best tree?
Evaluating tree support by bootstrapping

Species 1   A A C G C C T G
Species 2   A T C G C C T G
Species 3   A T T G A C C G
Species 4   A T T G A C C G

[Figure: tree relating Species 1, Species 2, Species 3, Species 4]

Step 1. Randomly select a site (one column of the alignment) to represent position 1

Species 1   T
Species 2   T
Species 3   C
Species 4   C

Step 2. Randomly select a site to represent position 2

Species 1   T G
Species 2   T G
Species 3   C G
Species 4   C G
Evaluating tree support by bootstrapping
Step 3. Generate complete data set (sampling with replacement).
Step 4. Build tree and record if groupings match original tree.
Step 5. Repeat 1,000 times.
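The resampling in Steps 1 to 5 can be sketched in Python using the four-species alignment from the earlier slide. Building a real tree for each replicate is beyond a short example, so this sketch swaps in a crude stand-in: it asks only whether Species 2 is strictly the closest sequence to Species 1 by number of differences. That check, and all function names, are assumptions made for illustration.

```python
import random

ALIGNMENT = {
    "Species 1": "AACGCCTG",
    "Species 2": "ATCGCCTG",
    "Species 3": "ATTGACCG",
    "Species 4": "ATTGACCG",
}

def bootstrap_replicate(alignment, rng):
    """Steps 1-3: draw whole columns with replacement until the data set is full."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}

def differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def groups_1_and_2(alignment):
    """Stand-in for tree building: is Species 2 strictly closest to Species 1?"""
    s1 = alignment["Species 1"]
    d12 = differences(s1, alignment["Species 2"])
    return all(d12 < differences(s1, alignment[other])
               for other in ("Species 3", "Species 4"))

def bootstrap_support(alignment, n_reps=1000, seed=0):
    """Steps 4-5: fraction of replicates that recover the grouping."""
    rng = random.Random(seed)
    hits = sum(groups_1_and_2(bootstrap_replicate(alignment, rng))
               for _ in range(n_reps))
    return hits / n_reps

print(bootstrap_support(ALIGNMENT))
```

The same column index is used for every species within a draw, so each resampled site keeps the pattern of bases across species intact; only the mix of sites changes between replicates.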
Evaluating tree support by bootstrapping
[Figure: tree with bootstrap support values: 98 (Species 1, Species 2), 92 (Species 3, Species 4)]
Cospeciation of aphids and their bacterial endosymbionts