I wonder about Trees Robert Frost

Similar documents
Trees and Human Evolution

C.DARWIN ( )

Phylogenetic trees 07/10/13

Evolutionary Tree Analysis. Overview

Algorithms in Bioinformatics

Theory of Evolution Charles Darwin

Inferring Phylogenetic Trees. Distance Approaches. Representing distances. in rooted and unrooted trees. The distance approach to phylogenies

EVOLUTIONARY DISTANCES

BINF6201/8201. Molecular phylogenetic methods

Phylogenetic Analysis. Han Liang, Ph.D. Assistant Professor of Bioinformatics and Computational Biology UT MD Anderson Cancer Center

Phylogenetic Trees. Phylogenetic Trees Five. Phylogeny: Inference Tool. Phylogeny Terminology. Picture of Last Quagga. Importance of Phylogeny 5.

Multiple Sequence Alignment. Sequences

Evolutionary Trees. Evolutionary tree. To describe the evolutionary relationship among species A 3 A 2 A 4. R.C.T. Lee and Chin Lung Lu

Phylogenetic Tree Reconstruction

"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky

A (short) introduction to phylogenetics

Tree Building Activity

Molecular Evolution and Phylogenetic Tree Reconstruction

Constructing Evolutionary/Phylogenetic Trees

What is Phylogenetics

CS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1. Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003

Evidence of Evolution

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

Plan: Evolutionary trees, characters. Perfect phylogeny Methods: NJ, parsimony, max likelihood, Quartet method

COMP 355 Advanced Algorithms

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline

Using Bioinformatics to Study Evolutionary Relationships Instructions

COMP 355 Advanced Algorithms Algorithm Design Review: Mathematical Background

Phylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University

Phylogenetic inference

Gel Electrophoresis. 10/28/0310/21/2003 CAP/CGS 5991 Lecture 10Lecture 9 1

C3020 Molecular Evolution. Exercises #3: Phylogenetics

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

Evolutionary trees. Describe the relationship between objects, e.g. species or genes

Data Structures and Algorithms

Theory of Evolution. Charles Darwin

RECOVERING NORMAL NETWORKS FROM SHORTEST INTER-TAXA DISTANCE INFORMATION

Phylogenetic inference: from sequences to trees

Phylogeny: building the tree of life

4/25/2009. Xi (Cici) Chen Texas Tech University. Ancient: folk taxonomy impulse to classify organisms Carl Linnaenus: binominal nomenclature (1735)

CSCI1950 Z Computa4onal Methods for Biology Lecture 5

6.001 SICP September? Procedural abstraction example: sqrt (define try (lambda (guess x) (if (good-enuf? guess x) guess (try (improve guess x) x))))

Inferring phylogeny. Today s topics. Milestones of molecular evolution studies Contributions to molecular evolution

Hierarchical Clustering


Estimating Evolutionary Trees. Phylogenetic Methods

Warm-Up- Review Natural Selection and Reproduction for quiz today!!!! Notes on Evidence of Evolution Work on Vocabulary and Lab

Constructing Evolutionary Trees

Methods for solving recurrences

Tree-average distances on certain phylogenetic networks have their weights uniquely determined

Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science

Phylogenetics: Building Phylogenetic Trees

Evolutionary trees. Describe the relationship between objects, e.g. species or genes

Dictionary: an abstract data type

Assignment 5: Solutions

Data Structures and Algorithms " Search Trees!!

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Dictionary: an abstract data type

Algorithmic Methods Well-defined methodology Tree reconstruction those that are well-defined enough to be carried out by a computer. Felsenstein 2004,

THE THEORY OF EVOLUTION. Darwin, the people who contributed to his ideas, and what it all really means.

Statistical inference provides methods for drawing conclusions about a population from sample data.

Advanced Analysis of Algorithms - Midterm (Solutions)

Phylogeny: traditional and Bayesian approaches

Phylogenetics - Orthology, phylogenetic experimental design and phylogeny reconstruction. Lesser Tenrec (Echinops telfairi)

Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016

week: 4 Date: Microscopes Cell Structure Cell Function Standards None 1b, 1h 1b, 1h, 4f, 5a 1a, 1c, 1d, 1e, 1g, 1j

Using phylogenetics to estimate species divergence times... Basics and basic issues for Bayesian inference of divergence times (plus some digression)

Darwin & Evolution by Natural Selection

Building Phylogenetic Trees UPGMA & NJ

Evidences of Evolution (Clues)

Skip Lists. What is a Skip List. Skip Lists 3/19/14

SEVENTH EDITION and EXPANDED SEVENTH EDITION

How to read and make phylogenetic trees Zuzana Starostová

Properties of normal phylogenetic networks

Evolution of Body Size in Bears. How to Build and Use a Phylogeny

How should we organize the diversity of animal life?

Incremental Phylogenetics by Repeated Insertions: An Evolutionary Tree Algorithm

Warm Up. Explain how a mutation can be detrimental in one environmental context and beneficial in another.

Phylogenetics. BIOL 7711 Computational Bioscience

BIOL 1010 Introduction to Biology: The Evolution and Diversity of Life. Spring 2011 Sections A & B

Phylogenetic analyses. Kirsi Kostamo

Change Over Time Concept Map

Lecture 6 Phylogenetic Inference

molecular evolution and phylogenetics

Biology 1B Evolution Lecture 2 (February 26, 2010) Natural Selection, Phylogenies

Molecular Phylogenetics (part 1 of 2) Computational Biology Course João André Carriço

Bioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics

At least one of us is a knave. What are A and B?

Reconstruire le passé biologique modèles, méthodes, performances, limites

Properties of Life. Levels of Organization. Levels of Organization. Levels of Organization. Levels of Organization. The Science of Biology.

2.1 Computational Tractability. Chapter 2. Basics of Algorithm Analysis. Computational Tractability. Polynomial-Time

2. ALGORITHM ANALYSIS

Inferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT

Reading for Lecture 13 Release v10

Consistency Index (CI)

Session 5: Phylogenomics

Comparative Genomics II

Fig. 26.7a. Biodiversity. 1. Course Outline Outcomes Instructors Text Grading. 2. Course Syllabus. Fig. 26.7b Table

Transcription:

I wonder about Trees Robert Frost We wonder about Robert Frost - Trees Hey! How come no turtles in this tree?

Phylogenetic Trees From Darwin s notebooks, 1837

Really?

The only figure in On the Origin of Species by Natural Selection (Darwin, 1859) From the 6 th edition (1872): The affinities of all the beings of the same class have sometimes been represented by a great tree. I believe this simile largely speaks the truth. The green and budding twigs may represent existing species; and those produced during former years may represent the long succession of extinct species.

More recursion on trees: mrca West Dorm Groody ( W ) >>> find( L, Groodies) True X Y Z East Dorm Groody ( E ) Linde Dorm Groody ( L ) Case Groody ( C ) >>> mrca("l", "E", Groodies) # use find as a helper! 'Z' >>> mrca("w", "E", Groodies) 'Y' >>> mrca("w", "C", Groodies) 'X' >>> mrca("w", "Prof. Wu", Groodies) None

mrca X Y Z West Dorm Groody ( W ) East Dorm Groody ( E ) Linde Dorm Groody ( L ) def mrca(species1,species2,tree): """Return the name of the most recent commmon ancestor of species1 and species2. If there isn't one, return None.""" Case Groody ( C ) Notes

mrca X Y Z West Dorm Groody ( W ) East Dorm Groody ( E ) Linde Dorm Groody ( L ) def mrca(species1,species2,tree): """Return the name of the most recent commmon ancestor of species1 and species2. If there isn't one, return None.""" if Tree[1] == (): return None elif not find(species1,tree) or not find(species2,tree): return None else: if find(species1, Tree[1]) and find(species2, Tree[1]): return mrca(species1,species2, Tree[1]) elif find(species1, Tree[2]) and find(species2, Tree[2]): return mrca(species1,species2, Tree[2]) else: return Tree[0] Case Groody ( C )

Phylogenetic Reconstruction Input: DNA or protein sequences for each species Output: Species tree D. Desjardin, K. Peay, T. Bruns, Mycologia 103(5), 2011

Distance-Based Approach Groody CATCAACCAGTGACCAGTATAGGACGCCC Froody CAACACTCAGTGACAAGTCTAGCACGCCC Snoody AATCGCCCGGCGTCAGGCATAGCTAGCGC G F S G 0 6 12 F 6 0 12 S 12 12 0

Distance-Based Approach Current time ( time 0 ) Snoody G F S G 0 6 12 F 6 0 12 S 12 12 0 6.0 3 Groody Froody (6.0, ('Snoody', (), ()), (3.0, ('Groody', (), ()), ('Froody', (), ()) ) )

Unweighted Pair Group Method with Arithmetic Mean (UPGMA) Algorithm Groody Froody Snoody G F S G 0 6 12 F 6 0 12 S 12 12 0

Try This One Aoody Boody Coody Doody Eoody A B C D E A 0 B 4 0 C 16 16 0 D 16 16 8 0 E 16 16 8 6 0 This algorithm s bark is worse than it s bite. The matrix is symmetric, so we just need to keep the bottom or top half! Worksheet

DEMO! I wood love to see this in action!

Implementing UPGMA 1 Groody 2 Froody 3 Snoody 1 2 3 1 0 6 12 2 6 0 12 3 12 12 0 Let s see this in the provided mitodata.py file N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) N3 = ("Snoody", (), ()) groodieslist = [N1, N2, N3] groodiesmatrix = {(N1, N1):0, (N1, N2):6, (N1, N3):12, (N2, N1):6, (N2, N2):0, (N2, N3):12, (N3,N1):12, (N3,N2):12, (N3,N3):0}

N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) N3 = ("Snoody", (), ()) groodieslist = [N1, N2, N3] groodiesmatrix = {(N1, N1):0, (N1, N2):6, (N1, N3):12, (N2, N1):6, (N2, N2):0, (N2, N3):12, (N3,N1):12, (N3,N2):12, (N3,N3):0} 1. For all pairs of trees in the groodieslist, find the closest pair

N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) N3 = ("Snoody", (), ()) groodieslist = [N1, N2, N3] groodiesmatrix = {(N1, N1):0, (N1, N2):6, (N1, N3):12, (N2, N1):6, (N2, N2):0, (N2, N3):12, (N3,N1):12, (N3,N2):12, (N3,N3):0} 1. For all pairs of trees in the groodieslist, find the closest pair N1 = ("Groody", (), ()) N2 = ("Froody", (), ())

N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) N3 = ("Snoody", (), ()) groodieslist = [N1, N2, N3] groodiesmatrix = {(N1, N1):0, (N1, N2):6, (N1, N3):12, (N2, N1):6, (N2, N2):0, (N2, N3):12, (N3,N1):12, (N3,N2):12, (N3,N3):0} 1. For all pairs of trees in the groodieslist, find the closest pair N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) 2. Remove those from the groodieslist groodylist.remove(n1) groodylist.remove(n2)

N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) N3 = ("Snoody", (), ()) groodieslist = [N1, N2, N3] groodiesmatrix = {(N1, N1):0, (N1, N2):6, (N1, N3):12, (N2, N1):6, (N2, N2):0, (N2, N3):12, (N3,N1):12, (N3,N2):12, (N3,N3):0} 1. For all pairs of trees in the groodieslist, find the closest pair N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) 2. Remove those from the groodieslist 3. Make a new tree by joining these two trees

N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) N3 = ("Snoody", (), ()) groodieslist = [N1, N2, N3] groodiesmatrix = {(N1, N1):0, (N1, N2):6, (N1, N3):12, (N2, N1):6, (N2, N2):0, (N2, N3):12, (N3,N1):12, (N3,N2):12, (N3,N3):0} 1. For all pairs of trees in the groodieslist, find the closest pair N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) 2. Remove those from the groodieslist 3. Make a new tree by joining these two trees (3.0, N1, N2) = (3.0, ("Groody", (), ()), ("Froody", (), ()))

N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) N3 = ("Snoody", (), ()) N1 N2 groodieslist = [N3, (3.0, ("Groody", (), ()), ("Froody", (), ()))] groodiesmatrix = {(N1, N1):0, (N1, N2):6, (N1, N3):12, (N2, N1):6, (N2, N2):0, (N2, N3):12, (N3,N1):12, (N3,N2):12, (N3,N3):0} 1. For all pairs of trees in the groodieslist, find the closest pair N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) 2. Remove those from the groodieslist 3. Make a new tree by joining these two trees (3.0, N1, N2) = (3.0, ("Groody", (), ()), ("Froody", (), ())) 4. Add this new tree to the groodieslist

N1 = ("Groody", (), ()) N2 = ("Froody", (), ()) N3 = ("Snoody", (), ()) N1 N2 groodieslist = [N3, (3.0, ("Groody", (), ()), ("Froody", (), ()))] groodiesmatrix = {(N1, N1):0, (N1, N2):6, (N1, N3):12, (N2, N1):6, (N2, N2):0, (N2, N3):12, (N3,N1):12, (N3,N2):12, (N3,N3):0} 5. Update the distance matrix We are still holding on to N1 and N2, even though we have removed them from the groodieslist!

Implementing UPGMA findclosestpair( specieslist, Distances ): Takes a list of species trees and the distance dictionary as input and returns a tuple (X, Y) where X and Y are in the list and have the minimum distance between any two items in the list. updatedist( specieslist, Distances, newtree): Takes a list of species trees, the distance dictionary, and a newtree that was just formed by merging two trees found by findclosestpair. Those two trees can be found by looking inside newtree. Those two trees are removed from the distance dictionary and the newtree is added to the dictionary. upgma( specieslist, Distances): returns the phylogenetic tree constructed by the UPGMA algorithm findclosestpair: 7 lines updatedist: 8 lines upgma: 12 lines

Updating the distance dictionary: a subtlety a b c d a 0 b 4 0 c 6 6 0 d 12 12 9 0 a,b c d a,b 0 c 6 0 d 12 9 0 a,b,c d a,b,c 0 d??? 0 3 2 c a b

Updating the distance dictionary: a subtlety a b c d a 0 b 4 0 c 6 6 0 d 12 12 9 0 a,b c d a,b 0 c 6 0 d 12 9 0 a,b,c d a,b,c 0 d 11 0 When we combine distances between nodes with different numbers of leaves, we should weight by the number of leaves. 2 leaves 1 leaf = 3 total a,b c 2/3 * 12 + 1/3 * 9 = 11 3 2 c a b

Updating the distance dictionary: a subtlety a b c d a 0 b 4 0 c 6 6 0 d 12 12 9 0 a,b c d a,b 0 c 6 0 d 12 9 0 a,b,c d a,b,c 0 d 11 0 When we combine distances between nodes with different numbers of leaves, we should weight by the number of leaves. 2 leaves 1 leaf = 3 total a,b c 2/3 * 12 + 1/3 * 9 = 11 3 2 c a b newtree = ( Anc, T1, T2) Distance(newTree, T3) = Distance(T1, T3) x leafcount (T1)/leafCount(newTree) + Distance(T2, T3) x leafcount(t2)/leafcount(newtree)

Inferring time 3 A B 10 4.5 C D The scale function you write in lab will be useful

Rhea Ostrich https://en.wikipedia.org/wiki/rhea_(bird) https://en.wikipedia.org/wiki/ostrich https://en.wikipedia.org/wiki/gondwana

UPGMA assumes a molecular clock G F S G 0 6 12 F 6 0 12 S 12 12 0

A note on biogeography/migrations