Questions we can ask. Recall. Accuracy and Precision. Systematics - Bio 615. Outline
Derek S. Sikes, University of Alaska

Outline
1. Mechanistic comparison with Parsimony
   - branch lengths & parameters
2. Performance comparison with Parsimony
   - Desirable attributes of a method
   - The Felsenstein and Farris zones
   - Heterotachous data
3. Confidence - Assessment (part 1): CI, consensus trees

Confidence - Assessment of the Strength of

Questions we can ask
- Are the data better than random - do they have signal?
- How much homoplasy is there?
- To what extent are particular elements of the trees (clades) supported?
- What alternative results can we reject?
- Do independent data sets corroborate or conflict with each other?

Recall: Stochastic error vs Systematic error
- These assessment methods help identify stochastic error
- How repeatable are the results?
- How strongly do the data support them?
This is a measure of precision (which is hopefully related to accuracy)

Accuracy and Precision
Accuracy: Accuracy is correctness. How close a measurement is to the true value. (Unless we know the true tree in advance we cannot measure this.)
Precision: Precision is reproducibility. How closely two or more measurements agree with one another. (This we can measure!)
Recall: Stochastic error vs Systematic error
[Figure: four panels labeled High accuracy / High precision, High accuracy / Low precision, Low accuracy / High precision, Low accuracy / Low precision]
- All methods have assumptions - when violated they can produce systematic error
- Confidence measures cannot detect systematic error - other methods must be used to identify it (compare methods that have different assumptions)

Branch Support Measures Precision - Not Accuracy*
- Random error +/- gone with huge dataset of 124,026 characters
- Systematic error evident in ME analysis (tree on right) (even with corrected distance data!)
- 100% branch support values indicate no stochastic error
In other words (keep in mind):
- These methods may show a high precision but the tree can still be wrong due to systematic error
- These methods may show a low precision but the tree can still be correct
* Except possibly Bayesian posterior probabilities

1. Consistency Index
2. g1 statistic, PTP - test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Posterior probability (see lecture on Bayesian)
Parsimony - tree scores are integers
- often leads to many equally most-parsimonious trees (MPTs), e.g. 27,000 MPTs all length = 25
In contrast, log-likelihoods are real numbers, and two different trees are rarely found with equal log-likelihoods
e.g. 1 tree of -lnL = [value not preserved]; next best tree of -lnL = [value not preserved]
This leads to different approaches in assessing the strength of the phylogenetic signal for MP vs ML analyses

- Consistency Indices - interesting but less useful than other methods
- PTP-test, g1-statistic - rarely used
- Consensus trees - summary tree of all MPTs - more often used for MP than ML - also used for Bayesian

Consistency Indices
- If all the characters have the same signal then the tree is more trustworthy
- The more agreement there is, the less homoplasy (more consistency) the characters will show on the most parsimonious tree
- We need statistics to measure consistency
CI - Kluge & Farris 1969

How much homoplasy is there?
Taxon 1   A C A T T T A
Taxon 2   A C G A T T A
Taxon 3   A G G A T A G
Taxon 4   G A A A A C ?
Taxon 5   G A T A ? C G
[ObsL and MinL rows: per-character values not preserved]
Minimum length overall = 10
Length of MP tree = 11

Consistency Index (C.I.) = (minimum number of changes required by data set) / (number of changes on tree)
- Higher CI means lower homoplasy
- A CI value can be given for a tree or for a character
Consistency index: CI = MinL / ObsL = 10 / 11 = 0.91
Homoplasy index: HI = 1 - CI = 0.09
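The CI arithmetic above can be checked with a short sketch. It assumes that the minimum length of a character is the number of distinct observed states minus one, with '?' ignored as missing data; the observed length of 11 is taken from the slide rather than computed. The names `MATRIX`, `min_steps`, and `consistency_index` are illustrative only.

```python
# Five-taxon example matrix from the slides ('?' = missing data).
MATRIX = {
    "Taxon 1": "ACATTTA",
    "Taxon 2": "ACGATTA",
    "Taxon 3": "AGGATAG",
    "Taxon 4": "GAAAAC?",
    "Taxon 5": "GATA?CG",
}


def min_steps(column):
    """Minimum changes for one character: distinct observed states minus one."""
    states = {s for s in column if s != "?"}
    return max(len(states) - 1, 0)


def consistency_index(min_len, obs_len):
    """CI = MinL / ObsL."""
    return min_len / obs_len


# Transpose the rows (taxa) into columns (characters).
columns = list(zip(*MATRIX.values()))
min_l = sum(min_steps(col) for col in columns)
obs_l = 11  # length of the most parsimonious tree, as stated on the slide

ci = consistency_index(min_l, obs_l)
hi = 1 - ci
print(f"MinL = {min_l}, CI = {ci:.2f}, HI = {hi:.2f}")
# prints: MinL = 10, CI = 0.91, HI = 0.09
```

The per-character minimum lengths sum to 10, which agrees with the "Minimum length overall" given on the slide.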
How much homoplasy is there?
MacClade 4.0 - characters colored by their CI
- Red = CI of 1.0 (change once)
- Blue = CI < 1.0 (change more than once)
Taxon 1   A C A T T T A
Taxon 2   A C G A T T A
Taxon 3   A G G A T A G
Taxon 4   G A A A A C ?
Taxon 5   G A T A ? C G
[ObsL, MinL, and CI rows: per-character values not preserved]

Tree number 1 (rooted using user-specified outgroup)
Tree length = 405
[Values not preserved for: Consistency index (CI), Homoplasy index (HI), CI excluding uninformative characters, HI excluding uninformative characters, Retention index (RI), Rescaled consistency index (RC)]
[Tree figure with terminals: nepalensis2, nepalensis3, nepalensis6, nepalensis12, podagricus2, podagricus1, melissae2, melissae3, melissae1, quadripuncta1, quadripuncta2, trumboi2, trumboi1, maculifrons2, maculifrons3, montivagus2, sayi1, sayi2, humator2, humator3]

Retention Index
Taxon 1   A C A T T T A
Taxon 2   A C G A T T A
Taxon 3   A G G A T A G
Taxon 4   G A A A A C ?
Taxon 5   G A T A ? C G
[MinL and MaxL rows: per-character values not preserved]
Maximum length overall = 15
Retention index: RI = (MaxL - ObsL) / (MaxL - MinL) = (15 - 11) / (15 - 10) = 4 / 5 = 0.80

Farris (1989) - improvements over the CI
- Downweights homoplastic characters
- Excludes autapomorphies
- Goes to 0.0 if Max change = Observed change (CI doesn't go to 0.0, hard to interpret)

General trends observed with CIs and RIs
- Strong negative correlation between taxon number and CI and RI
- Data sets with few characters can show unexpectedly high CI and RI
- Not a very reliable measure of strength of signal
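The RI calculation can be sketched the same way. This assumes the maximum length of a character is the number of scored taxa minus the count of its commonest state (the cost of placing every other state as a separate change), and again takes the observed length of 11 from the slide. Function names are illustrative only.

```python
from collections import Counter

# Five-taxon example matrix from the slides ('?' = missing data).
MATRIX = ["ACATTTA", "ACGATTA", "AGGATAG", "GAAAAC?", "GATA?CG"]


def min_steps(column):
    """Minimum changes: distinct observed states minus one."""
    states = {s for s in column if s != "?"}
    return max(len(states) - 1, 0)


def max_steps(column):
    """Maximum changes: scored taxa minus the count of the commonest state."""
    scored = [s for s in column if s != "?"]
    if not scored:
        return 0
    return len(scored) - Counter(scored).most_common(1)[0][1]


def retention_index(min_len, obs_len, max_len):
    """RI = (MaxL - ObsL) / (MaxL - MinL); undefined when MaxL == MinL."""
    return (max_len - obs_len) / (max_len - min_len)


columns = list(zip(*MATRIX))
min_l = sum(min_steps(col) for col in columns)
max_l = sum(max_steps(col) for col in columns)
obs_l = 11  # length of the most parsimonious tree, as stated on the slide

ri = retention_index(min_l, obs_l, max_l)
print(f"MinL = {min_l}, MaxL = {max_l}, RI = {ri:.2f}")
# prints: MinL = 10, MaxL = 15, RI = 0.80
```

Characters 4 and 5 have MaxL equal to MinL (each is a single change on any tree), so they contribute nothing to either the numerator or the denominator, which is how the RI leaves autapomorphies out.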
How can we evaluate the significance of CI/RI?
- Permuting data removes phylogenetic signal
- CI depends directly on tree length
- We can compare the observed tree length with what we would obtain if there were no phylogenetic signal
- A permutation tail probability (PTP) test reports the proportion of permuted data sets with as good or better a measure of quality than the real data

Original data:
Taxon 1   ACATTTA
Taxon 2   ACGATTA
Taxon 3   AGGATAG
Taxon 4   GAAAAC?
Taxon 5   GATA?CG
Randomize states within a character
Permuted data sets (one example):
Taxon 1   GAAA?AA
Taxon 2   ACAATC?
Taxon 3   GAGTATG
Taxon 4   AGTATCG
Taxon 5   ACGATTA

PTP test in PAUP*
permutation test = PTP
1000 permutation test replicates completed
Time used = 5.83 sec
Results of PTP test:
Tree length   Number of replicates
379*          1
410           1
411           1
412           3
414           8
[intermediate rows not preserved]
428           2
429           1
* = length for original (unpermuted) data
P = [value not preserved]

Example without signal
[Table of tree lengths and replicate counts not preserved, except the row for the original data: 1938* 11]
The permuted data are better than the real data

A data set without signal
g1 statistic - a measure of skewness, more skew = more signal
bell curve = random, noisy data, weak to no signal
[Histogram of tree-length frequencies, roughly symmetric; bin labels and the mean, sd, and g1 values not preserved]
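The permutation procedure can be sketched for the five-taxon matrix. This is a minimal illustration under several assumptions, not PAUP*'s implementation: tree length is found by exhaustive search over all 15 unrooted topologies using Fitch parsimony, '?' is treated as compatible with any state, and the original data set is counted as one of the replicates when forming P. All function names are hypothetical.

```python
import random

MATRIX = ["ACATTTA", "ACGATTA", "AGGATAG", "GAAAAC?", "GATA?CG"]
ALL_STATES = frozenset("ACGT")


def insert_leaf(tree, leaf):
    """Yield every tree made by attaching `leaf` to one edge of `tree`."""
    yield (tree, leaf)
    if isinstance(tree, tuple):
        left, right = tree
        for sub in insert_leaf(left, leaf):
            yield (sub, right)
        for sub in insert_leaf(right, leaf):
            yield (left, sub)


def all_trees(n_taxa):
    """All unrooted binary topologies, drawn as rooted next to taxon 0."""
    trees = [1]
    for leaf in range(2, n_taxa):
        trees = [t for tree in trees for t in insert_leaf(tree, leaf)]
    return [(0, t) for t in trees]


def fitch(tree, column):
    """Return (state set, number of changes) for one character on one tree."""
    if not isinstance(tree, tuple):
        state = column[tree]
        return (ALL_STATES if state == "?" else frozenset(state)), 0
    left_set, left_steps = fitch(tree[0], column)
    right_set, right_steps = fitch(tree[1], column)
    common = left_set & right_set
    if common:
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def mp_length(matrix, trees):
    """Length of the shortest tree, by exhaustive search."""
    columns = list(zip(*matrix))
    return min(sum(fitch(t, col)[1] for col in columns) for t in trees)


def permute_within_characters(matrix, rng):
    """Shuffle the states of each character independently across taxa."""
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    return ["".join(row) for row in zip(*columns)]


def ptp(matrix, n_perm=199, seed=1):
    """Proportion of data sets (original included) with length <= observed."""
    rng = random.Random(seed)
    trees = all_trees(len(matrix))
    observed = mp_length(matrix, trees)
    as_good = sum(
        mp_length(permute_within_characters(matrix, rng), trees) <= observed
        for _ in range(n_perm)
    )
    return observed, (as_good + 1) / (n_perm + 1)


observed, p_value = ptp(MATRIX)
print(f"observed MP length = {observed}, PTP = {p_value:.3f}")
```

Exhaustive search is only practical for a handful of taxa; real analyses use heuristic searches for each permuted data set. With only seven characters the toy matrix should not be expected to give a small P.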
A data set with signal
g1 statistic - a measure of skewness, more skew = more signal
bell curve = random, noisy data, weak to no signal
[Histogram of tree-length frequencies, strongly skewed; bin labels and the mean, sd, and g1 values not preserved]
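As a rough illustration of what the g1 statistic measures, the sketch below computes the plain moment-based skewness, g1 = m3 / m2^(3/2), for two made-up tree-length distributions. The frequency tables are hypothetical and are not the course data, and the estimator used by a given program may include a small-sample correction that this version omits.

```python
def g1(values):
    """Moment-based skewness: third central moment over m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def expand(freq):
    """Turn {tree length: number of trees} into a flat list of lengths."""
    return [length for length, count in freq.items() for _ in range(count)]


# Hypothetical distributions of tree lengths over all possible trees.
symmetric = expand({10: 1, 11: 4, 12: 6, 13: 4, 14: 1})    # bell-shaped
left_skewed = expand({10: 1, 11: 1, 12: 2, 13: 4, 14: 8})  # few short trees

print(f"g1 (symmetric)   = {g1(symmetric):.3f}")
print(f"g1 (left-skewed) = {g1(left_skewed):.3f}")
```

A long tail of short trees on the left gives a negative g1, which is the pattern read as signal; a symmetric, bell-shaped distribution gives a g1 of zero.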
Frequency distribution of tree scores
[Histogram: tree lengths from roughly 380 to 475, with the mode near 448-449; the mean, sd, g1, and g2 values not preserved]

Tests for phylogenetic signal (g1 and PTP)
- Are sensitive to any signal in the data
- For example, g1 of permuted data = [value not preserved] (ns); duplicate one taxon and g1 = -1.56**
- Useful for identifying truly useless data (very rare)
- But otherwise do not tell you much about data quality
- Thus, not in your text or Page & Holmes (1998)

Consensus & branch support
- CI & PTP methods seek to determine overall data quality as a guide to whether we should believe particular results
- We can, instead, evaluate particular results
- Clade support measures: bootstrap/decay
- Statistical tests of alternative hypotheses

Terms - from lecture & readings
precision, accuracy, consistency index, g1 statistic, homoplasy index, retention index, PTP test

Study questions
- What do we mean when we say a method relaxes an assumption? [Compare how the JC69 and more complex models (e.g. K2P, HKY, or GTR) treat the Ts/Tv ratio parameter.]
- Why is quantifying the uncertainty of a phylogenetic estimate at least as important a goal as obtaining the phylogenetic estimate itself?
- Do assessment methods like bootstrapping attempt to measure accuracy or precision?
- Both stochastic and systematic error can affect accuracy and precision. How can we minimize one of these types of error? And by doing so, what can we maximize: accuracy or precision?
- The g1 statistic and the PTP test are not often used for assessment. What is it that they can tell us about our data?
More informationEstimating Evolutionary Trees. Phylogenetic Methods
Estimating Evolutionary Trees v if the data are consistent with infinite sites then all methods should yield the same tree v it gets more complicated when there is homoplasy, i.e., parallel or convergent
More informationPhylogenetic Tree Reconstruction
I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven
More informationImpact of errors on cladistic inference: simulation-based comparison between parsimony and three-taxon analysis
Impact of errors on cladistic inference: simulation-based comparison between parsimony and three-taxon analysis Valentin Rineau, René Zaragüeta I Bagils, Michel Laurin To cite this version: Valentin Rineau,
More informationMaximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018
Maximum Likelihood Tree Estimation Carrie Tribble IB 200 9 Feb 2018 Outline 1. Tree building process under maximum likelihood 2. Key differences between maximum likelihood and parsimony 3. Some fancy extras
More informationStatistical Methods in Particle Physics
Statistical Methods in Particle Physics Lecture 11 January 7, 2013 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline How to communicate the statistical uncertainty
More informationLab 9: Maximum Likelihood and Modeltest
Integrative Biology 200A University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS" Spring 2010 Updated by Nick Matzke Lab 9: Maximum Likelihood and Modeltest In this lab we re going to use PAUP*
More informationPhylogenetics: Bayesian Phylogenetic Analysis. COMP Spring 2015 Luay Nakhleh, Rice University
Phylogenetics: Bayesian Phylogenetic Analysis COMP 571 - Spring 2015 Luay Nakhleh, Rice University Bayes Rule P(X = x Y = y) = P(X = x, Y = y) P(Y = y) = P(X = x)p(y = y X = x) P x P(X = x 0 )P(Y = y X
More informationSome of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!
Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis
More informationExact Inference by Complete Enumeration
21 Exact Inference by Complete Enumeration We open our toolbox of methods for handling probabilities by discussing a brute-force inference method: complete enumeration of all hypotheses, and evaluation
More information(1) Introduction to Bayesian statistics
Spring, 2018 A motivating example Student 1 will write down a number and then flip a coin If the flip is heads, they will honestly tell student 2 if the number is even or odd If the flip is tails, they
More informationQuestions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.
Chapter 7 Reading 7.1, 7.2 Questions 3.83, 6.11, 6.12, 6.17, 6.25, 6.29, 6.33, 6.35, 6.50, 6.51, 6.53, 6.55, 6.59, 6.60, 6.65, 6.69, 6.70, 6.77, 6.79, 6.89, 6.112 Introduction In Chapter 5 and 6, we emphasized
More informationUoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)
- Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the
More informationStatistical Data Analysis Stat 3: p-values, parameter estimation
Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,
More informationWhat is Phylogenetics
What is Phylogenetics Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)
More informationMolecular phylogeny How to infer phylogenetic trees using molecular sequences
Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues
More informationStat 101: Lecture 12. Summer 2006
Stat 101: Lecture 12 Summer 2006 Outline Answer Questions More on the CLT The Finite Population Correction Factor Confidence Intervals Problems More on the CLT Recall the Central Limit Theorem for averages:
More informationHomework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:
Homework Assignment, Evolutionary Systems Biology, Spring 2009. Homework Part I: Phylogenetics: Introduction. The objective of this assignment is to understand the basics of phylogenetic relationships
More informationPhylogenetic methods in molecular systematics
Phylogenetic methods in molecular systematics Niklas Wahlberg Stockholm University Acknowledgement Many of the slides in this lecture series modified from slides by others www.dbbm.fiocruz.br/james/lectures.html
More informationMultiple sequence alignment accuracy and phylogenetic inference
Utah Valley University From the SelectedWorks of T. Heath Ogden 2006 Multiple sequence alignment accuracy and phylogenetic inference T. Heath Ogden, Utah Valley University Available at: https://works.bepress.com/heath_ogden/6/
More informationAnatomy of a species tree
Anatomy of a species tree T 1 Size of current and ancestral Populations (N) N Confidence in branches of species tree t/2n = 1 coalescent unit T 2 Branch lengths and divergence times of species & populations
More informationAssessing an Unknown Evolutionary Process: Effect of Increasing Site- Specific Knowledge Through Taxon Addition
Assessing an Unknown Evolutionary Process: Effect of Increasing Site- Specific Knowledge Through Taxon Addition David D. Pollock* and William J. Bruno* *Theoretical Biology and Biophysics, Los Alamos National
More informationPhylogenetics. Andreas Bernauer, March 28, Expected number of substitutions using matrix algebra 2
Phylogenetics Andreas Bernauer, andreas@carrot.mcb.uconn.edu March 28, 2004 Contents 1 ts:tr rate ratio vs. ts:tr ratio 1 2 Expected number of substitutions using matrix algebra 2 3 Why the GTR model can
More informationE. Santovetti lesson 4 Maximum likelihood Interval estimation
E. Santovetti lesson 4 Maximum likelihood Interval estimation 1 Extended Maximum Likelihood Sometimes the number of total events measurements of the experiment n is not fixed, but, for example, is a Poisson
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationAlgorithms in Bioinformatics
Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Distance Methods Character Methods
More information9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering
Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make
More informationPhylogenetic Analysis
Phylogenetic Analysis Aristotle Through classification, one might discover the essence and purpose of species. Nelson & Platnick (1981) Systematics and Biogeography Carl Linnaeus Swedish botanist (1700s)
More informationMolecular phylogeny How to infer phylogenetic trees using molecular sequences
Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues
More information