Questions we can ask. Recall. Accuracy and Precision. Systematics - Bio 615. Outline

Outline

1. Mechanistic comparison with Parsimony
   - branch lengths & parameters
2. Performance comparison with Parsimony
   - Desirable attributes of a method
   - The Felsenstein and Farris zones
   - Heterotachous data
3. Confidence - Assessment (part 1): CI, consensus trees

Derek S. Sikes
University of Alaska

Confidence - Assessment of the Strength of

Questions we can ask

- Are the data better than random - do they have signal?
- How much homoplasy is there?
- To what extent are particular elements of the trees (clades) supported?
- What alternative results can we reject?
- Do independent data sets corroborate or conflict with each other?

Recall

Stochastic error vs Systematic error

These assessment methods help identify stochastic error:
- How repeatable are the results?
- How strongly do the data support them?

This is a measure of precision (which is hopefully related to accuracy).

Accuracy and Precision

Accuracy
Accuracy is correctness: how close a measurement is to the true value.
(Unless we know the true tree in advance, we cannot measure this.)

Precision
Precision is reproducibility: how closely two or more measurements agree with one another.
(This we can measure!)
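The two definitions above can be made concrete with a small numerical sketch. Everything in it is a hypothetical illustration rather than part of the lecture: the function names, the "true value" of 10.0, and both sets of measurements are invented, and using the distance of the mean from the truth for accuracy and the standard deviation for precision is only one reasonable choice.

```python
from statistics import mean, stdev

def accuracy_error(measurements, true_value):
    """Distance between the average measurement and the true value (smaller = more accurate)."""
    return abs(mean(measurements) - true_value)

def precision_spread(measurements):
    """Sample standard deviation of repeated measurements (smaller = more precise)."""
    return stdev(measurements)

# Hypothetical true value; in phylogenetics the true tree is normally unknown,
# which is why only the precision side can usually be measured.
TRUE_VALUE = 10.0

precise_but_biased = [12.0, 12.1, 11.9, 12.0]  # tight cluster, far from the truth
accurate_but_noisy = [8.0, 12.0, 9.0, 11.0]    # centred on the truth, widely scattered

for label, data in [("precise but biased", precise_but_biased),
                    ("accurate but noisy", accurate_but_noisy)]:
    print(label,
          "| error of mean:", round(accuracy_error(data, TRUE_VALUE), 3),
          "| spread:", round(precision_spread(data), 3))
```

The first data set has a small spread but a large error, the second the reverse, which is the distinction the slides draw between precision and accuracy.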

Recall

Stochastic error vs Systematic error

[Figure panels: High accuracy, High precision; High accuracy, Low precision; Low accuracy, High precision; Low accuracy, Low precision]

- All methods have assumptions - when violated they can produce systematic error
- Confidence measures cannot detect systematic error - other methods must be used to identify it (compare methods that have different assumptions)

Branch Support Measures Precision - Not Accuracy*

- Random error +/- gone with huge dataset of 124,026 characters
- Systematic error evident in ME analysis (tree on right) (even with corrected distance data!)
- 100% branch support values indicate no stochastic error

* Except possibly Bayesian posterior probabilities

In other words (keep in mind)

These methods may show a high precision but the tree can still be wrong due to systematic error
and
These methods may show a low precision but the tree can still be correct

1. Consistency Index
2. g1 statistic, PTP test
3. Consensus trees
4. Decay index (Bremer Support)
5. Bootstrapping / Jackknifing
6. Posterior probability (see lecture on Bayesian)

Parsimony - tree scores are integers
- often leads to many equally most-parsimonious trees (MPTs)
  e.g. 27,000 MPTs, all of length = 25

In contrast, log-likelihoods are real numbers, and two different trees are rarely found with equal log-likelihoods
  e.g. 1 tree of -lnL = 1242.058
       next best tree of -lnL = 1242.906

This leads to different approaches in assessing the strength of the phylogenetic signal for MP vs ML analyses

- Consistency Indices - interesting but less useful than other methods
- PTP test, g1 statistic - rarely used
- Consensus trees - summary tree of all MPTs - more often used for MP than ML - also used for Bayesian

Consistency Indices

- If all the characters have the same signal then the tree is more trustworthy
- The more agreement there is, the less homoplasy (more consistency) the characters will show on the most parsimonious tree
- We need statistics to measure consistency
- CI - Kluge & Farris 1969

How much homoplasy is there?

          Character
          1  2  3  4  5  6  7
Taxon 1   A  C  A  T  T  T  A
Taxon 2   A  C  G  A  T  T  A
Taxon 3   A  G  G  A  T  A  G
Taxon 4   G  A  A  A  A  C  ?
Taxon 5   G  A  T  A  ?  C  G

Obs L     1  2  3  1  1  2  1
Min L     1  2  2  1  1  2  1

Minimum length overall = 10
Length of MP tree = 11

Consistency Index (C.I.) = (minimum number of changes required by data set) / (number of changes on tree)

Higher CI means lower homoplasy
CI value for tree or character

Consistency index  CI = Min L / Obs L = 10 / 11 = 0.91
Homoplasy index    HI = 1 - CI = 0.09
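The CI and HI arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the function and variable names are illustrative assumptions, and the per-character Min L and Obs L values are copied from the five-taxon example rather than computed from the matrix and tree.

```python
def consistency_index(min_steps, obs_steps):
    """CI = minimum number of changes required by the data / changes observed on the tree."""
    return sum(min_steps) / sum(obs_steps)

# Per-character step counts taken from the five-taxon example
MIN_L = [1, 2, 2, 1, 1, 2, 1]
OBS_L = [1, 2, 3, 1, 1, 2, 1]

ensemble_ci = consistency_index(MIN_L, OBS_L)   # 10 / 11
homoplasy_index = 1 - ensemble_ci

# The same ratio applied to one character at a time
per_character_ci = [m / o for m, o in zip(MIN_L, OBS_L)]

print(f"CI = {ensemble_ci:.2f}")
print(f"HI = {homoplasy_index:.2f}")
print("per-character CI:", [round(ci, 2) for ci in per_character_ci])
```

Only character 3 needs more steps on the tree than its minimum, so it is the only character with a CI below 1.0.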

How much homoplasy is there?

MacClade 4.0: characters colored by their CI
- Red = CI of 1.0 (change once)
- Blue = CI < 1.0 (change > 1)

          Character
          1     2     3     4     5     6     7
Taxon 1   A     C     A     T     T     T     A
Taxon 2   A     C     G     A     T     T     A
Taxon 3   A     G     G     A     T     A     G
Taxon 4   G     A     A     A     A     C     ?
Taxon 5   G     A     T     A     ?     C     G

Obs L     1     2     3     1     1     2     1
Min L     1     2     2     1     1     2     1
CI        1.0   1.0   0.67  1.0   1.0   1.0   1.0

Tree number 1 (rooted using user-specified outgroup)

Tree length = 405
Consistency index (CI) = 0.6519
Homoplasy index (HI) = 0.3481
CI excluding uninformative characters = 0.6347
HI excluding uninformative characters = 0.3653
Retention index (RI) = 0.8102
Rescaled consistency index (RC) = 0.5281

[Tree figure: text-drawn tree; branching order not recoverable. Terminal taxa, top to bottom: nepalensis2, nepalensis3, nepalensis6, nepalensis12, podagricus2, podagricus1, melissae2, melissae3, melissae1, quadripuncta1, quadripuncta2, trumboi2, trumboi1, maculifrons2, maculifrons3, montivagus2, sayi1, sayi2, humator2, humator3]

Retention Index

          Character
          1  2  3  4  5  6  7
Taxon 1   A  C  A  T  T  T  A
Taxon 2   A  C  G  A  T  T  A
Taxon 3   A  G  G  A  T  A  G
Taxon 4   G  A  A  A  A  C  ?
Taxon 5   G  A  T  A  ?  C  G

Min L     1  2  2  1  1  2  1
Max L     2  3  3  1  1  3  2

Maximum length overall = 15

Retention index (RI) = (Max L - Obs L) / (Max L - Min L) = (15 - 11) / (15 - 10) = 4 / 5 = 0.80

Farris (1989) - improvements over the CI
- Downweights homoplastic characters
- Excludes autapomorphies
- Goes to 0.0 if Max change = Observed change (CI doesn't go to 0.0, hard to interpret)

General trends observed with CIs and RIs
- Strong negative correlation between taxon number and CI and RI
- Data sets with few characters can show unexpectedly high CI and RI
- Not a very reliable measure of strength of signal
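The retention index arithmetic from the example matrix can be checked the same way. This is a sketch under stated assumptions: the names are illustrative, the step counts are copied from the slides, and the rescaled consistency index is taken to be the product CI x RI, which agrees with the reported tree statistics to within rounding.

```python
def retention_index(min_steps, max_steps, obs_steps):
    """RI = (Max L - Obs L) / (Max L - Min L), summed over all characters."""
    total_min, total_max, total_obs = sum(min_steps), sum(max_steps), sum(obs_steps)
    return (total_max - total_obs) / (total_max - total_min)

def rescaled_consistency_index(ci, ri):
    """RC = CI x RI."""
    return ci * ri

# Per-character step counts from the five-taxon example
MIN_L = [1, 2, 2, 1, 1, 2, 1]
MAX_L = [2, 3, 3, 1, 1, 3, 2]
OBS_L = [1, 2, 3, 1, 1, 2, 1]

ri = retention_index(MIN_L, MAX_L, OBS_L)   # (15 - 11) / (15 - 10)
ci = sum(MIN_L) / sum(OBS_L)                # 10 / 11
rc = rescaled_consistency_index(ci, ri)

print(f"RI = {ri:.2f}")
print(f"RC = {rc:.3f}")

# Characters 4 and 5 have Max L equal to Min L, so they add nothing to either
# the numerator or the denominator of the RI.
```

Applying the same product to the reported values for the 405-step tree (CI = 0.6519, RI = 0.8102) gives about 0.528, in line with the RC of 0.5281 in the output.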

How can we evaluate the significance of CI/RI?

- Permuting data removes phylogenetic signal
- CI depends directly on tree length
- We can compare the observed tree length with what we would obtain if there were no phylogenetic signal
- A permutation tail probability (PTP) test gives the proportion of permuted data sets with as good or better a measure of quality than the real data

Original data
Taxon 1   ACATTTA
Taxon 2   ACGATTA
Taxon 3   AGGATAG
Taxon 4   GAAAAC?
Taxon 5   GATA?CG

Randomize states within a character

Permuted data sets
Taxon 1   GAAA?AA
Taxon 2   ACAATC?
Taxon 3   GAGTATG
Taxon 4   AGTATCG
Taxon 5   ACGATTA

PTP test in PAUP*

permutation test = PTP
1000 permutation test replicates completed
Time used = 5.83 sec

Results of PTP test:

               Number of
Tree length   replicates
-------------------------
    379*            1
    410             1
    411             1
    412             3
    413            10
    414             8
    415            24
    416            34
    417            43
    418            73
    419            81
    420           132
    421           142
    422           135
    423           112
    424            88
    425            50
    426            36
    427            23
    428             2
    429             1

* = length for original (unpermuted) data
P = 0.001000

Example without signal

               Number of                    Number of
Tree length   replicates     Tree length   replicates
-------------------------    -------------------------
    1924            3            1940            6
    1926            1            1941            7
    1927            4            1942            4
    1928            1            1943            2
    1929            2            1944            1
    1930            8            1945            1
    1931            6            1946            1
    1932            5            1947            1
    1933            4            1950            3
    1934            4            1952            1
    1935            5            1953            1
    1936            1            1955            1
    1937            8            1958            1
    1938*          11
    1939            7

The permuted data are better than the real data!

g1 statistic - a measure of skewness, more skew = more signal
bell curve = random, noisy data, weak to no signal

A data set without signal

mean=599.182107 sd=4.944738 g1=-0.150922

582.00000 /------------------------------------------------------------------------
583.80000 | (5)
585.60000 |# (25)
587.40000 |### (71)
589.20000 |######### (209)
591.00000 |####### (161)
592.80000 |####################### (521)
594.60000 |####################################### (883)
596.40000 |################################################## (1132)
598.20000 |################################################################# (1469)
600.00000 |################################### (788)
601.80000 |######################################################################## (1631)
603.60000 |################################################################## (1486)
605.40000 |############################################## (1047)
607.20000 |######################### (567)
609.00000 |####### (157)
610.80000 |######## (171)
612.60000 |### (57)
614.40000 | (11)
616.20000 | (3)
618.00000 | (1)
          \------------------------------------------------------------------------

A data set with signal

mean=611.572872 sd=31.049455 g1=-0.942643

501.00000 /------------------------------------------------------------------------
508.65000 |# (15)
516.30000 |## (60)
523.95000 |### (84)
531.60000 |##### (135)
539.25000 |# (21)
546.90000 |# (26)
554.55000 |### (96)
562.20000 |###### (166)
569.85000 |########## (290)
577.50000 |########################## (737)
585.15000 |######################################## (1118)
592.80000 |######################## (665)
600.45000 |#### (120)
608.10000 |########## (268)
615.75000 |################## (497)
623.40000 |############################ (796)
631.05000 |############################################### (1337)
638.70000 |######################################################################## (2031)
646.35000 |######################################################### (1610)
654.00000 |########### (323)
          \------------------------------------------------------------------------
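The permutation step, the tail probability, and the g1 statistic described above can be sketched in Python. Several things here are assumptions rather than PAUP* behaviour: all function names are invented, the tree length of each permuted matrix is not computed (that needs a parsimony search in a program such as PAUP*), and g1 uses the standard moment definition of skewness, which has not been checked against the exact formula PAUP* applies. The replicate counts are copied from the first PTP table.

```python
import random

def permute_within_characters(matrix, rng):
    """Shuffle the states of each character (column) independently among the taxa."""
    columns = [list(col) for col in zip(*matrix)]
    for col in columns:
        rng.shuffle(col)
    return ["".join(row) for row in zip(*columns)]

def ptp_p_value(length_counts, observed_length):
    """Proportion of replicates (original included) with a tree at least as short as observed."""
    total = sum(length_counts.values())
    as_good = sum(n for length, n in length_counts.items() if length <= observed_length)
    return as_good / total

def g1_skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

MATRIX = ["ACATTTA", "ACGATTA", "AGGATAG", "GAAAAC?", "GATA?CG"]
permuted = permute_within_characters(MATRIX, random.Random(0))
print(permuted)  # each column keeps its states, reassigned among taxa

# Tree length -> number of replicates, from the PTP output with signal
PTP_COUNTS = {379: 1, 410: 1, 411: 1, 412: 3, 413: 10, 414: 8, 415: 24,
              416: 34, 417: 43, 418: 73, 419: 81, 420: 132, 421: 142,
              422: 135, 423: 112, 424: 88, 425: 50, 426: 36, 427: 23,
              428: 2, 429: 1}
p_value = ptp_p_value(PTP_COUNTS, observed_length=379)
print("P =", p_value)

# Hypothetical tree-length samples: a symmetric one and one with a tail of short trees
symmetric = [1, 2, 3, 4, 5]
left_skewed = [1, 5, 5, 5, 5]
print(g1_skewness(symmetric), g1_skewness(left_skewed))
```

The tail probability comes out at 1 in 1000 because only the original data set yields a tree as short as 379 steps. A tail of unusually short trees pulls g1 below zero, the pattern the slides associate with signal.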

Frequency distribution of tree scores: mean=442.504629 sd=14.368220 g1=-0.556859 g2=0.042436

Tree length (number of trees):
379 (1)      380 (3)      381 (1)      382 (5)      383 (4)      384 (5)
385 (7)      386 (8)      387 (15)     388 (19)     389 (20)     390 (22)
391 (20)     392 (40)     393 (51)     394 (46)     395 (58)     396 (78)
397 (79)     398 (97)     399 (110)    400 (112)    401 (148)    402 (162)
403 (170)    404 (211)    405 (228)    406 (256)    407 (291)    408 (307)
409 (312)    410 (374)    411 (403)    412 (492)    413 (526)    414 (552)
415 (628)    416 (715)    417 (779)    418 (928)    419 (971)    420 (1024)
421 (1108)   422 (1165)   423 (1365)   424 (1507)   425 (1691)   426 (1742)
427 (1960)   428 (1958)   429 (2107)   430 (2178)   431 (2451)   432 (2603)
433 (2648)   434 (2810)   435 (3007)   436 (3050)   437 (2971)   438 (3038)
439 (3112)   440 (3131)   441 (3265)   442 (3128)   443 (3326)   444 (3475)
445 (3616)   446 (3566)   447 (3573)   448 (3661)   449 (3644)   450 (3567)
451 (3632)   452 (3616)   453 (3554)   454 (3274)   455 (3202)   456 (3074)
457 (2902)   458 (3056)   459 (2947)   460 (2739)   461 (2358)   462 (2181)
463 (2026)   464 (1678)   465 (1322)   466 (966)    467 (776)    468 (488)
469 (307)    470 (187)    471 (86)     472 (39)     473 (12)     474 (11)
475 (1)

Tests for phylogenetic signal (g1 and PTP)

- Are sensitive to any signal in the data
- For example: g1 of permuted data = -0.04 (ns); duplicate one taxon and g1 = -1.56**
- Useful for identifying truly useless data (very rare)
- But otherwise do not tell you much about data quality
- Thus, not in your text or Page & Holmes (1998)

Consensus & branch support

- CI & PTP methods seek to determine overall data quality as a guide to whether we should believe particular results
- We can, instead, evaluate particular results
  - Clade support measures: bootstrap/decay
  - Statistical tests of alternative hypotheses

Terms - from lecture & readings

- precision
- accuracy
- consistency index
- g1 statistic
- homoplasy index
- retention index
- PTP test

Study questions

- What do we mean when we say a method relaxes an assumption? [Compare how the JC69 and more complex models (e.g. K2P, HKY, or GTR) treat the Ts/Tv ratio parameter.]
- Why is quantifying the uncertainty of a phylogenetic estimate at least as important a goal as obtaining the phylogenetic estimate itself?
- Do assessment methods like bootstrapping attempt to measure accuracy or precision?
- Both stochastic and systematic error can affect accuracy and precision. How can we minimize one of these types of error? And by doing so, what can we maximize: accuracy or precision?
- The g1 statistic and the PTP test are not often used for assessment. What is it that they can tell us about our data?