MOLECULAR PHYLOGENETIC TREE USING THREE DIFFERENT METHODS BASED ON P-DISTANCE MODEL DEA RYNANDA PUTRI
DEPARTMENT OF STATISTICS
FACULTY OF MATHEMATICS AND NATURAL SCIENCES
BOGOR AGRICULTURAL UNIVERSITY
2010
ABSTRACT

DEA RYNANDA PUTRI. Molecular Phylogenetic Tree using Three Different Methods based on p-distance Model. Advised by ASEP SAEFUDDIN and MULADNO.

Phylogenetic inference is needed to describe the relationships among proteins, genes, or species. In phylogeny, the objects are assumed to be evolutionarily related, and an evolutionary tree is used to show the relationships among organisms. Building a reliable evolutionary tree, however, requires a reliable data set and a suitable model. In this paper, the data were obtained from the D-loop region of mitochondrial DNA (mtDNA) available in GenBank. Five animal species were used: Bison bison, Bos taurus, Bos indicus, Bubalus bubalis, and Capra hircus. The objective was to identify the most reliable method, measured by its stability, among UPGMA, Minimum Evolution, and Neighbor-Joining. To build the cases, the five species were grouped into seven classes with different characteristics. The p-distance model was used to build the distance matrices. The reliability of each method was measured using Felsenstein's bootstrap method, with the whole bootstrap process for each method repeated 100, 1000, and 10000 times. Almost all methods reconstructed the evolutionary tree without misclassification. However, Minimum Evolution failed to reconstruct a reliable evolutionary tree compared to UPGMA and Neighbor-Joining.

Key words: phylogenetic inference, D-loop mitochondrion DNA, evolutionary tree
Research Report submitted to complete the requirement for graduation with a Bachelor Degree in Statistics at the Department of Statistics, Faculty of Mathematics and Natural Sciences, Bogor Agricultural University
Title : Molecular Phylogenetic Tree using Three Different Methods based on p-distance Model
Author : Dea Rynanda Putri
NIM : G

Approved by:
Advisor I: Dr. Asep Saefuddin, M.Sc.
NIP
Advisor II: Prof. Dr. Muladno Basar, MSA
NIP

Acknowledged by:
Head of Department of Statistics
Dr. Ir. Hari Wijayanto, M.Si
NIP

Graduation date:
BIOGRAPHY

Dea Rynanda Putri was born in Jakarta on 27 September 1988, the daughter of Hedi Suhardi and Mary L Sutanto. She has one younger sister. She graduated from SD Kristen I BPK Penabur Jakarta in 2000 and then from SLTP Kristen II BPK Penabur Jakarta. After graduating from SMA Negeri 68 Jakarta in 2006, she continued her studies at Bogor Agricultural University through USMI. A year later, she chose Statistics as her major in the Department of Statistics, with Monetary and Actuarial Mathematics from the Department of Mathematics as her minor.

During her studies, she was active in college organizations such as the International Association of Students in Agricultural and Related Sciences (IAAS) and the IPB Debating Community (IDC). She served as secretary of IDC and as a member of the Exchange Program Department in IAAS. In 2009, she joined the International IndoMS Conference on Mathematics and Its Applications (IICMA) at Gadjah Mada University to present a research paper written in collaboration with Farid M Affendi, M.Si. In the same year, she joined the 16th Tri-U International Student Symposium at Mie University, Japan, to present a research paper. In November 2010, she is going to join the 17th Tri-U International Student Symposium at Chiang Mai University, Thailand. In February 2009, she had the opportunity to take part in an internship program at the Laboratory of Molecular Genetics, Faculty of Animal Husbandry, Bogor Agricultural University.
ACKNOWLEDGEMENTS

Thanks be to God, and many thanks to my beloved Jesus Christ, Who gives me endless chances, spirit, health, and capability, especially in finishing my research. This paper presents my research in bioinformatics. It was performed to complete a requirement for graduation with a Bachelor Degree in Statistics at the Department of Statistics, Faculty of Mathematics and Natural Sciences, Bogor Agricultural University.

I have to admit that the completion of my research would not have been possible without help from many people, from the beginning, during its progress, and until it was done. I deeply appreciate their ideas, criticism, and suggestions for improvement during the process. I would like to express my sincere gratitude to my advisors, Dr. Asep Saefuddin for his expert guidance and suggestions for this research, and Prof. Muladno for the enlightening suggestions and discussions. I would like to thank all my friends in Statistika 43, IAAS, and the Tri-U delegations for the togetherness in seeking knowledge and for their true friendship. My special gratitude goes to Apri, Anita, Boer, Defri, TW, Nadia, and Nia for all the time we passed and the nights we spent. I would like to thank all my friends in the Laboratory of Molecular Genetics who helped me during the internship program, especially Dr. Jakaria, who helped me a great deal in understanding the bioinformatics field. I am so grateful to my beloved family: Pap, Bunbun, and my only Gunz, for their never-ending love and support. Finally, I hope that this little work will be useful for all.

Bogor, July 2010
Dea Rynanda Putri
CONTENT

LIST OF FIGURE
LIST OF TABLE
LIST OF APPENDIX
INTRODUCTION
    Background
    Objective
LITERATURE REVIEW
    D-loop Mitochondrion DNA
    Evolutionary Tree
    p-distance
    Distance Matrix Method
    UPGMA
    Minimum Evolution
    Neighbor Joining
    Bootstrap
    Bootstrap Variance
METHODOLOGY
    Data Sources
    Methods
RESULT AND DISCUSSION
    Distance Matrix
    UPGMA's Performance in Reconstructing the Evolutionary Tree
    ME's Performance in Reconstructing the Evolutionary Tree
    NJ's Performance in Reconstructing the Evolutionary Tree
    Performance Comparison of UPGMA, ME, and NJ in Reconstructing the Evolutionary Tree
CONCLUSION
RECOMMENDATION
REFERENCE
LIST OF FIGURE

Figure 1 Mitochondria Region
Figure 2 Evolutionary Tree's Component
Figure 3 The Evolutionary Tree of Group A (a) and Group E (b) for 100, 1000, and 10000 repeated times respectively
Figure 4 Consensus Tree of Group C using ME for 100, 1000, and 10000 repeated times respectively
Figure 5 Consensus Tree of Group E using NJ for 100, 1000, and 10000 repeated times respectively
Figure 6 Consensus Tree of Group G using UPGMA (a), ME (b), and NJ (c) for 100, 1000, and 10000 repeated times respectively

LIST OF TABLE

Table 1 Overall Mean and Standard Error for each Group

LIST OF APPENDIX

Appendix 1 Available Information of Bison bison
Appendix 2 Nucleotide Composition
Appendix 3 Nucleotide Pair Frequencies
Appendix 4 Consistency comparison among UPGMA (a), ME (b), and NJ (c) conducted to all built cases
Appendix 5(a) Statistics of the built cases Group A, Group C, Group E, and Group F
Appendix 5(b) Statistics of the built cases Group G, Group D, and Group B
Appendix 6 Constructed evolutionary tree for Group B using UPGMA (a), ME (b), and NJ (c)
Appendix 7 Original constructed tree of Group B using UPGMA with 100, 1000, and 10000 repeated times respectively
Appendix 8 Original constructed tree of Group B using NJ with 100, 1000, and 10000 repeated times respectively
Appendix 9 Original constructed tree of Group B using ME with 100, 1000, and 10000 repeated times respectively
Appendix 10 Comparison of computational time among all methods and cases
INTRODUCTION

Background

For centuries, systematic biologists have striven to expose the natural order of living things, and for the past 150 years (since Darwin 1859) this endeavor has focused largely on inferring phylogeny. Evolution, however, cannot be observed directly: it happened only once and left behind clues as to what happened. Many methods, in addition to intuition, have been developed for phylogeny reconstruction. Early efforts to reconstruct phylogeny were based on morphological data, but as molecular characters became accessible, they were quickly integrated into phylogenetic analyses. Phylogenetic systematists use these clues to reconstruct evolutionary relationships in the form of an evolutionary tree.

Phylogenetic systematics is important for several purposes (Elfaizi 2004): (1) to explain the history of evolution, (2) to map the variation of pathogen strains for vaccines, (3) to find the causes of disease or its genetic effects, (4) to predict the function of newly found genes, (5) to analyze biodiversity, and (6) to further understand the ecology of microbes.

The reconstruction of evolutionary trees by statistical methods was initiated independently in numerical taxonomy for morphological characters and in population genetics for gene frequency data (Nei & Kumar 2000). Some of the statistical methods developed for these purposes are still used for phylogenetic analysis of molecular data, but in recent years many new methods have been developed. Several sources of information can be used in reconstructing evolutionary trees, such as characters, traits, anatomical and physiological characteristics, behaviors, or genetic sequences. New and better data can change the outcome of the evolutionary tree and show a different way in which the organisms are related.
In this paper, we used D-loop mitochondrial DNA (mtDNA) sequences. MtDNA sequence variation has been widely applied in population genetics studies of animals because of the maternal inheritance and high substitution rate of this organelle genome. The displacement loop, or D-loop, is a highly variable region of the mitochondrial genome.

With the increasing emphasis on tree reconstruction, questions arose as to how confident one should be in a given phylogenetic tree and how support for phylogenetic trees should be measured. Felsenstein (1985, cited in Soltis & Soltis 2003) formally proposed bootstrapping as a method for obtaining confidence limits on phylogenies.

D-loop mtDNA sequences of five species were used to compare the performance of three methods: UPGMA, Minimum Evolution (ME), and Neighbor-Joining (NJ). Performance was measured in two respects: computational time and consistency. Consistency was measured using the bootstrap procedure. The five species used to build the cases were Bison bison, Bos taurus, Bos indicus, Bubalus bubalis, and Capra hircus, all available in GenBank.

Objective

The main objectives of this research were:
1. To compare phylogenetic inference methods that are based on distances.
2. To describe the characteristics of each method.
3. To find out which method is more reliable in which cases.
4. To help molecular biologists determine which method is more suitable for their data.

LITERATURE REVIEW

D-loop Mitochondrion DNA

Mitochondrial DNA (mtDNA) is the DNA located in the organelles called mitochondria, structures within cells that convert the energy from food into a form that cells can use. Mitochondria, and hence mtDNA, are located in the cytoplasm of the cell. In mammals, each double-stranded circular mtDNA molecule consists of 15,000-17,000 base pairs and contains 37 genes: 13 code for proteins (polypeptides), 22 for transfer RNA (tRNA), and two for the small and large subunits of ribosomal RNA (rRNA). The D-loop occurs in the main non-coding area of the mtDNA molecule, a segment called the control region. Certain bases within the D-loop region are conserved, but large parts are highly variable.

Figure 1 Mitochondria Region

Evolutionary Tree

Phylogenetics describes the relationships among genes, proteins, or species. In phylogenetics, the objects are assumed to be evolutionarily related. The evolutionary tree is used to show the evolutionary relationships among the organisms. To build a correct evolutionary tree, correct and proper data are needed. Such data consist of (Li 2001): (1) taxa: the groups of organisms whose evolutionary relationships are of interest, and (2) characters: a list of phenotypic characteristics that differ among those groups of organisms. The components of the evolutionary tree are shown in Figure 2.

Figure 2 Evolutionary Tree's Component

There are two classes of methods for building an evolutionary tree (Nei & Kumar 2000): (1) distance methods and (2) character methods. In distance methods, or distance matrix methods, evolutionary distances are computed for all pairs of taxa, and an evolutionary tree is constructed by considering the relationships among these distance values.

p-distance

This distance is simply the proportion (p) of nucleotide sites at which the two sequences compared are different. It is obtained by dividing the number of nucleotide differences by the total number of nucleotides compared. Thus,

$$\hat{p} = \frac{n_d}{n} \qquad (1)$$

where $n_d$ is the number of nucleotide differences and $n$ is the total number of nucleotides compared. The computation of this distance is simple, and for constructing phylogenetic trees it gives essentially the same results as the more complicated distance measures, as long as all pairwise distances are small. The assumption of this model is that the rate of nucleotide substitution is the same for all evolutionary lineages.
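To make Equation 1 concrete, the following is a minimal Python sketch, not the procedure used in this research (which relied on MEGA). The function names `p_distance` and `pairwise_p_distances` and the toy sequences are hypothetical, and skipping gap positions pair by pair is an assumption made only for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: a site is compared only if neither sequence has a gap there.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    n_d = sum(1 for a, b in pairs if a != b)  # number of differing sites
    return n_d / len(pairs)                   # p = n_d / n


def pairwise_p_distances(alignment):
    """p-distance for every unordered pair of taxa in a {name: sequence} dict."""
    names = list(alignment)
    return {(x, y): p_distance(alignment[x], alignment[y])
            for i, x in enumerate(names) for y in names[i + 1:]}


# Hypothetical toy alignment, not data from this study.
toy = {"taxon1": "ACGTACGT", "taxon2": "ACGTTCGA", "taxon3": "ACGTACGT"}
distances = pairwise_p_distances(toy)
```

In the toy alignment, `taxon1` and `taxon2` differ at two of eight sites, so their p-distance is 0.25, while the identical `taxon1` and `taxon3` have a distance of 0.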
Distance Matrix Methods

In distance methods, or distance matrix methods, evolutionary distances are computed for all pairs of taxa, and an evolutionary tree is constructed by considering the relationships among these distance values. There are many different methods of constructing trees from distance data.

UPGMA

The simplest method in the distance method category is the Unweighted Pair-Group Method using Arithmetic Average (UPGMA). Sokal and Michener (1958, cited in Nei & Kumar 2000) were the first to introduce this method. A tree constructed by this method is sometimes called a phenogram, because it was originally used to represent the extent of phenotypic similarity for a group of species in numerical taxonomy. However, it can be used for constructing molecular phylogenies when the rate of gene substitution is more or less constant.

Let $d_{ij}$ stand for the distance between the $i$-th and $j$-th taxa. Clustering of taxa starts with the pair of taxa with the smallest distance. Suppose that $d_{12}$ is the smallest among all distance values. Taxa 1 and 2 are then clustered with a branch point located at distance $b = d_{12}/2$. In UPGMA we assume that the lengths of the branches leading from this branch point to taxa 1 and 2 are the same. Taxa 1 and 2 are then combined into a single composite taxon or cluster $u = (1\text{-}2)$, and the distance between $u$ and another taxon $k$ ($k \neq 1, 2$) is computed by $d_{uk} = (d_{1k} + d_{2k})/2$. This gives a new distance matrix. The algorithm continues until all taxa are grouped into one cluster.

Minimum Evolution

The principle of this method is to find the topology with the smallest value of the sum of all branch length estimates, $S$, defined as

$$S = \sum_{i=1}^{T} \hat{b}_i \qquad (2)$$

where $\hat{b}_i$ is the estimated length of the $i$-th branch and $T$ is the total number of branches. $S$ is computed for all plausible topologies. The estimated evolutionary distances between sequences $i$ and $j$, the branch lengths, and the sampling errors are represented by the column vectors $\mathbf{d}$, $\mathbf{b}$, and $\boldsymbol{\epsilon}$, respectively. Using matrix algebra, they are related by

$$\mathbf{d} = \mathbf{A}\mathbf{b} + \boldsymbol{\epsilon} \qquad (3)$$

where $\mathbf{A}$ is the topology matrix, whose entries are 1 when a branch lies on the path between a pair of sequences and 0 otherwise. The least-squares (LS) estimate of $\mathbf{b}$ is then given by

$$\hat{\mathbf{b}} = (\mathbf{A}^t\mathbf{A})^{-1}\mathbf{A}^t\mathbf{d} = \mathbf{L}\mathbf{d} \qquad (4)$$

where $\mathbf{L} = (\mathbf{A}^t\mathbf{A})^{-1}\mathbf{A}^t$. Accordingly, an estimate of the length of the $i$-th branch is

$$\hat{b}_i = \mathbf{L}_i\mathbf{d} \qquad (5)$$

where $\mathbf{L}_i$ is the $i$-th row of $\mathbf{L}$.

Neighbor-Joining

Saitou and Nei (1987, cited in Nei & Kumar 2000) developed an efficient tree-building method that is based on the minimum evolution principle. Construction of a tree by the NJ method begins with a star tree, which is produced under the assumption that there is no clustering of taxa. We then estimate the branch lengths of the star tree and compute the sum of all branches, $S_0$. This sum should be greater than the sum for the final NJ tree:

$$S_0 = \sum_{i=1}^{m} L_{iX} = \frac{T}{m-1} \qquad (6)$$

where $m$ is the total number of sequences used, $L_{iX}$ is the branch length estimate between nodes $i$ and $X$, and $T = \sum_{i<j} d_{ij}$.

In practice, since we do not know which pairs of taxa are true neighbors, we consider every pair of taxa as a potential pair of neighbors. We then choose the taxa $i$ and $j$ that show the smallest $S_{ij}$ value (Equation 7):

$$S_{ij} = \frac{2T - R_i - R_j}{2(m-2)} + \frac{d_{ij}}{2} \qquad (7)$$

where $R_i = \sum_{k} d_{ik}$ and $R_j = \sum_{k} d_{jk}$. Once the smallest $S_{ij}$ is determined, we can create a new node $Y$ that connects taxa $i$ and $j$. The branch lengths are given by the following formulas:

$$b_{iY} = \frac{(m-2)\,d_{ij} + R_i - R_j}{2(m-2)} \qquad (8)$$

$$b_{jY} = \frac{(m-2)\,d_{ij} - R_i + R_j}{2(m-2)} \qquad (9)$$

The next step is to compute the distance between the new node ($Y$) and each remaining taxon $k$:

$$d_{Yk} = \frac{d_{ik} + d_{jk} - d_{ij}}{2} \qquad (10)$$

This procedure is repeated until the final tree is produced.

Bootstrap

The bootstrap was first introduced by Efron (1979) to obtain estimates of error in nonstandard situations by resampling the data set many times to provide a distribution against which hypotheses can be tested. In 1985, Felsenstein (cited in Soltis & Soltis 2003) formally proposed bootstrapping as a method for obtaining confidence limits on phylogenies. We can indicate the tree-building process schematically as

$$\mathbf{x} \to \hat{D} \to \widehat{\mathrm{TREE}}$$

where $\mathbf{x}$ is the original data matrix and $\hat{D}$ is an estimated distance matrix. Felsenstein's method proceeds as follows. A bootstrap data matrix $\mathbf{x}^*$ is formed by randomly selecting columns, with replacement, from the original matrix.
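The column resampling step just described can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the routine used by MEGA: the function name `bootstrap_alignment`, the dictionary layout of the alignment, and the toy sequences are all hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate of a {taxon: aligned sequence} dict.

    Columns (sites) are drawn with replacement, and the same drawn columns
    are used for every taxon, so each replicate column is an intact column
    of the original alignment.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n = lengths.pop()
    # Draw n column indices with replacement.
    columns = [rng.randrange(n) for _ in range(n)]
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Hypothetical toy alignment; a seeded generator makes the draw repeatable.
toy_alignment = {"taxon1": "ACGTAC", "taxon2": "ACGTTC", "taxon3": "TCGAAC"}
replicate = bootstrap_alignment(toy_alignment, random.Random(1))
```

Each replicate keeps the number of taxa and the alignment length of the original data, so the same distance and tree-building steps can be applied to it unchanged.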
Then the original tree-building algorithm is applied to the bootstrap data matrix $\mathbf{x}^*$, giving a bootstrap tree:

$$\mathbf{x}^* \to \hat{D}^* \to \widehat{\mathrm{TREE}}^*$$

Then, the proportions of bootstrap trees agreeing with the original tree are calculated. These proportions are the bootstrap confidence values.

Bootstrap Variance

In this paper, the bootstrap method was also used to compute the variances of the distance measure. The procedure for resampling nucleotide sequences of $n$ base pairs is the same as that introduced above: the random sample is produced by resampling the nucleotide sites (columns) with replacement. Once the bootstrap data set is obtained, the distance estimates are computed using Equation 1 for each pair of sequences. This procedure is repeated $B$ times. We denote by $\hat{d}_b$ the value for the $b$-th bootstrap replication. The bootstrap variance is then computed by

$$V_B(\hat{d}) = \frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{d}_b - \bar{d}\right)^2 \qquad (11)$$

where $\bar{d}$ is the mean of $\hat{d}_b$ over all replications. One assumption often made for the bootstrap is that all sites evolve independently. This assumption of course does not hold in the present case. However, if the number of sites examined is large, as in the present case, the effect of violating the assumption is not important, because most sites with different evolutionary rates will be represented in each bootstrap sample.

METHODOLOGY

Data Sources

For this research, the complete mtDNA genome data from five common species of animals were obtained free of charge from GenBank. The data were accessed on April 2. The species were: Bison bison, Bos taurus, Bos indicus, Bubalus bubalis, and Capra hircus. The location of the D-loop region was obtained from the available record information, as shown in Appendix 1.

Methods

The procedures conducted for this research were:
1. Downloaded the complete mtDNA sequences from GenBank. The available data sets were:
   a. Bison bison (2)
   b. Bos taurus (10)
   c. Bos indicus (3)
   d. Bubalus bubalis (3)
   e. Capra hircus (7)
   Numbers in brackets show the number of sequences downloaded from GenBank.
2. Reduced the data from the complete mtDNA sequence to the D-loop region only, for all sequences used. The D-loop mtDNA sequence is around 1,122 base pairs long (before the gaps were edited).
3. Aligned all data sets. This is necessary so that the sequences compared have the same number of nucleotides.
4. Deleted all gaps in the data sets. Insertions and deletions introduce gaps in the DNA sequence alignment during the alignment procedure. After gap deletion, the length of the D-loop mtDNA sequence was reduced to 882 base pairs.
5. Built the cases by forming groups of taxa:
   a. Group A consists of: Bison bison (2), Bos taurus (2), Bos indicus (2), Bubalus bubalis (2), Capra hircus (2).
   b. Group B consists of: Bison bison (2), Bos taurus (10), Bos indicus (3), Bubalus bubalis (3), Capra hircus (7).
   c. Group C consists of: Bison bison (2), Bos taurus (1), Bos indicus (1), Bubalus bubalis (1), Capra hircus (1).
   d. Group D consists of: Bison bison (1), Bos taurus (10), Bos indicus (1), Bubalus bubalis (1), Capra hircus (1).
   e. Group E consists of: Bison bison (1), Bos taurus (1), Bos indicus (3), Bubalus bubalis (1), Capra hircus (1).
   f. Group F consists of: Bison bison (1), Bos taurus (1), Bos indicus (1), Bubalus bubalis (3), Capra hircus (1).
   g. Group G consists of: Bison bison (1), Bos taurus (1), Bos indicus (1), Bubalus bubalis (1), Capra hircus (7).
   Numbers in brackets show the number of sequences used to build each case. The sequences of a species used in a group were selected at random from the available sequences.
6. Constructed the UPGMA, ME, and NJ trees based on the p-distance model. The standard errors of the overall mean of the estimated distances for all groups were computed using the bootstrap procedure with 1000 replications.
7. Checked the reliability of each method using the bootstrap procedure, repeated 100, 1000, and 10000 times.
All procedures were conducted using MEGA.

RESULT AND DISCUSSION

Distance Matrix

Table 1 shows the overall mean of the estimated distances for all groups. The standard error under this model is relatively small and constant. The standard error was computed using the bootstrap procedure with 1000 replications. Given these results, the p-distance model is reliable enough to be used in constructing the evolutionary tree.

Table 1 Overall Mean and Standard Error for each Group (columns: Group, Mean, S.E.; rows: Groups A to G)

UPGMA's Performance in Reconstructing the Evolutionary Tree

The reliability of UPGMA is good on average in all conditions. In group C and group F, UPGMA shows stable topologies as the number of bootstrap replications changes. UPGMA failed to construct a reliable topology describing the relationship between BosIndicus(1) and BosIndicus(2) in group A and group E. The bootstrap confidence value went down when the number of replications changed from 100 to 1000 (from 100% to 99%), but stayed constant at 99% when it changed from 1000 to 10000, as can be seen in Figure 3.

Figure 3 The Evolutionary Tree of group A (a) and group E (b) for 100, 1000, and 10000 repeated times respectively

Of the seven groups used to compare the reliability of each method, only two were stable when the number of replications changed from 100 to 1000, while four were consistent when it changed from 1000 to 10000. No conclusion could be drawn for the reconstruction of group B, where the bootstrap confidence value went up and down as the number of replications changed. However, the topology relating the sequences did not change.

ME's Performance in Reconstructing the Evolutionary Tree

The reliability of ME is very low, especially as the number of sequences used increases. In group B and group D, where the numbers of sequences used are 25 and 14 respectively, the consistency shown was low (Appendix 4). Like UPGMA, ME could not attain a stable bootstrap confidence value in group B. In fact, the value went down as the number of replications changed (Appendix 9). ME shows consistent performance only in group F, while in group C, where the number of sequences is only six, instability occurred in two nodes. The first is the interior branch that joins BosIndicus-BosTaurus, whose value decreased from 100% to 99% as the number of replications changed. The second is the node that relates BisBison(1)-BisBison(2) to BosIndicus-BosTaurus, whose value likewise decreased from 100% to 99%.

NJ's Performance in Reconstructing the Evolutionary Tree

The reliability of the NJ method increases as the number of replications increases in almost all cases. In group A, when the number of replications moves from 100 to 1000, the bootstrap confidence value for the relationship between BosIndicus(1)-BosIndicus(2) goes down from 100% to 99%, but stays steady at 99% when the number of replications moves to 10000. The same thing happened in group E, at the node of BosIndicus(1)-BosIndicus(2).
Appendix 4 shows that the number of consistent clades in NJ varied when the number of replications changed from 100 to 1000, but was mostly steady when it changed from 1000 to 10000. This means that when the variation among the sequences used is high, 1000 replications are sufficient to assess the reliability of NJ.

Figure 4 Consensus Tree of group C using ME for 100, 1000, and 10000 repeated times respectively

Figure 5 Consensus Tree of group E using NJ for 100, 1000, and 10000 repeated times respectively

NJ also failed to construct a reliable evolutionary tree for group B. It failed to maintain the bootstrap confidence value across all numbers of replications, but compared to ME and UPGMA, NJ was the most reliable method for constructing the evolutionary tree of group B.

Performance Comparison of UPGMA, ME, and NJ in Reconstructing the Evolutionary Tree

Compared to the others, ME has the longest computational time (Appendix 10), while UPGMA has the shortest. This may be due to the computational iteration in ME, where all possible topologies are constructed one by one to find the topology with the smallest value of $S$, while UPGMA simply clusters, at each step, the two taxa with the smallest genetic distance.

Appendix 4 shows the consistency comparison among the UPGMA, ME, and NJ methods as the number of replications changes from 100 to 1000 and from 1000 to 10000, applied to all built cases. From the graphs in Appendix 4, we can see that UPGMA has the most consistent values compared to ME and NJ for almost every group, while the evolutionary tree produced by the NJ method starts to show consistency when the number of replications is 1000.

All methods failed to reconstruct a reliable evolutionary tree for group B (see Appendix 5). NJ shows slightly different topological relationships from UPGMA, but both failed to give consistent bootstrap confidence values for the topologies. Because of this, further study of this case is needed. Appendix 5 shows the means and variances of the nucleotide composition for all built cases. Compared to other cases, the nucleotide variance for group B was relatively small for each nucleotide. The same thing happened in group D, where the nucleotide variances for T(U), C, A, and G are respectively 0.192, 0.169, 0.207, and [value missing]. Those are relatively small compared to group F, which has nucleotide variances of respectively 0.298, 0.480, 1.367, and [value missing].

In group G, all methods showed inconsistency in constructing the topologies among Capra hircus (Figure 6), especially in describing the relationship between CapHircus(2) and CapHircus(1)-CapHircus(4).
This may be caused by the nucleotide composition of CapHircus(1), CapHircus(2), and CapHircus(4), which differ slightly in Cytosine (C) and Guanine (G). The percentage of Cytosine in CapHircus(1) and CapHircus(4) is 26.4%, while in CapHircus(2) it is 26.3%. Conversely, the percentage of Guanine in CapHircus(1) and CapHircus(4) is 15.5%, while in CapHircus(2) it is 15.6%.

Figure 6 Consensus Tree of CapHircus using UPGMA (a), ME (b), and NJ (c) for 100, 1000, and 10000 repeated times respectively

CONCLUSION

Under the assumption that the nucleotide substitution rate is the same for all evolutionary lineages, UPGMA is the most consistent distance method, followed by NJ, with ME last. UPGMA is a good distance method when the interest is limited to classifying the sequences and to the total branch length. Because the branch lengths from a branch point to its nearest taxa are assumed to be equal in UPGMA, this method is not appropriate when the evolutionary distance of individual sequences is of interest. NJ is a consistent distance method when the number of bootstrap replications is not less than 1000, and it is a good distance method when information about the evolutionary distances among sequences is wanted.

When the number of sequences is large and the extent of sequence divergence is low, the realized tree may have many interior branches with zero length unless a large number of nucleotides are examined. In this case, as shown in group B, it is generally difficult to reconstruct the true tree by any method. There is then no need to examine the methods in detail, because the tree would not be reliable whichever tree-building method is used.

It is now clear that no method is superior to the others in all conditions: some methods perform better than others under certain conditions but worse under other conditions. Therefore, even if many interior branches of a tree are not well supported by the bootstrap, the tree should not be discarded. It is a hypothetical tree, but it could be a correct one.

RECOMMENDATION

This research is based on many assumptions and is subject to several limitations. If the assumptions and boundaries can be relaxed, better results could be expected. Recommendations for further research are:
1. The pairwise distance model in this research assumes that the nucleotide substitution rate is the same for all evolutionary lineages. This might not reflect the real condition of mtDNA sequences, which may vary in mutation and/or substitution rate. A distance model that reflects the real mutation and/or substitution rate in the sequences might give better results.
2. It would be interesting to apply other current statistical methods to estimate the parameter (pairwise distance), for example the Bayesian or maximum likelihood method, in order to find the most reliable method.

REFERENCE

[Anonymous]. DNA Mitokondria. DNAmitokondria& [May 18, 2010].
[Anonymous]. Understanding Evolution. icle/phylogenetics_05 [May 18, 2010].
Backeljau T et al. Multiple UPGMA and Neighbor-Joining Trees and the Performance of Some Computer Packages. Mol Biol Evol 13(2): g [March 23, 2010].
Efron B, Tibshirani R. An Introduction to the Bootstrap. New York: Chapman & Hall.
Elfaizi MA, Aprijani DA. 2004. Bioinformatika: Perkembangan, Disiplin Ilmu, dan Penerapannya di Indonesia. http// rg/copyleft/fdl.html. [January 26, 2010].
Ewens WJ, Grant GR, Dietz K, editor. Statistical Methods in Bioinformatics: An Introduction. New York: Springer-Verlag.
Gascuel O, Bryant D, Denis F. Strengths and Limitations of the Minimum Evolution Principle. Sys Biol 50(5): [April 21, 2010].
Holmes S. Bootstrapping Phylogenetic Trees: Theory and Methods. Stat Sci 18(2):
Husmeier D. A Brief Tutorial on Phylogenetics. [March 22, 2010].
Li Yan. 2001. How to Build a Phylogenetic Tree. [March 22, 2010].
Nei M, Kumar S. 2000. Molecular Evolution and Phylogenetics. New York: Oxford University Press.
Singh K, Xie M. Bootstrap: A Statistical Method. gers.edu/~mxie/ RCPaRCPa/bootstrap.pdf. [May 25, 2010].
Soltis SP, Soltis DE. 2003. Applying the Bootstrap in Phylogeny Reconstruction. Stat Sci 18(2):
Appendix 1 Available Information of Bison bison

LOCUS       NC_ bp DNA circular MAM 13-APR-2009
DEFINITION  Bison bison mitochondrion, complete genome.
ACCESSION   NC_
VERSION     NC_ GI:
DBLINK      Project:36339
KEYWORDS    .
SOURCE      mitochondrion Bison bison (American bison)
  ORGANISM  Bison bison
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Laurasiatheria; Cetartiodactyla; Ruminantia; Pecora; Bovidae; Bovinae; Bison.
REFERENCE   1 (bases 1 to 16319)
FEATURES             Location/Qualifiers
     source          /organism="Bison bison"
                     /organelle="mitochondrion"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9901"
                     /breed="American"
     D-loop          join( ,1..360)
     tRNA            /product="tRNA-Phe"
     rRNA            /product="s-rRNA"
     CDS             /gene="cytb"
                     /codon_start=1
                     /transl_table=2
                     /product="cytochrome b"
                     /protein_id="YP_ "
                     /db_xref="GI: "
                     /db_xref="GeneID: "
                     /translation="MTNLRKSHPLMKIVNNAFIDLPAPSNISSWWNFGSLLGMCLTLQ
                     ILTGLFLAMHYTSDTTTAFSSVAHICRDVNYGWIIRYMHANGASMFFICLYMHVGRGL
                     YYGSYTFLETWNIGVILLLTVMATAFMGYVLPWGQMSFWGATVITNLLSAIPYIGTNL
                     VEWIWGGFSVDKATLTRFFAFHFILPFIIMAIAMVHLLFLHETGSNNPTGISSDMDKI
                     PFHPYYTIKDILGALLLILALMLLVLFTPDLLGDPDNYTPANPLNTPPHIKPEWYFLF
                     AYAILRSIPNKLGGVLALAFSILILALIPLLHTSKQRSMIFRPLSQCLFWTLVADLLT
                     LTWIGGQPVEHPYIIIGQMASIMYFLLILVLMPTAGTIENKLLKW"
//
   1 actaatgact aatcagccca tgctcacaca taactgtgct gtcatacatt tggtattttt
  61 ttattttggg ggatgcttgg actcagctat ggccgtcaaa ggccctgacc cggagcatct
 121 attgtagctg gacttaactg caccttgagc accagcataa tggtaagcat gcacatatag
 181 tcaatggtta caggacataa ctgtattata tatccccccc tccataaaaa ttccccctta
 241 aatatttacc actgctttta acagattttt ccctagttac ctatttaaat tttccacact
 301 ttcaatactc aaattagcac tccatataaa gtcaatatat aaacgcaggc cccccccccc
 361 cgttgatgta gcttaaccca aagcaaggca ctgaaaatgc ctagatgagt ctcccaactc
1861 aataaatctc actgtaactt taaaagttaa tctaaaaagg tacagccttt tagaaacgga
1921 tacaaccttg actagagagt aaaatataac actaccatag taggcccaaa agcagccacc
1981 aattgagaaa gcgttaaagc tcaacaacaa aaattaaaca gatcccaata acaagtaatt
2041 aactcctagc cccaatactg gactaatcta ttattgaata gaagtaataa tgttagtatg
2101 agtaacaaga aaaactttct ccttgcataa gtctaagtca gtatctgata atactctgac
     cattaatgta ataaaaacat attatgtata tagtacatta aattatatgc cccatgcata
     taagcaagta cttatcctct attgacagta catagtacat aaagttatta attgtacata
     gcacattatg tcaaatctac ccttggcaac atgcatatcc cttccattag atcacgagct
     taattaccat gccgcgtgaa accagcaacc cgctaggcag aggatccctc ttctcgctcc
     gggcccatga accgtggggg tcgctattta atgaacttta tcagacatct ggttctttct
     tcagggccat ctcacctaaa atcgcccatt ctttcctctt aaataagaca tctcgatgg
Appendix 2 Nucleotide Composition

Sequence         T(U)   C   A   G
BisBison(1)
BisBison(2)
BosIndicus(1)
BosIndicus(2)
BosIndicus(3)
BosTaurus(1)
BosTaurus(2)
BosTaurus(3)
BosTaurus(4)
BosTaurus(5)
BosTaurus(6)
BosTaurus(7)
BosTaurus(8)
BosTaurus(9)
BosTaurus(10)
BubBubalis(1)
BubBubalis(2)
BubBubalis(3)
CapHircus(1)
CapHircus(2)
CapHircus(3)
CapHircus(4)
CapHircus(5)
CapHircus(6)
CapHircus(7)
Avg

Appendix 3 Nucleotide Pair Frequencies

Domain   ii   si   sv   R   TT   TC   TA   TG   CC   CA   CG   AA   GG   Total
avg
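The quantities tabulated in Appendix 2 and Appendix 3 can be illustrated with a minimal Python sketch. It assumes that ii, si, and sv carry their usual meaning in this kind of output (identical, transitional, and transversional site pairs, with R = si/sv), which the appendix itself does not spell out. The function names and the two toy sequences are hypothetical and are not taken from the study's data. Under those assumptions, the p-distance between two aligned sequences is the proportion of compared sites that differ, (si + sv) / (ii + si + sv).

```python
from collections import Counter

BASES = "TCAG"
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def normalize(seq):
    """Upper-case a sequence and treat U as T."""
    return seq.upper().replace("U", "T")


def composition(seq):
    """Percentage of T(U), C, A and G among the unambiguous bases of one sequence."""
    counts = Counter(base for base in normalize(seq) if base in BASES)
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in BASES}


def pair_counts(seq1, seq2):
    """Count identical (ii), transitional (si) and transversional (sv) site pairs."""
    ii = si = sv = 0
    for a, b in zip(normalize(seq1), normalize(seq2)):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguous symbols
        if a == b:
            ii += 1
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            si += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            sv += 1  # purine<->pyrimidine
    return ii, si, sv


def p_distance(seq1, seq2):
    """Proportion of compared sites at which the two sequences differ."""
    ii, si, sv = pair_counts(seq1, seq2)
    return (si + sv) / (ii + si + sv)


if __name__ == "__main__":
    # Hypothetical aligned toy sequences, for illustration only.
    s1 = "ACGTACGTAC"
    s2 = "ACGTATGTAA"
    print(composition(s1))
    print(pair_counts(s1, s2))
    print(p_distance(s1, s2))
```

Sites containing gaps or ambiguous symbols are skipped pairwise here; other treatments, such as deleting those columns from the whole alignment first, are equally possible and would change the counts.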
Appendix 4 Consistency comparison among UPGMA (a), ME (b), and NJ (c) across all built cases

[Figure: one panel per method (UPGMA, ME, NJ) for each of Group A, Group B, Group C, Group D, Group E, Group F, and Group G]
Appendix 5(a) Statistics of the built cases Group A, Group C, Group E, and Group F

                  Group A      Group C      Group E      Group F
Relationship      constant     constant     constant     constant
UPGMA-NJ-ME UPGMA-NJ, ME - - UPGMA - ME - NJ
UPGMA             (1)          constant     (1)          constant
NJ                (1)          (1)          (1)          constant
ME                (2)          (2)          (2)          constant
No. of Sequence
Sequence Length
Description       T(U) C A G   T(U) C A G   T(U) C A G   T(U) C A G
avg
var
Appendix 5(b) Statistics of the built cases Group G, Group D, and Group B

                  Group G      Group D      Group B
Relationship      constant     varied       unstable
UPGMA-NJ-ME - UPGMA, NJ-ME -
UPGMA             (4)          (5)          (1)
NJ                (3)          (6)          (1)
ME                (4)          (7)          (2)
No. of Sequence
Sequence Length
Description       T(U) C A G   T(U) C A G   T(U) C A G
avg
var
Appendix 6 Constructed evolutionary tree for Group B using UPGMA (a), ME (b), and NJ (c)

(a) Leaf order: BosTaurus(3), BosTaurus(4), BosTaurus(6), BosTaurus(7), BosTaurus(1), BosTaurus(2), BosTaurus(5), BosIndicus(2), BosTaurus(10), BosIndicus(1), BosTaurus(9), BosIndicus(3), BosTaurus(8), BisBison(1), BisBison(2), BubBubalis(3), BubBubalis(1), BubBubalis(2), CapHircus(7), CapHircus(6), CapHircus(3), CapHircus(5), CapHircus(2), CapHircus(1), CapHircus(4)

(b) Leaf order: BosTaurus(6), BosTaurus(7), BosTaurus(1), BosTaurus(3), BosTaurus(4), BosTaurus(2), BosTaurus(5), BosIndicus(2), BosTaurus(10), BosIndicus(1), BosTaurus(9), BosIndicus(3), BosTaurus(8), BisBison(1), BisBison(2), BubBubalis(3), BubBubalis(1), BubBubalis(2), CapHircus(7), CapHircus(6), CapHircus(3), CapHircus(1), CapHircus(4), CapHircus(2), CapHircus(5)

(c) Leaf order: BosTaurus(3), BosTaurus(4), BosTaurus(6), BosTaurus(7), BosTaurus(1), BosTaurus(2), BosTaurus(5), BosIndicus(2), BosTaurus(10), BosIndicus(1), BosTaurus(9), BosIndicus(3), BosTaurus(8), BisBison(1), BisBison(2), BubBubalis(3), BubBubalis(1), BubBubalis(2), CapHircus(7), CapHircus(6), CapHircus(3), CapHircus(5), CapHircus(2), CapHircus(1), CapHircus(4)
Appendix 7 Original constructed tree of Group B using UPGMA with 100, 1000, and 10000 bootstrap repetitions, respectively

Appendix 8 Original constructed tree of Group B using NJ with 100, 1000, and 10000 bootstrap repetitions, respectively

Appendix 9 Original constructed tree of Group B using ME with 100, 1000, and 10000 bootstrap repetitions, respectively
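What is repeated 100, 1000, or 10000 times in Appendices 7 to 9 can be sketched as follows. This is a minimal illustration of the column-resampling step of Felsenstein's bootstrap, assuming an alignment held as a dictionary of equal-length strings. The function names, the seed parameter, and the toy alignment are hypothetical. The tree-building step applied to each replicate (UPGMA, ME, or NJ) is left out.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling alignment columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    # Draw as many column indices as the alignment has sites; repeats are allowed.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in columns) for name in names}


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Return n_replicates resampled alignments, reproducible for a given seed."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = {
        "taxon1": "ACGTACGTAC",
        "taxon2": "ACGTATGTAA",
        "taxon3": "ACGAATGTAA",
    }
    replicates = bootstrap_replicates(toy, 100)
    print(len(replicates), replicates[0])
```

A tree would then be built from each replicate with the same method as the original tree, and the support of a branch is the share of replicate trees in which the same grouping appears.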
Appendix 10 Comparison of Computational Time Among All Methods and Cases

Group     Method   Constructed Tree   Bootstrap Tree
Group A   UPGMA    01.1 s             01.3 s   01.5 s   05.9 s
          ME       01.5 s             01.7 s   01.9 s   07.5 s
          NJ       01.2 s             01.5 s   01.7 s   05.9 s
Group B   UPGMA    01.3 s             01.5 s   03.4 s   21.8 s
          ME       01.6 s             01.7 s   04.1 s   31.1 s
          NJ       01.4 s             01.6 s   03.4 s   22.6 s
Group C   UPGMA    01.3 s             01.4 s   01.6 s   03.9 s
          ME       01.3 s             01.7 s   03.0 s   04.4 s
          NJ       01.3 s             01.5 s   01.7 s   03.9 s
Group D   UPGMA    01.4 s             01.4 s   01.9 s   08.7 s
          ME       01.5 s             01.5 s   03.7 s   12.1 s
          NJ       01.4 s             01.5 s   03.2 s   08.7 s
Group E   UPGMA    01.4 s             01.5 s   01.7 s   04.3 s
          ME       01.5 s             01.6 s   01.9 s   04.8 s
          NJ       01.3 s             01.5 s   01.7 s   04.2 s
Group F   UPGMA    01.3 s             01.4 s   01.6 s   04.3 s
          ME       01.5 s             01.5 s   02.8 s   04.7 s
          NJ       01.2 s             01.4 s   03.2 s   04.2 s
Group G   UPGMA    01.4 s             01.5 s   01.8 s   06.3 s
          ME       01.5 s             01.6 s   02.8 s   08.2 s
          NJ       01.4 s             01.4 s   01.8 s   06.6 s