Genomic Comparison of Bacterial Species Based on Metabolic Characteristics

Gaurav Jain 1, Haozhu Wang 1, Li Liao 1*, E. Fidelma Boyd 2*

1 Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
2 Department of Biological Sciences, University of Delaware, Newark, DE 19716, USA
* Corresponding authors.

Abstract

In this work, we developed a novel method to generate comparison trees based on characteristics collected from metabolic networks of bacteria. We characterize each bacterial genome's metabolism by the occurrence frequencies of various chemical reactions classified by enzyme commission numbers, and by the correlation of the reaction types for any two consecutive reactions in pathways present in the networks. Hypothesizing that species physiologically close to each other should show high similarity in these characteristics, we quantitatively measure the similarity using the Pearson correlation coefficient and build comparison trees using the Neighbor-Joining algorithm. These Metabolic Characteristics (MC) based comparison trees cluster the bacteria according to their functional groups and reveal the relationship between different organisms from a physiological perspective, yielding new insights about the organisms.

Keywords: genome comparison; metabolism; Pearson correlation

I. INTRODUCTION

Classifying organisms into an ordered scheme is important for understanding the fundamental biology of life. Traditionally, trees based on 16S rRNA sequences have been the main tool for studying the molecular phylogeny of bacteria. The advent of new techniques such as metabolic network reconstruction and simulation has opened new avenues for comparing organisms. Most metabolic reactions critical for the proper functioning of cells are catalyzed by enzymes [7]. However, not all enzymes occur in all species, nor are they equally important in each species. Furthermore, within the phylogeny of each species, the occurrence of enzymes is heterogeneous [5]. Because enzymes are inherent parts of metabolic networks, we can generate comparison trees based on the enzymes' phylogenetic properties to look at the relationship between different organisms from a physiological perspective.

As more bacterial genomes are sequenced and the metabolic pathways of these organisms are reconstructed, it becomes possible to compare organisms from a biochemical-physiological perspective. Such comparisons may yield novel insights into the evolution of metabolic pathways and may be relevant to metabolic engineering of industrial microbes. Studies in this direction focusing on individual pathways have been attempted [1, 8].

In contrast to the classical view of metabolism, where relatively isolated sets of reactions or metabolic pathways allow the synthesis and degradation of compounds, the new perspective views metabolic pathway components such as substrates, products, cofactors, and enzymes as parts of a single whole network. Because some functional properties, such as the small distance between reactions from different pathways, are visible only when metabolism is analyzed from a network perspective, it becomes less meaningful to define metabolism as just isolated pathways [11]. A metabolic network consists of all chemical transformations or reactions involved in metabolism in the cell, with the metabolites being interconnected by enzyme-catalyzed reactions. Many enzymes are common to numerous species, while others occur in only a few. Phylogenetic analysis of these metabolic components (substrates, products, cofactors, and enzymes) may expand the understanding of the evolutionary processes [13].

To study metabolism as a whole, there are two complementary ways to represent the metabolic network. First, metabolism can be represented as a compound-centric network, wherein nodes (substrates and products) participating in the same reaction are connected. Second, metabolism can be represented as an enzyme-centric network, where nodes (enzymes) producing a compound are connected with nodes consuming the same compound.

In this work, we developed a novel method to collect and utilize characteristics embedded in enzyme-centric metabolic pathway networks, and to generate comparison trees for genomes of interest. These metabolic characteristics (MC) based comparison trees cluster the organisms according to their functional groups and reveal the relationship between different organisms from a physiological perspective, yielding new insights about the organisms. We showed that where the 16S rRNA tree failed to capture some major metabolic differences between the organisms, the MC-based method captured them. Our simple and
accurate approach was able to capture the functional properties of different groups, such as pathogens and non-pathogens, and to cluster them in ways not seen in the traditional 16S rRNA tree, suggesting that there are differences in metabolic capabilities between the organisms.

II. METHOD AND DATA

A. Construction of enzyme-centric metabolic networks

We constructed metabolic comparison trees by combining information about all the enzyme-catalyzed metabolic reactions in bacterial species selected from the KEGG database [5] (as of 30 June 2008). This is currently one of the best available comprehensive databases for examining metabolic pathways, along with other more deeply annotated and specialized databases such as EcoCyc [6].

Bacteria were chosen for three reasons:
1. Bacterial metabolism is reasonably well understood, which allows us to identify the roles of enzymes more clearly and reliably.
2. Bacteria allow us to get a better estimate of the phylogenetic profile and overall topological positions of individual enzymes, because their phylogeny is widely studied.
3. Limiting the investigation to a single major group of organisms removes the confusion that might arise if representatives of several major organism types were examined, since each major group is likely to have metabolic characteristics peculiar to it.

We developed a framework, shown in Figure 1, to obtain the distance matrix used for the construction of comparison trees. The steps are as follows.

Figure 1. A flow chart of the steps to construct the distance matrix.

Step 1: We determined the number of different enzymes (identified by EC numbers) occurring in the selected bacterial species or strains from the XML files in the KEGG database.

Step 2: We then calculated the frequency of each reaction type (EC: a.b) for all the selected species or strains, and generated a histogram (Figure 2) to see the distribution of these reaction types.

Figure 2. Frequencies of reaction types in the Vco-Vibrio cholerae O1 metabolic network.

Step 3: We generated a meta-table listing all the reactions involved in metabolism, each with its corresponding reaction type (EC: a.b), for all the bacterial species and strains in the KEGG database.

Step 4: For each organism or strain, we created a list of reactions with their reaction types from the XML files in the KEGG database and the meta-table generated in the previous step.

Figure 3. (a) A snapshot of the glycolysis metabolic pathway from the KEGG database, and (b) a schematic diagram for extracting a link between reaction types.

Step 5: For each organism or strain, we then created a correlation matrix (Figure 4) containing the z-values (see below) of the frequency with which a reaction type (EC: a.b) is followed by another reaction type (EC: x.y). We defined a link between two enzymes that participate in two successive reactions such that the product of one is the substrate of the other. If reaction R1 produces a compound A and A is the substrate of R2, a directed link from the EC number of R1 to the EC number of R2 is established. For reversible reactions, a second link from the EC number of R2 to the EC number of R1 is added, as shown in Figure 3. Reference [9] discussed visualizing metabolic pathways as a useful framework for supporting the determination of gene functions.

A z-score is a dimensionless quantity derived by subtracting the population mean from an individual raw score and then dividing the difference by the population standard deviation. This conversion process is called normalizing. The z-score is calculated using the formula

$$Z = \frac{x - \mu}{\sigma}, \qquad (1)$$

where $x$ is the raw score to be standardized, $\sigma$ is the standard deviation of the population, and $\mu$ is the mean of the population. The score indicates the number of standard deviations by which an observation lies above or below the mean. The quantity $Z$ is negative when the raw score is below the mean and positive when it is above.

Figure 4. Correlation matrix of consecutive reaction types for a species, represented as a heat map.

Step 6: Finally, the correlation matrices of all the organisms and strains are used to create the distance matrix using the Pearson correlation coefficient. A correlation is a number that measures the degree of association between two variables (X and Y). A positive value for the correlation implies a positive association (large values of X tend to be associated with large values of Y, and small values of X with small values of Y). A negative value implies a negative or inverse association (large values of X tend to be associated with small values of Y, and vice versa). The Pearson correlation coefficient, in its standard form, is calculated as follows:

$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}}, \qquad (2)$$

where $X_i$ and $Y_i$ are the $n$ paired values of the two variables and $\bar{X}$ and $\bar{Y}$ are their means. The correlation coefficient is always between -1 and +1. The closer the correlation is to ±1, the closer the relationship is to perfectly linear. The Pearson correlation coefficient can therefore measure, to a certain degree, the similarity between two species in terms of their metabolic characteristics. To conform to the requirements for constructing comparison trees, the similarity measured by the Pearson correlation coefficient is converted to a distance between the two species by subtracting it from one ($d = 1 - r$). A distance matrix is thus created for all species that are to be compared.

Note that the reaction type frequencies and correlations can also be used for detecting duplicated genes, yielding an insightful perspective on the evolution of metabolism [2]. In this work, we focus on using such information to compare genomes in a larger context, as an alternative to the 16S rRNA approach.

B. Comparison tree construction

The final comparison trees are generated in the following steps (Figure 5).

Step 1: We used the program NEIGHBOR from the PHYLIP package [10], which implements the Neighbor-Joining method of Saitou and Nei (1987) and the UPGMA method of clustering. NEIGHBOR constructs the tree but does not rearrange the nodes. The Neighbor-Joining method does not assume an evolutionary clock, so in effect it produces an unrooted tree.

Step 2: We passed the output tree description files from NEIGHBOR to RETREE [10], an interactive tree-rearrangement program from the same package, to produce the final tree.

Figure 5. A flow chart of the steps in generating the comparison tree from the distance matrix.
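
The distance-matrix steps of Section II.A can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The reaction records (dicts with `ec`, `substrates`, `products`, and `reversible` keys) and all function names are hypothetical. The sketch takes "EC: a.b" to mean the first two fields of an EC number, adds the reverse link whenever either reaction of a pair is flagged reversible (one possible reading of the text), computes z-scores over all cells of one organism's link matrix, and takes the distance to be one minus the Pearson coefficient. The last function writes the matrix in the usual PHYLIP square layout (species count on the first line, names padded to ten characters), which is assumed here to be the input given to NEIGHBOR.

```python
from collections import Counter
from itertools import permutations
from math import sqrt
from statistics import mean, pstdev


def reaction_type(ec_number):
    """Truncate a full EC number such as '2.7.1.1' to its type 'a.b' ('2.7')."""
    a, b = ec_number.split(".")[:2]
    return f"{a}.{b}"


def type_frequencies(ec_numbers):
    """Step 2: count how often each reaction type occurs in one organism."""
    return Counter(reaction_type(ec) for ec in ec_numbers)


def link_counts(reactions):
    """Step 5 (links): count directed links between consecutive reaction types.

    R1 -> R2 is linked when a product of R1 is a substrate of R2.
    Assumption: the reverse link is added if either reaction is reversible.
    """
    counts = Counter()
    for r1, r2 in permutations(reactions, 2):
        if set(r1["products"]) & set(r2["substrates"]):
            t1, t2 = reaction_type(r1["ec"]), reaction_type(r2["ec"])
            counts[(t1, t2)] += 1
            if r1["reversible"] or r2["reversible"]:
                counts[(t2, t1)] += 1
    return counts


def z_profile(counts, types):
    """Step 5 (z-values): flatten the type-by-type matrix and standardize it.

    Assumption: the population is every cell of this organism's matrix.
    """
    cells = [counts.get((a, b), 0) for a in types for b in types]
    mu, sigma = mean(cells), pstdev(cells)
    if sigma == 0:
        return [0.0] * len(cells)
    return [(x - mu) / sigma for x in cells]


def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Raises ZeroDivisionError if either profile is constant (r is undefined).
    """
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def distance_matrix(profiles):
    """Step 6: pairwise distance d = 1 - r, clamped at zero against rounding."""
    names = sorted(profiles)
    matrix = [
        [max(0.0, 1.0 - pearson(profiles[a], profiles[b])) for b in names]
        for a in names
    ]
    return names, matrix


def to_phylip(names, matrix):
    """Format a square distance matrix in PHYLIP layout (10-character names)."""
    lines = [f"{len(names):5d}"]
    for name, row in zip(names, matrix):
        lines.append(name[:10].ljust(10) + " ".join(f"{d:.6f}" for d in row))
    return "\n".join(lines) + "\n"
```

In use, one would build a `z_profile` for every organism over the same ordered list of reaction types, pass the resulting dictionary to `distance_matrix`, and save the output of `to_phylip` as the input file for the tree-building step. Fixing the list of reaction types across organisms matters, because the Pearson coefficient compares the profiles cell by cell.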

III. RESULTS

A. Comparison of the bacterium Escherichia coli and its strains

Escherichia coli is a widely studied intestinal bacterium and an ideal platform for understanding cell genomics and metabolic capabilities. Because 16S rRNA has several regions containing highly conserved sequences, slight differences in the 16S rRNA sequences from different organisms can be used to determine their phylogenetic relationships. We constructed the 16S rRNA tree for the 10 E. coli strains. The 16S ribosomal sequences were downloaded from the NCBI website in FASTA format and aligned using ClustalW multiple sequence alignment. The tree was then generated using the neighbor-joining algorithm and rendered in TREEVIEW. It is clear from the tree, shown in Figure 6, that the E. coli strains have highly similar 16S rRNA sequences, as indicated by the tight clustering with short branch lengths.

Figure 6. 16S rRNA tree for E. coli strains with Yersinia pestis CO92 as outgroup.

We constructed a metabolic characteristics (MC) based comparison tree, shown in Figure 7, by combining information about all enzyme-catalyzed metabolic reactions from the KEGG database (30 June 2008) for our 10 E. coli representatives.

Figure 7. MC-based comparison tree for E. coli strains with Y. pestis CO92 as outgroup.

TABLE 1. THE FREQUENCY COUNT OF FOUR REACTION TYPES FOR THREE E. COLI K-12 STRAINS
Columns: Reaction Types (EC: a.b), eco, ecj, ecd

Figure 8. Histogram of reaction type frequencies for three E. coli K-12 strains.

Table 1 and Figure 8 show that the reaction type (EC: a.b) frequencies of E. coli K-12 DH10B differ markedly from those of two other E. coli K-12 strains, MG1655 and W3110. It is clear that the 16S rRNA tree fails to capture the metabolic differences between these strains, while the MC-based tree does. Moreover, the MC-based method also captures the functional properties of different pathogenic types. For example, our method grouped together the uropathogenic E. coli (UPEC) strains, the most common cause of non-hospital-acquired urinary tract infections. Similarly, enterohemorrhagic E. coli (EHEC) strains, which are the primary cause of hemorrhagic colitis or bloody diarrhea, were clustered together. Some of this clustering of functional groups, although largely
absent in the 16S rRNA tree (Figure 6), suggests that there are differences in metabolic capabilities between the strains.

B. Comparison of Pseudomonas, Psychrobacter, and Acinetobacter, with Shewanella oneidensis as an outgroup

We next examined the phylogeny of Pseudomonas, Psychrobacter, and Acinetobacter, with Shewanella oneidensis taken as an outgroup. The main reason to analyze these species is their diversity, which is clearly visible in the 16S rRNA tree shown in Figure 9. Unlike the 16S rRNA tree for E. coli (Figure 6), this tree has long branch lengths. Pseudomonas species are ubiquitous in nature and include many pathogens that infect plants and humans. Because these bacteria do not need any organic growth factor, they can grow under many different conditions. Psychrobacter, on the other hand, are cold-adapted organisms, some of them from extremely low-temperature environments. Acinetobacter is an aquatic organism that thrives in hospital environments and in hospitalized patients; it is highly versatile and omnipresent in nature. The metabolic diversity of these organisms and their strains thus makes them a very interesting group to study.

TABLE 2. THE REACTION TYPE FREQUENCY COUNT FOR FOUR ACINETOBACTER AND ONE PSEUDOMONAS
Columns: Reaction Types (EC: a.b), aci, psa, acb, aby, abm

The 16S rRNA tree, shown in Figure 9, is divided into two main clusters, with Pseudomonas in one cluster. The other cluster is further divided into subclusters containing Psychrobacter and Acinetobacter. In contrast to the 16S rRNA tree, our MC-based tree, shown in Figure 10, takes account of the genomic versatility of these organisms and their strains. One of the major observations in the MC-based tree, compared with the 16S rRNA tree, is the clustering of Acinetobacter sp. ADP1 with other Pseudomonas and the grouping of Acinetobacter baumannii SDF and Acinetobacter baumannii AYE in a separate cluster.
We justified the clustering of Acinetobacter baumannii with Pseudomonas stutzeri, and of Acinetobacter baumannii SDF with Acinetobacter baumannii AYE, by analyzing the reaction type frequencies of these strains as shown in Table 3.

TABLE 3. THE REACTION TYPE FREQUENCY COUNT FOR FOUR PSEUDOMONAS STRAINS
Columns: Reaction Types (EC: a.b), ppu, pfl, pst, psb

We have also shown that Acinetobacter sp. ADP1, Acinetobacter baumannii SDF, and Acinetobacter baumannii AYE are clustered separately, away from the Pseudomonas. Another interesting observation is the clustering of Pseudomonas putida KT2440 with Pseudomonas fluorescens Pf-5 in the MC-based tree. Both of these organisms have agricultural applications as biocontrol agents, and many of their strains are able to suppress agricultural pathogens. The traditional 16S rRNA tree did not capture this functionality.

IV. CONCLUSION

In an attempt to classify organisms into an ordered scheme, and thereby to better understand biological processes and metabolic functional differences between organisms, we have developed a framework that analyzes organisms based not only on their phylogeny but also on the functional differences in their metabolism. We measured the distance between two species using the Pearson correlation coefficient, calculated from information about all enzyme-catalyzed metabolic reactions in the bacterial species selected from the KEGG database. This coefficient was used to create the distance matrix from which we built the enzyme-centric metabolic comparison trees. We showed that the 16S rRNA tree failed to capture some major metabolic differences between the organisms, while the MC-based method captured them.
Our simple and accurate approach was able to capture the functional properties of different groups, such as pathogens and non-pathogens, and to cluster them in ways not seen in the traditional 16S rRNA tree, suggesting that there are differences in metabolic capabilities between the organisms.

ACKNOWLEDGMENT

Research in EFB's laboratory is funded by UDRF and USDA NRI CSREES grants. The authors are grateful for the comments made by the anonymous reviewers, in particular for bringing to our attention a relevant paper by Lindroos and Andersson.

REFERENCES

[1] T. Dandekar, S. Schuster, B. Snel, M. Huynen, and P. Bork, Pathway alignment: application of the comparative analysis of glycolytic enzymes, Biochem. J., vol. 43, pp. ,
[2] J.J. Diaz-Mejia, E. Perez-Rueda, and L. Segovia, A network perspective on the evolution of metabolism by gene duplication, Genome Biology, 8:R26,
[3] W.M. Fitch, Construction of phylogenetic trees, Science, vol. 155, pp. ,
[4] M.A. Huynen and P. Bork, Measuring genome evolution, Proc. Natl Acad. Sci. USA, vol. 95, pp. ,
[5] M. Kanehisa, S. Goto, S. Kawashima, and A. Nakaya, The KEGG databases at GenomeNet, Nucleic Acids Res., vol. 30, pp. ,
[6] P.D. Karp, Pathway databases: a case study in computational symbolic theories, Science, vol. 293, pp. ,
[7] A.L. Lehninger, D.L. Nelson, and M.M. Cox, Principles of Biochemistry, 2nd edition, Worth Publishers, Inc.,
[8] L. Liao, S. Kim, and J-F. Tomb, Genome comparisons based on profiles of metabolic pathways, Proc. Sixth International Conference on Knowledge-Based Intelligent Information & Engineering Systems (KES 2002), pp. , September 2002, Crema, Italy.
[9] H. Lindroos and S.G.E. Andersson, Visualizing metabolic pathways: comparative genomics and expression analysis, Proceedings of the IEEE, vol. 90, pp. ,
[10]
[11] S. Schuster, D.A. Fell, and T. Dandekar, A general definition of metabolic pathways useful for systematic organization and analysis of complex metabolic networks, Nat. Biotechnol., vol. 18, pp. , 2000. doi: /
[12] Studier and Keppler, A note on the neighbor-joining algorithm of Saitou and Nei, Molecular Biology and Evolution, vol. 5, pp. ,
[13] S. Zhang, L. Liao, J-F. Tomb, and J.T.L. Wang, Clustering and classifying enzymes in metabolic pathways: some preliminary results, Proc. ACM SIGKDD Workshop on Data Mining in Bioinformatics, pp. , Edmonton, Canada,
[14] W.C. Hwang, W.H. Lin, A.J. Davis, F. Jordan, H.T. Yang, and M.J. Hwang, A network perspective on the topological importance of enzymes and their phylogenetic conservation, BMC Bioinformatics, 200, vol. 8, p. 212. doi: /

Figure 9. 16S rRNA tree for Pseudomonas, Psychrobacter, and Acinetobacter, with Shewanella oneidensis as outgroup.

Figure 10. The MC-based comparison tree for Pseudomonas, Psychrobacter, and Acinetobacter, with Shewanella oneidensis as an outgroup.

Supplementary Information

Supplementary Information Supplementary Information For the article"comparable system-level organization of Archaea and ukaryotes" by J. Podani, Z. N. Oltvai, H. Jeong, B. Tombor, A.-L. Barabási, and. Szathmáry (reference numbers

More information

METABOLIC PATHWAY PREDICTION/ALIGNMENT

METABOLIC PATHWAY PREDICTION/ALIGNMENT COMPUTATIONAL SYSTEMIC BIOLOGY METABOLIC PATHWAY PREDICTION/ALIGNMENT Hofestaedt R*, Chen M Bioinformatics / Medical Informatics, Technische Fakultaet, Universitaet Bielefeld Postfach 10 01 31, D-33501

More information

Phylogenetic analyses. Kirsi Kostamo

Phylogenetic analyses. Kirsi Kostamo Phylogenetic analyses Kirsi Kostamo The aim: To construct a visual representation (a tree) to describe the assumed evolution occurring between and among different groups (individuals, populations, species,

More information

Comparative Analysis of Nitrogen Assimilation Pathways in Pseudomonas using Hypergraphs

Comparative Analysis of Nitrogen Assimilation Pathways in Pseudomonas using Hypergraphs Comparative Analysis of Nitrogen Assimilation Pathways in Pseudomonas using Hypergraphs Aziz Mithani, Arantza Rico, Rachel Jones, Gail Preston and Jotun Hein mithani@stats.ox.ac.uk Department of Statistics

More information

Introduction to Bioinformatics Integrated Science, 11/9/05

Introduction to Bioinformatics Integrated Science, 11/9/05 1 Introduction to Bioinformatics Integrated Science, 11/9/05 Morris Levy Biological Sciences Research: Evolutionary Ecology, Plant- Fungal Pathogen Interactions Coordinator: BIOL 495S/CS490B/STAT490B Introduction

More information

Computational approaches for functional genomics

Computational approaches for functional genomics Computational approaches for functional genomics Kalin Vetsigian October 31, 2001 The rapidly increasing number of completely sequenced genomes have stimulated the development of new methods for finding

More information

Bioinformatics. Dept. of Computational Biology & Bioinformatics

Bioinformatics. Dept. of Computational Biology & Bioinformatics Bioinformatics Dept. of Computational Biology & Bioinformatics 3 Bioinformatics - play with sequences & structures Dept. of Computational Biology & Bioinformatics 4 ORGANIZATION OF LIFE ROLE OF BIOINFORMATICS

More information

Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes. - Supplementary Information -

Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes. - Supplementary Information - Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes - Supplementary Information - Martin Bartl a, Martin Kötzing a,b, Stefan Schuster c, Pu Li a, Christoph Kaleta b a

More information

Introduction to the SNP/ND concept - Phylogeny on WGS data

Introduction to the SNP/ND concept - Phylogeny on WGS data Introduction to the SNP/ND concept - Phylogeny on WGS data Johanne Ahrenfeldt PhD student Overview What is Phylogeny and what can it be used for Single Nucleotide Polymorphism (SNP) methods CSI Phylogeny

More information

Phylogenetics: Building Phylogenetic Trees

Phylogenetics: Building Phylogenetic Trees 1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

More information

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University Phylogenetics: Building Phylogenetic Trees COMP 571 - Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary

More information

Algorithms in Bioinformatics

Algorithms in Bioinformatics Algorithms in Bioinformatics Sami Khuri Department of Computer Science San José State University San José, California, USA khuri@cs.sjsu.edu www.cs.sjsu.edu/faculty/khuri Distance Methods Character Methods

More information

ATLAS of Biochemistry

ATLAS of Biochemistry ATLAS of Biochemistry USER GUIDE http://lcsb-databases.epfl.ch/atlas/ CONTENT 1 2 3 GET STARTED Create your user account NAVIGATE Curated KEGG reactions ATLAS reactions Pathways Maps USE IT! Fill a gap

More information

PGA: A Program for Genome Annotation by Comparative Analysis of. Maximum Likelihood Phylogenies of Genes and Species

PGA: A Program for Genome Annotation by Comparative Analysis of. Maximum Likelihood Phylogenies of Genes and Species PGA: A Program for Genome Annotation by Comparative Analysis of Maximum Likelihood Phylogenies of Genes and Species Paulo Bandiera-Paiva 1 and Marcelo R.S. Briones 2 1 Departmento de Informática em Saúde

More information

Boolean models of gene regulatory networks. Matthew Macauley Math 4500: Mathematical Modeling Clemson University Spring 2016

Boolean models of gene regulatory networks. Matthew Macauley Math 4500: Mathematical Modeling Clemson University Spring 2016 Boolean models of gene regulatory networks Matthew Macauley Math 4500: Mathematical Modeling Clemson University Spring 2016 Gene expression Gene expression is a process that takes gene info and creates

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S3 (box) Methods Methods Genome weighting The currently available collection of archaeal and bacterial genomes has a highly biased distribution of isolates across taxa. For example,

More information

Integration of functional genomics data

Integration of functional genomics data Integration of functional genomics data Laboratoire Bordelais de Recherche en Informatique (UMR) Centre de Bioinformatique de Bordeaux (Plateforme) Rennes Oct. 2006 1 Observations and motivations Genomics

More information

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016

Molecular phylogeny - Using molecular sequences to infer evolutionary relationships. Tore Samuelsson Feb 2016 Molecular phylogeny - Using molecular sequences to infer evolutionary relationships Tore Samuelsson Feb 2016 Molecular phylogeny is being used in the identification and characterization of new pathogens,

More information

Structure and Centrality of the Largest Fully Connected Cluster in Protein-Protein Interaction Networks

Structure and Centrality of the Largest Fully Connected Cluster in Protein-Protein Interaction Networks 22 International Conference on Environment Science and Engieering IPCEE vol.3 2(22) (22)ICSIT Press, Singapoore Structure and Centrality of the Largest Fully Connected Cluster in Protein-Protein Interaction

More information

Comparative genomics: Overview & Tools + MUMmer algorithm

Comparative genomics: Overview & Tools + MUMmer algorithm Comparative genomics: Overview & Tools + MUMmer algorithm Urmila Kulkarni-Kale Bioinformatics Centre University of Pune, Pune 411 007. urmila@bioinfo.ernet.in Genome sequence: Fact file 1995: The first

More information

GOSAP: Gene Ontology Based Semantic Alignment of Biological Pathways

GOSAP: Gene Ontology Based Semantic Alignment of Biological Pathways GOSAP: Gene Ontology Based Semantic Alignment of Biological Pathways Jonas Gamalielsson and Björn Olsson Systems Biology Group, Skövde University, Box 407, Skövde, 54128, Sweden, [jonas.gamalielsson][bjorn.olsson]@his.se,

More information

2 Genome evolution: gene fusion versus gene fission

2 Genome evolution: gene fusion versus gene fission 2 Genome evolution: gene fusion versus gene fission Berend Snel, Peer Bork and Martijn A. Huynen Trends in Genetics 16 (2000) 9-11 13 Chapter 2 Introduction With the advent of complete genome sequencing,

More information

Integration of Omics Data to Investigate Common Intervals

Integration of Omics Data to Investigate Common Intervals 2011 International Conference on Bioscience, Biochemistry and Bioinformatics IPCBEE vol.5 (2011) (2011) IACSIT Press, Singapore Integration of Omics Data to Investigate Common Intervals Sébastien Angibaud,

More information

Effects of Gap Open and Gap Extension Penalties

Effects of Gap Open and Gap Extension Penalties Brigham Young University BYU ScholarsArchive All Faculty Publications 200-10-01 Effects of Gap Open and Gap Extension Penalties Hyrum Carroll hyrumcarroll@gmail.com Mark J. Clement clement@cs.byu.edu See

More information

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels

More information

86 Part 4 SUMMARY INTRODUCTION

86 Part 4 SUMMARY INTRODUCTION 86 Part 4 Chapter # AN INTEGRATION OF THE DESCRIPTIONS OF GENE NETWORKS AND THEIR MODELS PRESENTED IN SIGMOID (CELLERATOR) AND GENENET Podkolodny N.L. *1, 2, Podkolodnaya N.N. 1, Miginsky D.S. 1, Poplavsky

More information

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor Biological Networks:,, and via Relative Description Length By: Tamir Tuller & Benny Chor Presented by: Noga Grebla Content of the presentation Presenting the goals of the research Reviewing basic terms

More information

Microbial Diversity and Assessment (II) Spring, 2007 Guangyi Wang, Ph.D. POST103B
