Comparative Genomics Background & Strategy. Faction 2


1 Comparative Genomics Background & Strategy Faction 2

2 Overview
- Introduction to comparative genomics
- Salmonella enterica subsp. enterica serovar Heidelberg
- Comparative Genomics Faction 2 Objectives
- Genomic Elements
- Phylogenetic Analysis
- Whole Genome Visualizers
- Comparative Genomics
- Functional Implications
- Proposed Workflow

3 Introduction to comparative genomics
Comparison of multiple genomes to discover similarities and differences:
- Gains, losses, and rearrangements of sequences
- Syntenic loci across organisms, relatedness to different species
- Factors affecting virulence, survivability, antimicrobial resistance, etc.

4 Goals
- Develop a typing method utilizing data from WGS capable of distinguishing isolates from the same and different outbreaks
- Find elements responsible for differences between the sporadic and outbreak groups

5 Salmonella enterica subsp. enterica serovar Heidelberg
- Salmonella enterica is a gram-negative, rod-shaped bacterium that can infect many different hosts
- Facultative intracellular pathogen
- Divided into 6 subspecies: enterica (I), salamae (II), arizonae (IIIa), diarizonae (IIIb), houtenae (IV), and indica (VI)
- Salmonella enterica subsp. enterica has 2610 different serovars
- Serovars are characterized by two surface antigens: the flagellar H antigen and the oligosaccharide O antigen
- Antigenic formula for serovar Heidelberg: 1,4,[5],12:r:1,2
- 7th most common serovar isolated from humans in the U.S.
- This serovar has caused multiple outbreaks in the U.S. linked to poultry

6 Genomic Elements
- Genome size ranging from 4.73 to 4.98 Mb
- GC content of 52.1%
- Salmonella Pathogenicity Islands:
  - Segments of DNA carrying genes that encode virulence factors required for infection of the host
  - Display higher AT content than the rest of the genome
  - Can play a role in virulence, host specificity, and bacterial invasiveness and evasion of the host immune system
- Antimicrobial resistance genes
  - Possible elements for HGT
- Mobilome: mobile genetic elements are important for bacterial adaptation and evolution, allowing the transfer of genes between different species in the same environment (e.g., the human gut)
  - Plasmids
  - Phages
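The point that pathogenicity islands show higher AT content than the rest of the genome can be illustrated with a sliding-window GC scan. This is only a minimal sketch: the function names, the window and step sizes, and the 0.05 drop threshold are illustrative assumptions, not values from the slides or from any tool named in this deck.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def low_gc_windows(genome: str, window: int = 1000, step: int = 500,
                   drop: float = 0.05):
    """Return (start, end, gc) for windows whose GC fraction is at least
    `drop` below the genome-wide GC fraction (hypothetical threshold)."""
    genome_gc = gc_fraction(genome)
    hits = []
    last_start = max(len(genome) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = genome[start:start + window]
        gc = gc_fraction(chunk)
        if genome_gc - gc >= drop:
            hits.append((start, start + len(chunk), gc))
    return hits


# Toy sequence: a GC-rich backbone with an AT-rich insert in the middle.
toy = "GC" * 50 + "AT" * 20 + "GC" * 50
print(low_gc_windows(toy, window=40, step=40))
```

On a real assembly, windows flagged this way would only be candidates; GC deviation alone does not establish that a region is a pathogenicity island.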

7 Faction 2 Objectives:

8 Phylogenetic Analysis
- Analysis to determine the evolutionary history of individuals or groups of organisms
- Used to construct phylogenetic trees, which infer the evolutionary ancestry of a set of individuals or of genes
- Phylogenetic trees reveal the level of similarity between sequences in an alignment
- Global based:
  - Whole-genome alignment
- Marker based:
  - MLST / wgMLST
  - SNPs
  - Pan/core genes

9 Phylogenetic Analysis
- Whole genome alignment: RAxML, MASH
- MLST: classic MLST + whole genome MLST (wgMLST)
- SNP analysis: CSI Phylogeny

10 RAxML
- Randomized Axelerated Maximum Likelihood
- Program for phylogenetic analyses of large datasets using maximum likelihood
- Maximum likelihood in terms of phylogenetic trees:
  - Examines the space of all possible trees given a dataset
  - Assigns each tree a likelihood based on how likely the tree is to reproduce the observed data
- Bootstrapping: tests the sampling error of the tree by running pseudoreplications of the original data
  - Probability-like score
- Can handle whole-genome alignment on large datasets to generate whole-genome maximum likelihood phylogenetic trees
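The bootstrapping idea on this slide (pseudoreplicates of the original data, yielding a probability-like score) can be sketched as follows. This is a generic illustration of nonparametric bootstrapping over alignment columns, not RAxML's own implementation; `bootstrap_alignment` and `clade_support` are hypothetical helper names, and the tree-building step between them is left out.

```python
import random


def bootstrap_alignment(alignment: dict, rng: random.Random) -> dict:
    """Build one pseudoreplicate by sampling alignment columns with
    replacement. `alignment` maps taxon name -> aligned sequence."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    cols = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][c] for c in cols) for n in names}


def clade_support(replicate_clades: list, clade: frozenset) -> float:
    """Fraction of replicate trees that contain `clade`.
    Each element of `replicate_clades` is the set of clades (frozensets
    of taxon names) found in the tree inferred from one pseudoreplicate."""
    if not replicate_clades:
        raise ValueError("need at least one bootstrap replicate")
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return hits / len(replicate_clades)


aln = {"iso1": "ACGTAC", "iso2": "ACGAAC", "iso3": "TCGAAC"}
print(bootstrap_alignment(aln, random.Random(0)))
```

A tree would be inferred from each pseudoreplicate, and the support for a branch is then the share of replicate trees that recover it.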


22 snptree Leekitcharoenphon, Pimlapas et al (2012)

23 snptree.out of order

24 CSI Phylogeny (Call SNPs & Infer Phylogeny)
- Finds SNPs in the same manner as snptree
- Strict sorting of SNPs
- Requires all SNPs to be significant: Z-score > 1.96
- $Z = (X - Y) / \sqrt{X + Y}$
  - X is the number of reads with the most common nucleotide at that position
  - Y is the number of reads with any other nucleotide
- Kaas RS, Leekitcharoenphon P, Aarestrup FM, Lund O (2014)
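A small worked example of the Z-score filter may help. It assumes the square-root form of the statistic, Z = (X - Y) / sqrt(X + Y), as described by Kaas et al. (2014); the function names and the read counts are made up for illustration.

```python
import math


def snp_z_score(x: int, y: int) -> float:
    """Z = (X - Y) / sqrt(X + Y), where X is the number of reads with the
    most common nucleotide at a position and Y the number of reads with
    any other nucleotide (assumed form of the statistic)."""
    if x + y == 0:
        raise ValueError("no reads cover this position")
    return (x - y) / math.sqrt(x + y)


def is_significant(x: int, y: int, threshold: float = 1.96) -> bool:
    """Keep a SNP call only if its Z-score exceeds the threshold."""
    return snp_z_score(x, y) > threshold


# Hypothetical read counts at two positions.
for x, y in [(18, 2), (6, 4)]:
    print(x, y, round(snp_z_score(x, y), 3), is_significant(x, y))
```

With 18 of 20 reads agreeing, the position passes the 1.96 cut-off; with 6 of 10 it does not, so that call would be discarded.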

25 Kaas RS, Leekitcharoenphon P, Aarestrup FM, Lund O (2014)

26 Whole Genome Visualization Mauve BRIG

27 BRIG: BLAST Ring Image Generator
- Creates circular images of genomes
- Allows comparison of many genomes at once
- Utilizes BLAST+ for genome alignment
- Allows addition of custom annotations
- With a user-defined set of genes as input, BRIG can display:
  - gene presence
  - gene absence
  - gene truncation
  - sequence variation
- Works with a set of complete genomes, draft genomes, or raw, unassembled sequence data

28 Mauve
- Multiple alignment and visualization tool for large structural differences
- Original Mauve algorithm:
  - Seeds -> Anchors -> LCBs (locally collinear blocks) -> Gapped global alignment for each LCB
  - Many problems (e.g., parameters and thresholds)
- Progressive Mauve algorithm:
  - Pairwise genome distance -> Form guide tree and decide thresholds -> Alignment -> Refinement
  - Consumes more time and memory (too slow for over genomes)

29 Faction 2 Objectives:

30 Comparative Genomics
- Pangenomics: Roary, BPGA, PGAP
- Genomic elements:
  - Islands (pathogenicity islands and resistance islands) with GIPSy
  - Prophage: PHAST
- BRIG

31 Pan-Genome Analysis
- The pan-genome represents the entire gene set of all strains of a species.

32 Pan-Genome Analysis Tools
- > 30 tools available
- Tools differ in:
  - Program algorithm: different approaches affect the sensitivity, accuracy, and speed of the tool
  - Output features

33 Basic Algorithm
- Protein sequence alignment -> Clustering of orthologous genes -> Presence/absence of genes in individual strains
- PanOCT and PGAP: require an all-against-all comparison using BLAST
- Roary, BPGA: pre-cluster sequences from closely related prokaryotic isolates before BLAST
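The last step of the basic algorithm, turning ortholog clusters into a presence/absence table, can be sketched as below. The cluster names and strain labels are invented toy data, and the input format (cluster name mapped to the set of strains carrying a member gene) is an assumption for this example, not the output format of Roary, BPGA, PGAP, or PanOCT.

```python
def presence_absence(clusters: dict, strains: list) -> dict:
    """Map each ortholog cluster to a 0/1 row, one column per strain.
    `clusters` maps cluster name -> set of strains carrying a member gene."""
    return {name: [int(s in members) for s in strains]
            for name, members in clusters.items()}


def split_pan_genome(clusters: dict, strains: list):
    """Split the pan-genome into core clusters (present in every strain)
    and accessory clusters (missing from at least one strain)."""
    all_strains = set(strains)
    core = sorted(n for n, m in clusters.items() if all_strains <= set(m))
    accessory = sorted(n for n in clusters if n not in core)
    return core, accessory


# Hypothetical clusters for three isolates.
strains = ["outbreak_1", "outbreak_2", "sporadic_1"]
clusters = {
    "cluster_A": {"outbreak_1", "outbreak_2", "sporadic_1"},
    "cluster_B": {"outbreak_1", "outbreak_2"},
    "cluster_C": {"sporadic_1"},
}
print(presence_absence(clusters, strains))
print(split_pan_genome(clusters, strains))
```

Rows such as `cluster_B`, present only in the outbreak isolates, are the kind of pattern the stated goal of finding elements that differ between sporadic and outbreak groups would look for.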

34 (Page et al. 2015)

35 (Page et al. 2015)

36 Genomic Islands
- Salmonella Pathogenicity Islands:
  - Regions of the genome containing genes that encode proteins required for virulence
  - Located both on the chromosome and on plasmids
  - Allow for intracellular colonization and pathogenicity
  - Can be acquired through HGT (cross species)
- GIPSy: Genomic Island Prediction Software, Version 1.1.2

37 GIPSy
- Software for the prediction of genomic islands (GEIs) in the following classes: pathogenicity islands, resistance islands, metabolic islands, and symbiotic islands
- Predicts GEIs based on their commonly shared features: genomic signature deviation (GC content and codon usage), presence of transposase genes, virulence and antibiotic resistance genes, flanking tRNA genes, and cross-species genes
- This is done in an 8-step process
- To be run on our sporadic isolates to analyze presence/absence of specific SPIs and compare with known outbreak isolates and known metadata

38 GIPSy
- Step 1: Load query genomes
- Step 2: GC content analysis
- Step 3: Codon usage analysis
- Step 4: Search for transposase genes using HMMer
- Step 5: Search for specific factors (virulence, resistance) using blastp
- Step 6: Visualize amino acid content
- Step 7: tRNA prediction using HMMer results
- Step 8: Visualization of pathogenicity islands

39 Plasmid Analysis with Plasmid Profiler
- A plasmid comparative analysis pipeline
- Inputs: WGS short reads in FASTQ format, a plasmid sequence database, and a database of replicon sequences and genes of interest
- Steps:
  - Use KAT to remove unrepresented plasmid sequences, creating an individualized plasmid database per sample
  - Use SRST2 to identify putative plasmid hits from the individual databases
  - Use BLAST to identify the incompatibility groups and genes of interest in the sequences found in the previous step
  - Use its R package to visualize the results
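Step 3 (codon usage analysis) can be illustrated by comparing the codon frequencies of a candidate region with those of a reference set of genes. The slides do not say which statistic GIPSy uses, so the distance below (half the L1 distance between the two frequency profiles) is a stand-in chosen for simplicity, and both function names are hypothetical.

```python
from collections import Counter


def codon_frequencies(cds: str) -> dict:
    """Relative frequency of each codon in an in-frame coding sequence.
    Trailing bases that do not fill a codon, and codons containing
    ambiguous bases, are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if set(c) <= set("ACGT"))
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def codon_usage_distance(freq_a: dict, freq_b: dict) -> float:
    """Half the L1 distance between two codon frequency profiles:
    0.0 for identical usage, 1.0 for completely disjoint usage.
    (Stand-in measure, not GIPSy's documented statistic.)"""
    codons = set(freq_a) | set(freq_b)
    return 0.5 * sum(abs(freq_a.get(c, 0.0) - freq_b.get(c, 0.0))
                     for c in codons)


genome_like = codon_frequencies("ATGGCGGCGCTGGCG")   # toy reference genes
island_like = codon_frequencies("ATGAAAAAATTAAAA")   # toy candidate region
print(codon_usage_distance(genome_like, island_like))
```

A region whose profile sits far from the genome-wide profile would be one signal among the several features GIPSy combines, not evidence of an island on its own.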

40 Plasmid Profiler

41 Prophage identification with PHAST
- Mobilome: prophages, plasmids, and other mobile genetic elements
- Horizontal gene transfer of mobile elements can increase microbial resistance
- Purpose: use prophage identification to differentiate between line list metadata
- PHAST (PHAge Search Tool):
  - Accepts raw DNA sequence data or annotated GenBank-formatted data
  - Locates, annotates, and displays prophage sequences and features
  - ~3 minutes for a typical bacterial genome
- Zhou (2011)

42 Zhou (2011)

43 PHAST (2011) and PHASTER (2016)
- PHASTER, Arndt (2016):
  - 4.3x faster analysis than PHAST
  - Able to handle multiple queries
  - Improved genome visualization tools

44 Faction 2 Objectives:

45 Pathway Analysis
- In-depth and contextualized findings to help understand the mechanisms of the disease in question
- Identification of genes and proteins associated with the etiology of a specific disease
- Prediction of drug targets
- Understand how to intervene therapeutically in disease processes
- Data integration: integrate diverse biological information
- Functional discovery: assign function to genes
- Conduct targeted literature searches

46 Pathway Analysis Tools/Databases
- DAVID (david.abcc.ncifcrf.gov)
- KEGG
- Ingenuity Pathway Analysis
- GeneGo/MetaCore
- GenMAPP (www.genmapp.com)
- BioCyc
- Pubgene
- PANTHER (www.pantherdb.org)

48 Proposed Workflow

49 References
- Leekitcharoenphon, P., Kaas, R. S., Thomsen, M. C. F., Friis, C., Rasmussen, S., & Aarestrup, F. M. (2012). snptree - a web-server to identify and construct SNP trees from whole genome sequence data. BMC Genomics, 13(Suppl 7), S6.
- Kaas RS, Leekitcharoenphon P, Aarestrup FM, Lund O (2014) Solving the Problem of Comparing Whole Bacterial Genomes across Different Sequencing Platforms. PLoS ONE 9(8): e doi: /journal.pone
- Ondov, Brian D, et al. Mash: fast genome and metagenome distance estimation using MinHash. Genome Biology (2016): DOI: /s x
- Maiden, Martin CJ, et al. MLST revisited: the gene-by-gene approach to bacterial genomics. Nat Rev Microbiol (2013):
- Maiden, Martin CJ, et al. "Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms." Proceedings of the National Academy of Sciences 95.6 (1998):
- Gupta, Anuj, et al. stringmlst: a fast k-mer based tool for multilocus sequence typing. Bioinformatics (2017) 33 (1):
- Liu, Yi-Yen, et al. Construction of a Pan-Genome Allele Database of Salmonella enterica Serovar Enteritidis for Molecular Subtyping and Disease Cluster Identification. Front. Microbiol (2016), doi: /fmicb
- Zhou, You, Yongjie Liang, Karlene H. Lynch, Jonathan J. Dennis, and David S. Wishart. "PHAST: A Fast Phage Search Tool." Nucleic Acids Research. Oxford University Press, 14 June Web. 28 Mar
- Arndt, David, Jason R. Grant, Ana Marcu, Tanvir Sajed, Allison Pon, Yongjie Liang, and David S. Wishart. "PHASTER: A Better, Faster Version of the PHAST Phage Search Tool." Nucleic Acids Research 44.W1 (2016): n. pag. Web.
- Zhulin IB. Databases for Microbiologists. Margolin WW, ed. Journal of Bacteriology. 2015;197(15): doi: /jb
- Viswanathan GA, Seto J, Patil S, Nudelman G, Sealfon SC (2008) Getting Started in Biological Pathway Construction and Analysis. PLoS Comput Biol 4(2): e16. doi: /journal.pcbi
- A. Stamatakis: "RAxML Version 8: A tool for Phylogenetic Analysis and Post-Analysis of Large Phylogenies". Bioinformatics, 2014

50 Thank You

Introduction to the SNP/ND concept - Phylogeny on WGS data

Introduction to the SNP/ND concept - Phylogeny on WGS data Introduction to the SNP/ND concept - Phylogeny on WGS data Johanne Ahrenfeldt PhD student Overview What is Phylogeny and what can it be used for Single Nucleotide Polymorphism (SNP) methods CSI Phylogeny

More information

Outline. I. Methods. II. Preliminary Results. A. Phylogeny Methods B. Whole Genome Methods C. Horizontal Gene Transfer

Outline. I. Methods. II. Preliminary Results. A. Phylogeny Methods B. Whole Genome Methods C. Horizontal Gene Transfer Comparative Genomics Preliminary Results April 4, 2016 Juan Castro, Aroon Chande, Cheng Chen, Evan Clayton, Hector Espitia, Alli Gombolay, Walker Gussler, Ken Lee, Tyrone Lee, Hari Prasanna, Carlos Ruiz,

More information

Comparative genomics: Overview & Tools + MUMmer algorithm

Comparative genomics: Overview & Tools + MUMmer algorithm Comparative genomics: Overview & Tools + MUMmer algorithm Urmila Kulkarni-Kale Bioinformatics Centre University of Pune, Pune 411 007. urmila@bioinfo.ernet.in Genome sequence: Fact file 1995: The first

More information

RGP finder: prediction of Genomic Islands

RGP finder: prediction of Genomic Islands Training courses on MicroScope platform RGP finder: prediction of Genomic Islands Dynamics of bacterial genomes Gene gain Horizontal gene transfer Gene loss Deletion of one or several genes Duplication

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S1 (box). Supplementary Methods description. Prokaryotic Genome Database Archaeal and bacterial genome sequences were downloaded from the NCBI FTP site (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/)

More information

Genômica comparativa. João Carlos Setubal IQ-USP outubro /5/2012 J. C. Setubal

Genômica comparativa. João Carlos Setubal IQ-USP outubro /5/2012 J. C. Setubal Genômica comparativa João Carlos Setubal IQ-USP outubro 2012 11/5/2012 J. C. Setubal 1 Comparative genomics There are currently (out/2012) 2,230 completed sequenced microbial genomes publicly available

More information

Alternative tools for phylogeny. Identification of unique core sequences

Alternative tools for phylogeny. Identification of unique core sequences Alternative tools for phylogeny Identification of unique core sequences Workshop on Whole Genome Sequencing and Analysis, 19-21 Mar. 2018 Learning objective: After this lecture you should be able to account

More information

Comparative Genomics Background and Strategies. Nitya Sharma, Emily Rogers, Kanika Arora, Zhiming Zhao, Yun Gyeong Lee

Comparative Genomics Background and Strategies. Nitya Sharma, Emily Rogers, Kanika Arora, Zhiming Zhao, Yun Gyeong Lee Comparative Genomics Background and Strategies Nitya Sharma, Emily Rogers, Kanika Arora, Zhiming Zhao, Yun Gyeong Lee Introduction Why comparative genomes? h"p://www.ensembl.org/info/about/species.html

More information

Bioinformatics. Dept. of Computational Biology & Bioinformatics

Bioinformatics. Dept. of Computational Biology & Bioinformatics Bioinformatics Dept. of Computational Biology & Bioinformatics 3 Bioinformatics - play with sequences & structures Dept. of Computational Biology & Bioinformatics 4 ORGANIZATION OF LIFE ROLE OF BIOINFORMATICS

More information

MiGA: The Microbial Genome Atlas

MiGA: The Microbial Genome Atlas December 12 th 2017 MiGA: The Microbial Genome Atlas Jim Cole Center for Microbial Ecology Dept. of Plant, Soil & Microbial Sciences Michigan State University East Lansing, Michigan U.S.A. Where I m From

More information

Horizontal transfer and pathogenicity

Horizontal transfer and pathogenicity Horizontal transfer and pathogenicity Victoria Moiseeva Genomics, Master on Advanced Genetics UAB, Barcelona, 2014 INDEX Horizontal Transfer Horizontal gene transfer mechanisms Detection methods of HGT

More information

FUNCTION ANNOTATION PRELIMINARY RESULTS

FUNCTION ANNOTATION PRELIMINARY RESULTS FUNCTION ANNOTATION PRELIMINARY RESULTS FACTION I KAI YUAN KALYANI PATANKAR KIERA BERGER CAMILA MEDRANO HUBERT PAN JUNKE WANG YANXI CHEN AJAY RAMAKRISHNAN MRUNAL DEHANKAR OVERVIEW Introduction Previous

More information

The minimal prokaryotic genome. The minimal prokaryotic genome. The minimal prokaryotic genome. The minimal prokaryotic genome

The minimal prokaryotic genome. The minimal prokaryotic genome. The minimal prokaryotic genome. The minimal prokaryotic genome Dr. Dirk Gevers 1,2 1 Laboratorium voor Microbiologie 2 Bioinformatics & Evolutionary Genomics The bacterial species in the genomic era CTACCATGAAAGACTTGTGAATCCAGGAAGAGAGACTGACTGGGCAACATGTTATTCAG GTACAAAAAGATTTGGACTGTAACTTAAAAATGATCAAATTATGTTTCCCATGCATCAGG

More information

Other tools for typing and phylogeny

Other tools for typing and phylogeny Other tools for typing and phylogeny Workshop on Whole Genome Sequencing and Analysis, 27-29 Mar. 2017 Learning objective: After this lecture you should be able to account for tools for typing Salmonella

More information

Genetic Variation: The genetic substrate for natural selection. Horizontal Gene Transfer. General Principles 10/2/17.

Genetic Variation: The genetic substrate for natural selection. Horizontal Gene Transfer. General Principles 10/2/17. Genetic Variation: The genetic substrate for natural selection What about organisms that do not have sexual reproduction? Horizontal Gene Transfer Dr. Carol E. Lee, University of Wisconsin In prokaryotes:

More information

Whole Genome Alignment. Adam Phillippy University of Maryland, Fall 2012

Whole Genome Alignment. Adam Phillippy University of Maryland, Fall 2012 Whole Genome Alignment Adam Phillippy University of Maryland, Fall 2012 Motivation cancergenome.nih.gov Breast cancer karyotypes www.path.cam.ac.uk Goal of whole-genome alignment } For two genomes, A and

More information

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB I519 Introduction to Bioinformatics, 2015 Genome Comparison Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Whole genome comparison/alignment Build better phylogenies Identify polymorphism

More information

CRISPR-SeroSeq: A Developing Technique for Salmonella Subtyping

CRISPR-SeroSeq: A Developing Technique for Salmonella Subtyping Department of Biological Sciences Seminar Blog Seminar Date: 3/23/18 Speaker: Dr. Nikki Shariat, Gettysburg College Title: Probing Salmonella population diversity using CRISPRs CRISPR-SeroSeq: A Developing

More information

SPECIES OF ARCHAEA ARE MORE CLOSELY RELATED TO EUKARYOTES THAN ARE SPECIES OF PROKARYOTES.

SPECIES OF ARCHAEA ARE MORE CLOSELY RELATED TO EUKARYOTES THAN ARE SPECIES OF PROKARYOTES. THE TERMS RUN AND TUMBLE ARE GENERALLY ASSOCIATED WITH A) cell wall fluidity. B) cell membrane structures. C) taxic movements of the cell. D) clustering properties of certain rod-shaped bacteria. A MAJOR

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S3 (box) Methods Methods Genome weighting The currently available collection of archaeal and bacterial genomes has a highly biased distribution of isolates across taxa. For example,

More information

Whole genome sequencing (WGS) - there s a new tool in town. Henrik Hasman DTU - Food

Whole genome sequencing (WGS) - there s a new tool in town. Henrik Hasman DTU - Food Whole genome sequencing (WGS) - there s a new tool in town Henrik Hasman DTU - Food Welcome to the NGS world TODAY Welcome Introduction to Next Generation Sequencing DNA purification (Hands-on) Lunch (Sandwishes

More information

Whole genome sequencing (WGS) as a tool for monitoring purposes. Henrik Hasman DTU - Food

Whole genome sequencing (WGS) as a tool for monitoring purposes. Henrik Hasman DTU - Food Whole genome sequencing (WGS) as a tool for monitoring purposes Henrik Hasman DTU - Food The Challenge Is to: Continue to increase the power of surveillance and diagnostic using molecular tools Develop

More information

7 Multiple Genome Alignment

7 Multiple Genome Alignment 94 Bioinformatics I, WS /3, D. Huson, December 3, 0 7 Multiple Genome Alignment Assume we have a set of genomes G,..., G t that we want to align with each other. If they are short and very closely related,

More information

BLAST. Varieties of BLAST

BLAST. Varieties of BLAST BLAST Basic Local Alignment Search Tool (1990) Altschul, Gish, Miller, Myers, & Lipman Uses short-cuts or heuristics to improve search speed Like speed-reading, does not examine every nucleotide of database

More information

Whole Genome based Phylogeny

Whole Genome based Phylogeny Whole Genome based Phylogeny Johanne Ahrenfeldt PhD student DTU Bioinformatics Short about me Johanne Ahrenfeldt johah@dtu.dk PhD student at DTU Bioinformatics Whole Genome based Phylogeny Graduate Engineer

More information

Chapter 19. Microbial Taxonomy

Chapter 19. Microbial Taxonomy Chapter 19 Microbial Taxonomy 12-17-2008 Taxonomy science of biological classification consists of three separate but interrelated parts classification arrangement of organisms into groups (taxa; s.,taxon)

More information

Microbial Taxonomy and the Evolution of Diversity

Microbial Taxonomy and the Evolution of Diversity 19 Microbial Taxonomy and the Evolution of Diversity Copyright McGraw-Hill Global Education Holdings, LLC. Permission required for reproduction or display. 1 Taxonomy Introduction to Microbial Taxonomy

More information

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

More information

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution

Taxonomy. Content. How to determine & classify a species. Phylogeny and evolution Taxonomy Content Why Taxonomy? How to determine & classify a species Domains versus Kingdoms Phylogeny and evolution Why Taxonomy? Classification Arrangement in groups or taxa (taxon = group) Nomenclature

More information

Genetic Basis of Variation in Bacteria

Genetic Basis of Variation in Bacteria Mechanisms of Infectious Disease Fall 2009 Genetics I Jonathan Dworkin, PhD Department of Microbiology jonathan.dworkin@columbia.edu Genetic Basis of Variation in Bacteria I. Organization of genetic material

More information

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot

More information

BMI/CS 776 Lecture #20 Alignment of whole genomes. Colin Dewey (with slides adapted from those by Mark Craven)

BMI/CS 776 Lecture #20 Alignment of whole genomes. Colin Dewey (with slides adapted from those by Mark Craven) BMI/CS 776 Lecture #20 Alignment of whole genomes Colin Dewey (with slides adapted from those by Mark Craven) 2007.03.29 1 Multiple whole genome alignment Input set of whole genome sequences genomes diverged

More information

UNCORRECTED PROOF. AUTHOR'S PROOF Microb Ecol DOI /s

UNCORRECTED PROOF. AUTHOR'S PROOF Microb Ecol DOI /s Microb Ecol DOI 10.1007/s00248-011-9880-1 1 3 MINIREVIEWS 2 4 The Salmonella enterica Pan-genome 5 Annika Jacobsen & Rene S. Hendriksen & 6 Frank M. Aaresturp & David W. Ussery & Carsten Friis Q2 Q3 7

More information

By Eliza Bielak Bacterial Genomics and Epidemiology, DTU-Food Supervised by Henrik Hasman, PhD

By Eliza Bielak Bacterial Genomics and Epidemiology, DTU-Food Supervised by Henrik Hasman, PhD By Eliza Bielak Bacterial Genomics and Epidemiology, DTU-Food elibi@food.dtu.dk Supervised by Henrik Hasman, PhD 1. Introduction to plasmid biology 2. Plasmid encoded resistance to β- lactams (basic theories)

More information

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between

More information

a-dB. Code assigned:

a-dB. Code assigned: This form should be used for all taxonomic proposals. Please complete all those modules that are applicable (and then delete the unwanted sections). For guidance, see the notes written in blue and the

More information

Introduction to Bioinformatics. Shifra Ben-Dor Irit Orr

Introduction to Bioinformatics. Shifra Ben-Dor Irit Orr Introduction to Bioinformatics Shifra Ben-Dor Irit Orr Lecture Outline: Technical Course Items Introduction to Bioinformatics Introduction to Databases This week and next week What is bioinformatics? A

More information

The Evolution of Infectious Disease

The Evolution of Infectious Disease The Evolution of Infectious Disease Why are some bacteria pathogenic to humans while other (closely-related) bacteria are not? This question can be approached from two directions: 1.From the point of view

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Icm/Dot secretion system region I in 41 Legionella species.

Nature Genetics: doi: /ng Supplementary Figure 1. Icm/Dot secretion system region I in 41 Legionella species. Supplementary Figure 1 Icm/Dot secretion system region I in 41 Legionella species. Homologs of the effector-coding gene lega15 (orange) were found within Icm/Dot region I in 13 Legionella species. In four

More information

Introduction to Evolutionary Concepts

Introduction to Evolutionary Concepts Introduction to Evolutionary Concepts and VMD/MultiSeq - Part I Zaida (Zan) Luthey-Schulten Dept. Chemistry, Beckman Institute, Biophysics, Institute of Genomics Biology, & Physics NIH Workshop 2009 VMD/MultiSeq

More information

In order to compare the proteins of the phylogenomic matrix, we needed a similarity

In order to compare the proteins of the phylogenomic matrix, we needed a similarity Similarity Matrix Generation In order to compare the proteins of the phylogenomic matrix, we needed a similarity measure. Hamming distances between phylogenetic profiles require the use of thresholds for

More information

Bioinformatics Exercises

Bioinformatics Exercises Bioinformatics Exercises AP Biology Teachers Workshop Susan Cates, Ph.D. Evolution of Species Phylogenetic Trees show the relatedness of organisms Common Ancestor (Root of the tree) 1 Rooted vs. Unrooted

More information

CONCEPT OF SEQUENCE COMPARISON. Natapol Pornputtapong 18 January 2018

CONCEPT OF SEQUENCE COMPARISON. Natapol Pornputtapong 18 January 2018 CONCEPT OF SEQUENCE COMPARISON Natapol Pornputtapong 18 January 2018 SEQUENCE ANALYSIS - A ROSETTA STONE OF LIFE Sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of

More information

no.1 Raya Ayman Anas Abu-Humaidan

no.1 Raya Ayman Anas Abu-Humaidan no.1 Raya Ayman Anas Abu-Humaidan Introduction to microbiology Let's start! As you might have concluded, microbiology is the study of all organisms that are too small to be seen with the naked eye, Ex:

More information

Sequence Alignment Techniques and Their Uses

Sequence Alignment Techniques and Their Uses Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this

More information

Title ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses

Title ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 Title ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses

More information

Bio 119 Bacterial Genomics 6/26/10

Bio 119 Bacterial Genomics 6/26/10 BACTERIAL GENOMICS Reading in BOM-12: Sec. 11.1 Genetic Map of the E. coli Chromosome p. 279 Sec. 13.2 Prokaryotic Genomes: Sizes and ORF Contents p. 344 Sec. 13.3 Prokaryotic Genomes: Bioinformatic Analysis

More information

BLAST Database Searching. BME 110: CompBio Tools Todd Lowe April 8, 2010

BLAST Database Searching. BME 110: CompBio Tools Todd Lowe April 8, 2010 BLAST Database Searching BME 110: CompBio Tools Todd Lowe April 8, 2010 Admin Reading: Read chapter 7, and the NCBI Blast Guide and tutorial http://www.ncbi.nlm.nih.gov/blast/why.shtml Read Chapter 8 for

More information

Salmonella Serotyping

Salmonella Serotyping Salmonella Serotyping Patricia Fields National Salmonella Reference Lab CDC 10 th Annual PulseNet Update Meeting April 5, 2006 What is Salmonella serotyping? The first-generation subtyping method Established

More information

Basic Local Alignment Search Tool

Basic Local Alignment Search Tool Basic Local Alignment Search Tool Alignments used to uncover homologies between sequences combined with phylogenetic studies o can determine orthologous and paralogous relationships Local Alignment uses

More information

HORIZONTAL TRANSFER IN EUKARYOTES KIMBERLEY MC GRAIL FERNÁNDEZ GENOMICS

HORIZONTAL TRANSFER IN EUKARYOTES KIMBERLEY MC GRAIL FERNÁNDEZ GENOMICS HORIZONTAL TRANSFER IN EUKARYOTES KIMBERLEY MC GRAIL FERNÁNDEZ GENOMICS OVERVIEW INTRODUCTION MECHANISMS OF HGT IDENTIFICATION TECHNIQUES EXAMPLES - Wolbachia pipientis - Fungus - Plants - Drosophila ananassae

More information

Microbiome: 16S rrna Sequencing 3/30/2018

Microbiome: 16S rrna Sequencing 3/30/2018 Microbiome: 16S rrna Sequencing 3/30/2018 Skills from Previous Lectures Central Dogma of Biology Lecture 3: Genetics and Genomics Lecture 4: Microarrays Lecture 12: ChIP-Seq Phylogenetics Lecture 13: Phylogenetics

More information

Stepping stones towards a new electronic prokaryotic taxonomy. The ultimate goal in taxonomy. Pragmatic towards diagnostics

Stepping stones towards a new electronic prokaryotic taxonomy. The ultimate goal in taxonomy. Pragmatic towards diagnostics Stepping stones towards a new electronic prokaryotic taxonomy - MLSA - Dirk Gevers Different needs for taxonomy Describe bio-diversity Understand evolution of life Epidemiology Diagnostics Biosafety...

More information

CSCE555 Bioinformatics. Protein Function Annotation

CSCE555 Bioinformatics. Protein Function Annotation CSCE555 Bioinformatics Protein Function Annotation Why we need to do function annotation? Fig from: Network-based prediction of protein function. Molecular Systems Biology 3:88. 2007 What s function? The

More information

doi: / _25

doi: / _25 Boc, A., P. Legendre and V. Makarenkov. 2013. An efficient algorithm for the detection and classification of horizontal gene transfer events and identification of mosaic genes. Pp. 253-260 in: B. Lausen,

More information


