Master Biomedizin ) UCSC & UniProt 2) Homology 3) MSA 4) Phylogeny. Pablo Mier

Similar documents
Cubic Spline Interpolation Reveals Different Evolutionary Trends of Various Species

Comparing Genomes! Homologies and Families! Sequence Alignments!

An Evolutionary Trend Discovery Algorithm Based on Cubic Spline Interpolation

RELATIONSHIPS BETWEEN GENES/PROTEINS HOMOLOGUES

Supplemental Figure 1.

Purpose. Process. Students will locate the species listed below on the infographic and write down the domain to which they belong:

Supporting Online Material for

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26

Multiple Sequence Alignments

GATA family of transcription factors of vertebrates: phylogenetics and chromosomal synteny

Molecular Coevolution of the Vertebrate Cytochrome c 1 and Rieske Iron Sulfur Protein in the Cytochrome bc 1 Complex

Reassessing Domain Architecture Evolution of Metazoan Proteins: Major Impact of Gene Prediction Errors

1/27/2010. Systematics and Phylogenetics of the. An Introduction. Taxonomy and Systematics

Hands-On Nine The PAX6 Gene and Protein

Application of new distance matrix to phylogenetic tree construction

Bioinformatics Report Branchiostoma lanceolatum dopamine D 1 / receptor protein phylogenetic analysis. Alanna Lewis

Gene Families part 2. Review: Gene Families /727 Lecture 8. Protein family. (Multi)gene family

Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science

Biased amino acid composition in warm-blooded animals

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis

Introduction to protein alignments

Ensembl focuses on metazoan (animal) genomes. The genomes currently available at the Ensembl site are:

Introduction to Bioinformatics

Concept Modern Taxonomy reflects evolutionary history.

CSE 427 Comp Bio. Sequence Alignment

BIOINFORMATICS LAB AP BIOLOGY

Camello, a novel family of Histone Acetyltransferases that acetylate histone H4 and is essential for zebrafish development

CONSTRUCTION OF PHYLOGENETIC TREE FROM MULTIPLE GENE TREES USING PRINCIPAL COMPONENT ANALYSIS

Multiple Sequence Alignment. Sequences

Chapter 26: Phylogeny and the Tree of Life

To determine the thresholds for the number of hits required to assign a protein to a


Quantitative and qualitative analyses. of in-paralogs

The MANTiS Manual. Contents. MANTiS Version 1.1

Phylogeny and the Tree of Life

CSE P 590 A Spring : MLE, EM

CSE 427 Computational Biology Winter Sequence Alignment. Sequence Alignment. Sequence Similarity: What

CRISPRseek Workshop Design of target-specific guide RNAs in CRISPR-Cas9 genome-editing systems

Why do we classify things? Supermarket aisles Libraries Classes Teams/sports Members of a family Roads Cities Money

Evolution and Taxonomy Laboratory

Gene mention normalization in full texts using GNAT and LINNAEUS

Exploring evolution of brain genes involved in microcephaly through phylogeny and synteny analysis

Characteristics of Life

A Browser for Pig Genome Data

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

Phylogeny and the Tree of Life

Vertebrate genome sequencing: building a backbone for comparative genomics

Rapid birth-and-death evolution of the xenobiotic metabolizing NAT gene family in vertebrates with evidence of adaptive selection

Bioinformatics: Investigating Molecular/Biochemical Evidence for Evolution

19/10/2015. Response to environmental manipulation. Its not as simple as it seems!

7 th Grade SCIENCE FINAL REVIEW Ecology, Evolution, Classification

Example of Function Prediction

1. INTRODUCTION TO BIOLOGY AND BIOINFORMATICS BIOINFORMATICS COURSE MTAT

Emily Blanton Phylogeny Lab Report May 2009

CSEP 527 Computational Biology Spring Lecture 2 Sequence Alignment

Biology Classification Unit 11. CLASSIFICATION: process of dividing organisms into groups with similar characteristics

Lesson Plan Taxonomy and Classiicaion

Chapter 26 Phylogeny and the Tree of Life

ENCODE DCC Antibody Validation Document

A bioinformatics approach to the structural and functional analysis of the glycogen phosphorylase protein family

mosaic: Supplementary material

SOL REVIEW cell structure, classification, genetic identity, and place in a food web.

The Life System and Environmental & Evolutionary Biology II Lecture 3 - Biodiversity

Computational Structural Bioinformatics

Status of Living and Extinct Taxa. Mammals Amphibians Birds. Nearly 1/3 (31%) is globally threatened or extinct. DRAFT

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Bioinformatics Exercises

Genome Sequencing & DNA Sequence Analysis

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Cladistics and Bioinformatics Questions 2013

11/24/13. Science, then, and now. Computational Structural Bioinformatics. Learning curve. ECS129 Instructor: Patrice Koehl

Phylogeny: building the tree of life

Combination of X-ray crystallography, SAXS and DEER to obtain the structure of the FnIII-3,4 domains of integrin α6β4

Evidence for a common evolutionary rate in metazoan transcriptional networks

In silico approach to evaluate evolutionary relatedness of significant Lung cancer proteins

Phylogeny and Systematics

Evolution and Diversification of Life

Heuristic Methods. Heuristic methods for alignment Sequence databases Multiple alignment Gene and protein prediction

HUMAN EVOLUTION. Where did we come from?

Inparanoid: a comprehensive database of eukaryotic orthologs

thebiotutor.com AS Biology Unit 2 Classification, Adaptation & Biodiversity

Announcements: 1. Labs meet this week 2. Lab manuals have been ordered 3. Some slides from each lecture will be on the web 4. Study questions will be

Many of the slides that I ll use have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!

Phylogeny and the Tree of Life

1bcc. Lichtarge lab Evolutionary trace report by report maker May 29, 2010 CONTENTS

Taxonomy. Taxonomy is the science of classifying organisms. It has two main purposes: to identify organisms to represent relationships among organisms

Orthology Part I: concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona

Synonymous Codon Substitution Matrices

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly

1 ATGGGTCTC 2 ATGAGTCTC

Evolution and Diversification of Life. Origin and Evolution of. Life. OCN 201 Science of the Sea Biology Lecture 2. The Handfish -BBC Blue Planet

Homolog. Orthologue. Comparative Genomics. Paralog. What is Comparative Genomics. What is Comparative Genomics

CHAPTER 26 PHYLOGENY AND THE TREE OF LIFE Connecting Classification to Phylogeny

Unit 5: Cell Division and Development Guided Reading Questions (45 pts total)

Possible identification of CENP-C in fish and the presence of the CENP-C motif in M18BP1 of vertebrates. [version 2; referees: 3 approved]

Phylogenetic tree construction based on amino acid composition and nucleotide content of complete vertebrate mitochondrial genomes

Ch. 9 Multiple Sequence Alignment (MSA)

BIOLOGY. Phylogeny and the Tree of Life CAMPBELL. Reece Urry Cain Wasserman Minorsky Jackson

Improving Hox Protein Classification across the Major Model Organisms

Quantitative Measurement of Genome-wide Protein Domain Co-occurrence of Transcription Factors

Transcription:

Master Biomedizin 2018 1) UCSC & UniProt 2) Homology 3) MSA 4) 1

12 a. All of the sequences in file1.fasta (https://cbdm.uni-mainz.de/mb18/) are homologs. How many groups of orthologs would you say there are in this file? Use Trex (http://www.trex.uqam.ca/). Two groups of orthologs: Protein A & protein B. b. What could you say about the history of this protein family? E.coli has only one protein, and then it duplicated to form A and B. It is possible that X.laevis_B duplicated later to form B and C. c. Would you say there is any wrongly annotated sequence? X.tropicalis_B is wrongly annotated. It should be X.tropicalis_A, because they are in the same branch. The actual X.tropicalis_B either is not in the dataset or was lost during evolution. 2

*Images from: Trex 12 a. Two groups of orthologs: Proteins A Proteins B b. E.coli has only one protein, and then it duplicated to form A and B. It is possible that X.laevis_B duplicated later to form B and C. c. X.tropicalis_B is wrongly annotated. It should be X.tropicalis_A, because they are in the same branch. The actual X.tropicalis_B either is not in the dataset or was lost during evolution. 3

13 a. Using file2.fasta (https://cbdm.uni-mainz.de/mb18/) and Trex (http://www.trex.uqam.ca/), can you approximate to which taxonomic division belongs proteinx? Primates. b. From which organism could it be? After guessing, check it. Homo sapiens (human) or Pan troglodytes (chimpanzee); they are 100% identical. 4

*Images from: Trex, UniProt 13 Drosophila melanogaster (Fruit fly) Xenopus laevis (frog) Rattus norvegicus (rat) Mus musculus (mouse) Homo sapiens (human) a. Primates Pongo abelii (orangutan) Sus scrofa (pig) Bos taurus (cattle) b. Homo sapiens (human) or Pan troglodytes (chimpanzee); they are 100% identical. 5

14 Human hemoglobin consists of four protein subunits: two from the alpha globin gene cluster (located on chromosome 16) and two more from the beta globin gene cluster (located on chromosome 11). But there are at least nine different globin genes in these clusters, which are: zeta, mu, alpha, theta1, epsilon, gamma1, gamma2, delta and beta. Use the proteins in file3.fasta (https://cbdm.uni-mainz.de/mb18/). a. Sort them either in cluster alpha or cluster beta. Alpha: zeta, mu, alpha, theta1. Beta: epsilon, gamma1, gamma2, delta and beta. b. Why do you think they are clustered in either cluster alpha or cluster beta? Paralogous expansion from one ancestral alpha and one ancestral beta. 6

*Images from: UniProt, Trex, UCSC 14 a. Using Trex: Cluster alpha Cluster beta b. Paralogous expansion from one ancestral alpha and one ancestral beta. 7

*Images from: Wikipedia 15 Consider the following sequences: Q9T4B2 (CYB_DELDE, dolphin), P00156 (CYB_HUMAN, human), P00157 (CYB_BOVIN, cattle), Q9MIX8 (CYB_DANRE, zebra fish), P18946 (CYB_CHICK, chicken), Q36461 (CYB_ORNAN, platypus) and P00160 (CYB_XENLA, frog). Do you think that a phylogenetic tree built using these sequences would be coherent with evolution? Check with the taxonomy provided by UniProt (http://www.uniprot.org/taxonomy/). Human (Homo sapiens) Dolphin (Delphinus delphis) Cattle (Bos taurus) Chicken (rooster) (Gallus gallus) 8 Zebra fish (Danio rerio) Platypus (Ornithorhynchus anatinus) Frog (Xenopus laevis)

*Images from: UniProt, Trex 15 Dolphin Eukaryota Metazoa Chordata Craniata Vertebrata Euteleostomi Mammalia Eutheria Laurasiatheria Cetartiodactyla Cetacea Odontoceti Delphinidae Delphinus Cattle Eukaryota Metazoa Chordata Craniata Vertebrata Euteleostomi Mammalia Eutheria Laurasiatheria Cetartiodactyla Ruminantia Pecora Bovidae Bovinae Bos Human Eukaryota Metazoa Chordata Craniata Vertebrata Euteleostomi Mammalia Eutheria Euarchontoglires Primates Haplorrhini Catarrhini Hominidae Homo Platypus Eukaryota Metazoa Chordata Craniata Vertebrata Euteleostomi Mammalia Monotremata Ornithorhynchidae Ornithorhynchus Frog Eukaryota Metazoa Chordata Craniata Vertebrata Euteleostomi Amphibia Batrachia Anura Pipoidea Pipidae Xenopodinae Xenopus Xenopus Chicken Eukaryota Metazoa Chordata Craniata Vertebrata Euteleostomi Archelosauria Archosauria Dinosauria Saurischia Theropoda Coelurosauria Aves > Gallus Zebrafish Eukaryota Metazoa Chordata Craniata Vertebrata Euteleostomi Actinopterygii Neopterygii Teleostei Ostariophysi Cypriniformes Cyprinidae Danio 9

MSA + + Homology 16 Consider the following sequences: P50225, P50226, P0DMM9, P0DMN0, O43704, O00338, Q6IMI6, O75897. All of them are part of a human paralog family (sulfotransferases). a. Look for the coordinates of the substrate binding region in the UniProt entry P50225. Do you expect all the paralogs to share this substrate binding region? Why? Check it. Yes, because they share the motif K[ST]H. b. Using the previous sequences, could you determine which human paralog is the ortholog of the sequence in file4.fasta (https://cbdm.uni-mainz.de/mb18/)? Do not use BLAST. The ortholog of protein_of_interest is ST1B1_HUMAN. c. Which protein is the one found in file4.fasta? ST1B1_CANLF, from Canis familiaris (dog). 10

MSA + + Homology 16 *Images from: UniProt, Trex (P50225) a. Yes, because they share the motif K[ST]H. b. The ortholog of protein_of_interest is ST1B1_HUMAN. c. ST1B1_CANLF, from Canis familiaris (dog). 11