Bioinformatics: Investigating Molecular/Biochemical Evidence for Evolution


Background

How does an evolutionary biologist decide how closely related two different species are? The simplest way is to compare physical features of the species (as we did with the wild and domestic canines). We generally expect that brothers and sisters will look more similar to each other than two cousins might. If you make a family tree, you find that brothers and sisters share a common parent, but you must look harder at the tree to find which ancestor the two cousins share. Cousins do not share the same parents; rather, they share some of the same grandparents. In other words, the common ancestor of two brothers is more recent (their parents) than the common ancestor of two cousins (their grandparents), and in an evolutionary sense, this is why we say that two brothers are more closely related than two cousins.

Similarly, evolutionary biologists might compare salamanders and frogs, and salamanders and fish. More physical features are shared between frogs and salamanders than between frogs and fish, and an evolutionary biologist might use this information to infer that frogs and salamanders had a more recent common ancestor than did frogs and fish.

But no process is without problems. Two very similar looking people are not necessarily related, and two species that have similar features also may not be closely related. Comparing morphology can also be difficult if it is hard to find sufficient morphological characteristics to compare. Remember the problems with the canids! Imagine that you were responsible for determining which two of three salamander species were most closely related. What physical features would you compare? When you ran out of physical features, is there anything else you could compare? Many biologists turn next to comparing genes and proteins.

Genes and proteins are not necessarily better than morphological features, except in the sense that differences in morphology can be a result of environmental conditions rather than genetics, whereas differences in genes are definitely genetic. Also, there are sometimes more molecules to compare than physical features.

There are three exercises that follow in which you will use protein databases that are in the public domain. You will be able to investigate gene products (which are proteins) and evaluate evolutionary relationships. The protein you will work with is the hemoglobin beta chain. You will obtain your data from a public online database that contains the amino-acid sequences of proteins coded for by many different organisms.

Hemoglobin, the molecule that carries oxygen in our bloodstream, is composed of four subunits. In adult hemoglobin, two of these subunits are identical and coded for by the alpha-hemoglobin gene located on chromosome 16. The other two are identical and coded for by the beta-hemoglobin gene found on chromosome 11. We will use the protein sequences of beta hemoglobin as a set of traits to compare among species.

Part I Are Bats Birds?

In Parts I and II, there are no hypotheses, so you are not collecting data to determine the validity of a hypothesis. You are learning a new skill: the mining of molecular information available to all on the internet. In Part III, you will use your skills to test a hypothesis. Because Part III does have a hypothesis, you will collect data to determine its validity, and you will need to work through the Experimental Design Questions. We'll discuss these in class.

Procedure Part I

To be completed as a team. Answer questions in the spaces provided on the Data Tables.

Table 1: Morphological comparison of birds, bats, and other mammals

Feature                        Birds    Bats    Other mammals
-------------------------------------------------------------
Presence of hair
Presence of feathers
Presence of mammary glands
Presence of wings
Homeothermy
Four-chambered hearts

1a. What morphological features do bats share with mammals?
1b. Based on morphology, are bats more similar to birds or mammals?

Use your internet access to complete Part I.

2. Generate a distance matrix for the beta-hemoglobin chain for two bird species, two bat species, and two non-bat mammal species, and record it in a word processing document. Follow the steps below to do this.

Step 1. Begin by going to www.uniprot.org
Step 2. In the Search In dialogue box, use the down arrow to select Protein Knowledgebase (UniProt). In the Query box, type hemoglobin beta. Click Search.
Step 3. This will take you to the beginning of the database: http://www.uniprot.org/uniprot/?query=hemoglobin&sort=score
Step 4. Use the right-hand scroll bar to scroll through the names of the many entries. Find one for either a bird, a bat, or some other mammal. When you find one, check to make sure that it is the hemoglobin beta chain (preferably one without a number after it) and not the alpha, gamma, or other hemoglobin subunit. If the sequence is for the beta chain and it is for an appropriate species, click on it and the computer will retrieve the sequence.
Step 5. Once you have selected your organisms, hit Retrieve at the bottom of the page.
Step 6. Once you are at the next page, you will see the UniProt Identifiers.
Step 7. Hit Align at the bottom of the page. Scroll down to see what was found in the UniProt database.
a. You will see Entry results. This is a list of the organisms whose hemoglobin beta amino-acid sequences you wish to compare. The Accession number is the UniProt Identifier.
b. Second, scroll down to ClustalW results. This is an alignment of the hemoglobin beta proteins of the two organisms you wish to compare. The UniProt Identifiers are listed on the left, and the matched sequences are listed on the right. Each letter represents one amino acid. Please see Appendix A for the list. Now, go back up to Entry results.

c. Scrolling down a bit, you will see a box in which you may enter additional sequences. Please note that FASTA is the accepted format for this program. The program automatically places your requested sequences into FASTA format. There is a manual way to do this as well; I will show you how if you would like.
d. Scroll down to Amino acid properties. You can select one or multiple properties of the amino acids, and different colors will show up on the ClustalW results.
e. Scroll down to Sequence annotation (Features) and you will see other characteristics of the protein that you can compare (e.g. location of Active Sites, a Helix turn, a Metal binding site, etc.)
Step 8. Hit Start Jalview. Using the various menus at the top, you can see some similarities between the two different protein molecules.
Step 8. Hit UniProtKB (#) WHERE???
Step 9. Hit Align at the bottom of the page.

P02070  MLTAEEKAAVTAFWGKVKVDEVGGEALGRLLVVYPWTQRFFESFGDLSTADAVMNNPKVK  60  HBB_BOVIN
P02075  MLTAEEKAAVTGFWGKVKVDEVGAEALGRLLVVYPWTQRFFEHFGDLSNADAVMNNPKVK  60

10. Check out the ClustalW result and Start Jalview.
11. Blast Search: http://www.uniprot.org/blast/ Add ONLY the protein sequence to the box, for example:

MLTAEEKAAVTGFWGKVKVDEVGAEALGRLLVVYPWTQRFFEHFGDLSNADAVMNNPKVK
AHGKKVLDSFSNGMKHLDDLKGTFAQLSELHCDKLHVDPENFRLLGNVLVVVLARHHGNE
FTPVLQADFQKVVAGVANALAHKYH

MLTAEEKAAVTAFWGKVKVDEVGGEALGRLLVVYPWTQRFFESFGDLSTADAVMNNPKVK
AHGKKVLDSFSNGMKHLDDLKGTFAALSELHCDKLHVDPENFKLLGNVLVVVLARNFGKE
FTPVLQADFQKVVAGVANALAHRYH

Hit Retrieve = identifiers in box.
Hit Blast = empty box. Add only the protein sequence letters.
Hit Blast at right: http://www.uniprot.org/blast/uniprot/qbt2
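To see what the ClustalW alignment is telling you, it can help to count the differences yourself. The short Python sketch below compares the two 60-residue lines shown above, column by column. The function name `mismatches` is made up for this illustration, and the sketch assumes the two sequences are already aligned with no gaps, which is true for the excerpt above but not for sequences in general.

```python
# First 60 residues of the two aligned beta-hemoglobin sequences shown above.
SEQ_P02070 = "MLTAEEKAAVTAFWGKVKVDEVGGEALGRLLVVYPWTQRFFESFGDLSTADAVMNNPKVK"
SEQ_P02075 = "MLTAEEKAAVTGFWGKVKVDEVGAEALGRLLVVYPWTQRFFEHFGDLSNADAVMNNPKVK"


def mismatches(a, b):
    """Return (position, residue_a, residue_b) for each differing column.

    Positions are 1-based, as in the alignment display.
    Assumes the sequences are already aligned to the same length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must already be aligned to equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


if __name__ == "__main__":
    for pos, x, y in mismatches(SEQ_P02070, SEQ_P02075):
        print(f"position {pos}: {x} in P02070, {y} in P02075")
```

Each reported position is one column where the two proteins carry different amino acids; every other column is identical.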

Step 7. The next screen contains lots of information. The protein sequence is near the bottom of the information sheet in the Sequence information section (see Figure E for example). Using the right-hand scroll bar, find the amino-acid sequence. The amino acids are indicated with their single-letter symbols (see Figure F for their full names) and every 10th amino acid is marked with its position.
Step 8. Here is a sample of that species information for the beta hemoglobin sequence for goldfish.

Figure E.
P02140-1 [UniParc]. Last modified July 21, 1986. Version 1. Checksum: 32F6EA73A1D52497
Length 147, mass 16,210 (links above the sequence: FASTA, Blast)

        10         20         30         40         50         60
VEWTDAERSA IIGLWGKLNP DELGPQALAR CLIVYPWTQR YFATFGNLSS PAAIMGNPKV
        70         80         90        100        110        120
AAHGRTVMGG LERAIKNMDN IKATYAPLSV MHSEKLHVDP DNFRLLADCI TVCAAMKFGP
       130        140
SGFNADVQEA WQKFLSVVVS ALCRQYH

Step 9. Above the sequence, click on the FASTA format. This will simply provide you with the condensed sequence for that species, along with the species identification.

>sp|P02140|HBB_CARAU Hemoglobin subunit beta OS=Carassius auratus GN=hbb PE=1 SV=1
VEWTDAERSAIIGLWGKLNPDELGPQALARCLIVYPWTQRYFATFGNLSSPAAIMGNPKV
AAHGRTVMGGLERAIKNMDNIKATYAPLSVMHSEKLHVDPDNFRLLADCITVCAAMKFGP
SGFNADVQEAWQKFLSVVVSALCRQYH

Step 10. Use your mouse to select and copy the information. Start a new Word document, and paste the information into that Word document. Name the document and SAVE!
Step 11. Repeat the above steps until your Word document contains the FASTA formatted sequences for six different species: two bird species, two bat species, and two non-bat mammals. Write the names of the species you have chosen into Table 2.
Step 12. Save your Word document but do not close it.
Step 13. To align the sequences and determine how similar they are, go to an internet alignment program, e.g. LALIGN at http://fasta.bioch.virginia.edu/fasta_www2/fasta_www.cgi?rm=lalign (Figure G). There are other alignment programs, e.g. http://www.ch.embnet.org/software/lalign_form.html (change the default to global). Go ahead and play with them.
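If you end up with several FASTA records in one file, a few lines of Python can split them back into identification lines and sequences. This is only a sketch: `parse_fasta` is a name made up here, not part of any tool used in this lab, and the `|` separators in the goldfish header are assumed from the usual UniProt FASTA layout.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples.

    A record starts at a line beginning with '>' and runs until the next
    such line. Sequence lines are joined into one string.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# The goldfish record from Step 9.
GOLDFISH = """>sp|P02140|HBB_CARAU Hemoglobin subunit beta OS=Carassius auratus GN=hbb PE=1 SV=1
VEWTDAERSAIIGLWGKLNPDELGPQALARCLIVYPWTQRYFATFGNLSSPAAIMGNPKV
AAHGRTVMGGLERAIKNMDNIKATYAPLSVMHSEKLHVDPDNFRLLADCITVCAAMKFGP
SGFNADVQEAWQKFLSVVVSALCRQYH
"""

if __name__ == "__main__":
    for header, sequence in parse_fasta(GOLDFISH):
        print(header)
        print(len(sequence), "amino acids")
```

Keeping the header separate from the sequence is useful later, because the alignment tool should receive only the sequence letters.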

Step 14. Select, copy, and paste one sequence from your Word document into the first sequence box and another sequence into the second sequence box, as shown in Figure G. It is best to copy only the sequence and not any of the identification information. Keep track of what you entered.

Figure G. LALIGN: Sequence Alignment Tool, with two sequences to compare

Step 15. Click on Align Sequence. Be patient.
Step 16. The computer will return a set of information including the percent identity in the 146 aa overlap (Figure H). Record that piece of information in the Table 3 grid. This value is the percent of aligned amino acids that are identical. If all the amino acids were the same, the percent would be 100%. Not only does LALIGN give you the percent identity, it also shows you the actual alignment of the two sequences. Identical amino acids are marked with two dots between them (:). If there is one dot, the change in amino acid is conservative (both amino acids have similar properties and charge), and if there are no dots, then the two amino acids have different biochemical properties.
Step 17. A distance matrix is a table that shows all the pairwise comparisons between species. Here the entries are percent identities, so a higher value means the two species are more similar. Continue to make all pairwise comparisons until Table 3 is filled. For each comparison, use the percent identity for the overlap of all the 146 amino acids.

Figure H. LALIGN: Sample Alignment Analysis Results
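The percent identity that LALIGN reports can be checked by hand on a short stretch of sequence. The Python sketch below marks identical columns with a colon, in the spirit of the LALIGN display, and computes the percent identity. It is a simplified illustration, not LALIGN's method: it assumes the two strings are already aligned to the same length, and it does not try to reproduce the single-dot marks for conservative changes, which depend on a scoring table. The names `identity_markers` and `percent_identity` are made up here. The example strings are the first 20 residues of the P02070 and P02075 sequences shown earlier.

```python
def identity_markers(a, b):
    """Return a string with ':' under identical columns and ' ' elsewhere."""
    if len(a) != len(b):
        raise ValueError("sequences must already be aligned to equal length")
    return "".join(":" if x == y and x != "-" else " " for x, y in zip(a, b))


def percent_identity(a, b):
    """Percent of aligned columns that hold the same amino acid."""
    if not a:
        raise ValueError("sequences must not be empty")
    return 100.0 * identity_markers(a, b).count(":") / len(a)


if __name__ == "__main__":
    first = "MLTAEEKAAVTAFWGKVKVD"   # first 20 residues of P02070
    second = "MLTAEEKAAVTGFWGKVKVD"  # first 20 residues of P02075
    print(first)
    print(identity_markers(first, second))
    print(second)
    print(f"{percent_identity(first, second):.1f}% identity")
```

In this short stretch, 19 of the 20 columns match, so the identity is 95%.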

Step 18. Use Table 3 to answer the six questions which follow.

Table 3: The distance matrix for Part I

            Bat 1   Bat 2   Bird 1  Bird 2  Mammal 1  Mammal 2
Bat 1       100%
Bat 2               100%
Bird 1                      100%
Bird 2                              100%
Mammal 1                                    100%
Mammal 2                                              100%

Shaded boxes are simply repeat data.
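The reason some boxes in Table 3 are repeat data is that the comparison of species X with species Y gives the same number as Y with X, so only half of the grid needs to be filled in. The Python sketch below fills such a grid automatically. The three short sequences are invented placeholders for illustration only, not real hemoglobin data, and the function names are made up here. The sketch also assumes sequences of equal length with no gaps, which a real alignment program does not require.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of positions holding the same amino acid (equal-length input)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / len(a)


def identity_matrix(seqs):
    """Build a symmetric table of pairwise percent identities."""
    names = list(seqs)
    matrix = {name: {name: 100.0} for name in names}  # diagonal is 100%
    for x, y in combinations(names, 2):  # each pair is computed only once
        value = percent_identity(seqs[x], seqs[y])
        matrix[x][y] = value
        matrix[y][x] = value  # the mirror entry is the "repeat data"
    return names, matrix


# Invented placeholder sequences, NOT real data.
TOY = {
    "Bat 1": "MVHLTAEEKA",
    "Bat 2": "MVHLTGEEKA",
    "Bird 1": "MVHWTAEEKQ",
}

if __name__ == "__main__":
    names, matrix = identity_matrix(TOY)
    print(" " * 8 + "".join(f"{n:>8}" for n in names))
    for row in names:
        cells = "".join(f"{matrix[row][col]:>7.0f}%" for col in names)
        print(f"{row:<8}{cells}")
```

With six species there are 15 distinct pairs, so you will run 15 alignments to complete Table 3.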

Figure A. Amino Acid Symbols
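For reference while reading sequences, the 20 standard amino acids and their one-letter symbols can be kept in a small lookup table. The Python sketch below uses the standard one-letter and three-letter codes; it is offered as a convenience and is not copied from the handout's figure. The helper name `spell_out` is made up here.

```python
# Standard one-letter codes mapped to (three-letter code, full name).
AMINO_ACIDS = {
    "A": ("Ala", "Alanine"),
    "R": ("Arg", "Arginine"),
    "N": ("Asn", "Asparagine"),
    "D": ("Asp", "Aspartic acid"),
    "C": ("Cys", "Cysteine"),
    "Q": ("Gln", "Glutamine"),
    "E": ("Glu", "Glutamic acid"),
    "G": ("Gly", "Glycine"),
    "H": ("His", "Histidine"),
    "I": ("Ile", "Isoleucine"),
    "L": ("Leu", "Leucine"),
    "K": ("Lys", "Lysine"),
    "M": ("Met", "Methionine"),
    "F": ("Phe", "Phenylalanine"),
    "P": ("Pro", "Proline"),
    "S": ("Ser", "Serine"),
    "T": ("Thr", "Threonine"),
    "W": ("Trp", "Tryptophan"),
    "Y": ("Tyr", "Tyrosine"),
    "V": ("Val", "Valine"),
}


def spell_out(sequence):
    """Translate a string of one-letter symbols into full amino-acid names."""
    return [AMINO_ACIDS[symbol][1] for symbol in sequence]


if __name__ == "__main__":
    # The first three residues of the goldfish sequence.
    print(spell_out("VEW"))
```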

Procedure Part II Reptiles with Feathers?

You may complete as a team.

Some phylogenetic systematists (scientists who work to make the classification of organisms match their evolutionary history) complain that the vertebrate class Reptilia is improper because it should include birds. In technical terms, the vertebrate class Reptilia is paraphyletic because it contains some but not all of the species that arose from the most recent common ancestor of this group. Just how similar are reptiles and birds in terms of the beta-hemoglobin chain? Should birds be considered a type of reptile? Evaluate this question using a BLAST (Basic Local Alignment Search Tool) search. A BLAST search takes a particular sequence and then locates the most similar sequences in the entire database. It returns a list of sequences, with the first sequence being the most similar to the one entered and the last sequence being the least similar.

Step 1. Repeat the steps in Procedure Part I.
Step 2. Hit BLAST on the upper menu.
Step 3. Add the FASTA formatted amino-acid sequence of a specific organism to the response box.
Step 4. Click BLAST on the right side of the box. If you scroll down you will see information about the % identity of the particular protein in other organisms.

Table 6: Results of a BLAST search on the crocodile beta-hemoglobin sequence

Similarity                                   Species name & name of protein
---------------------------------------------------------------------------
First most similar (do not use crocodile)
Second most similar
Third most similar
Fourth most similar
Fifth most similar
Sixth most similar
Seventh most similar
Eighth most similar
Ninth most similar
Tenth most similar

1. Were any of those species birds?
2. One unusual reptile is the tuatara, whose scientific name is Sphenodon punctatus. How similar is the tuatara to the crocodile?
3. Does the tuatara appear in your list of ten? If not, how far down on the BLAST search list does it occur (fifteenth, twentieth)?
4. Most importantly, which species are more similar to the crocodile: birds, or other reptiles?
5. Do the molecular data suggest that Reptilia is paraphyletic, or monophyletic? Explain.
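The "most similar first" ordering of a BLAST result list can be imitated on a very small scale. The Python sketch below ranks a handful of candidate sequences by percent identity to a query. It is a loose illustration only: real BLAST uses local alignment and statistical scoring rather than a simple position-by-position count, and the sequences and labels below are invented placeholders, not real database entries. The function names are made up here.

```python
def percent_identity(a, b):
    """Percent of positions holding the same amino acid (equal-length input)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * same / len(a)


def rank_hits(query, database):
    """Return (name, percent identity) pairs, most similar first."""
    scored = [(name, percent_identity(query, seq)) for name, seq in database.items()]
    return sorted(scored, key=lambda hit: hit[1], reverse=True)


# Invented placeholder data, NOT real sequences.
QUERY = "MVHLTAEEKA"
DATABASE = {
    "candidate A": "MVHLTGEEKA",
    "candidate B": "MVHWTAEEKQ",
    "candidate C": "MVHLTAEEKA",
}

if __name__ == "__main__":
    for rank, (name, identity) in enumerate(rank_hits(QUERY, DATABASE), start=1):
        print(f"{rank}. {name}: {identity:.0f}% identity")
```

When you fill in Table 6, you are reading off the top of exactly this kind of ranked list, with the crocodile sequence as the query.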

Part III

To be completed individually.

Purpose: To determine the relative phylogenetic proximity of four canids: grey wolf, domestic dog, red fox, and jackal.

Hypothesis: The beta hemoglobin protein sequences of the four canid species suggest a phylogenetic relationship among them. Using the tools from Parts I and II, suggest what you think are the evolutionary relationships among the four canids. Below is a suggestion as to how you can develop a hypothesis.

[Diagram with four labels: A, B, C, D]

Materials:

Procedure: See Parts I and II above.

Experimental Design Questions
1. Control/
2. DV/IV?
3. Extraneous Factors?
4. Repeat Data?
5. What will be measured?

Data: Develop data charts.

Analysis: What do your data show about the relationships among the four animals in question? Explain.

Conclusion: Do your data support or not support your hypothesis? Cite specific reference to data.

Error Analysis:
