Laboratory 11 Phylogenetics
Biology 171L SP18
Lab 11: Phylogenetics

Student Learning Outcomes
1. Discover Darwin's contribution to biology.
2. Understand the importance of evolution in the study of biology.
3. Use phylogenetic analysis to reconstruct evolutionary relationships.

Relevant Reading
Campbell Biology, Chapter 26, especially pp. 549-559, and Chapter 24
A Short Guide to Writing about Biology, Chapters 5 and 6

Homework Synopsis (see the Lab 11 Homework section at the end of this lab for the full description)
Part I: Mastering Biology
Part II & III: Science Communication/Data Analysis (Mastering Biology HOMEWORK)

INTRODUCTION

Darwin and Evolution. Evolution is a central concept in biology. Evolution refers to the changes in the history of life occurring over the geological time scale and to the mechanism that created the diversity of life on Earth. Evidence for evolution was documented in the 1859 publication by Charles Darwin, On the Origin of Species by Means of Natural Selection. Although Darwin was not the first to propose that life evolved as the Earth evolved, he is credited with the related concept of natural selection. He proposed that evolution occurred by "descent with modification" (Darwin's term). Natural selection explains the present-day diversity of life. Organisms that are better adapted to their environment tend to have greater reproductive success, which favors the survival of the species. Over vast spans of time, organisms accumulate enough variations in response to environmental pressures that they evolve into diverse organisms, distinct from their ancestors.

Figure 1. Darwin's ideas about classifying organisms, as recorded in one of his many notebooks.

In the Darwinian view, there is unity in life: all present-day organisms evolved from a common ancestor that lived in the past. For this reason, the history of life can be modeled by a tree. The branches on the tree represent divergences that explain the great diversity of life, but all the
branches arose from a common source (i.e., the common ancestor to all living organisms), as represented by the trunk of the tree. In fact, Darwin's early ideas about classifying organisms are drawn like a tree with many branches separating divergent organisms (Fig. 1).

Phylogenetic Trees. A genealogical family tree is used to represent an ancestral lineage, and marks the progression of time (ancestors lived in the past, descendants live in the future). Just as a family tree represents familial relationships (such as between a great-great-grandmother and each of her descendants), a phylogeny (or phylogenetic tree, or cladogram) represents evolutionary relationships (such as between dogs and wolves) over the history of the Earth. Phylogenies are drawn like family trees.

Unfortunately, organisms do not arise in nature along with the evolutionary equivalent of a birth certificate documenting their evolutionary history. For this reason, evolutionary biologists must reconstruct evolutionary relationships by unraveling the historical clues that organisms carry. Biologists do this by studying historical evidence, such as biogeography (the geographical distribution of species) or the fossil record. They also make anatomical and embryological comparisons and, more recently, use genetic markers (for instance, to uncover changes to the hereditary material, or DNA). Evidence gathered from these different sources allows scientists to propose relationships between different organisms.

Phylogenies are visual representations of the evolution of related species. For example, the genealogies of all present-day organisms can be illustrated in a phylogenetic tree (Fig. 2). All of life on Earth falls into one of the three domains of life: Bacteria, Archaea, and Eukarya. Humans and wolves are grouped into the branch labeled "animals" in the domain Eukarya. Escherichia coli is classified in the phylum Proteobacteria in the domain Bacteria. This tree provides a working hypothesis about the relationships between the organisms listed. It has been supported by scientific evidence, but it can be remodeled as new evidence emerges.

Figure 2. Phylogenetic tree of the three domains of life.

Building phylogenies. To build a phylogenetic tree such as that shown in Figure 2, biologists must employ careful observation. Biologists collect data about the characters of the organisms that they would like to place into a phylogeny. Characters are traits. They can be based on morphology (physical characteristics), genetics, or behavior. Notice that the selected characters
are based on measurable similarities and differences, rather than subjective traits. The purpose is to uncover clues that allow organisms to be grouped into unique branches, called clades (from the Greek klados, "branch"), which become less inclusive over time. Clades are organized in a dichotomous tree called a cladogram.

In order to group organisms into clades, biologists look for shared derived characters, also referred to as synapomorphies. These are characters that arose in the lineage leading up to a clade (i.e., "derived") and are held in common by two or more distinct lineages within it (i.e., "shared"). Groups defined by shared derived characters are monophyletic, since they include a common ancestor and all of its descendants (Fig. 3). In contrast, paraphyletic groupings include a common ancestor but exclude some of its descendants (Fig. 3).

Figure 3. Different groupings of clades within phylogenetic trees.

Phylogenetic analysis is a method of classification focused on creating monophyletic groupings as an approximation of phylogeny. This analysis begins with a type of data matrix called a character table. The character table is employed to organize the relationships between organisms based on the character traits of interest. For example, traits may be morphological (as shown in Fig. 4) or genetic. The y-axis specifies the characters that may or may not be present in the organisms (taxa) listed on the x-axis.

Figure 4. Shared derived characters are plotted in (a) a character table to allow for the construction of (b) a cladogram (a type of phylogenetic tree).

In this table, 0 denotes absence of a specified character,
and 1 indicates the presence of a specific character. This information can be visualized by constructing a cladogram, a bifurcating diagram illustrating historical relationships between organisms (Fig. 4b).

Understanding phylogenies. Phylogenies are read like a family tree. The arrow in Figure 5 points to the root, or trunk, of the tree. The root represents an ancestral lineage, the ancestor common to all ten species shown. As you read from the root, through the branches, and out to the tips of the branches (from left to right in the figure), you are visualizing evolution as it progressed from the past to the present. The evolution of Species 7, for instance, has been highlighted in red.

Each branch point, indicated by a filled circle, represents a speciation event. A speciation event can give rise to two or more daughter lineages. It is during this event that a common ancestor gives rise to its descendants.

Since phylogenies record shared ancestry between independent lineages, each lineage has an evolutionary history that is partly unique to it and partly shared. Cousins, for example, share DNA inherited from one set of grandparents but (assuming a non-incestuous family history) not from the other set. Similarly, Species 7 has its own unique history that classifies it as Species 7. It also has a shared history with Species 8. Species 7 has its own unique ancestor, as does Species 8, but a common ancestor of Species 7 and 8 gave rise to these two species. Likewise, even further back in time, a common ancestor gave rise to Species 7, 8, 9, and 10.

Phylogenies are useful in identifying clades. A clade is a grouping of a common ancestor and all descendants of that ancestor. Each branch of a phylogeny represents a clade. Therefore, the following are clades: (a) Species 7, (b) Species 7 and 8, (c) Species 9 and 10, and (d) Species 7, 8, 9, and 10.

Figure 5. A hypothetical phylogenetic tree with 10 lineages. The evolutionary history of Species 7 is highlighted in red.
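The idea that every branch of a tree is a clade can be checked mechanically. The short Python sketch below is only an illustration, not part of the lab procedure. It assumes that the four species named above are arranged as ((Species 7, Species 8), (Species 9, Species 10)), which is the arrangement implied by the clades listed in the text; Figure 5 itself is not reproduced here. The function names are made up for this example.

```python
# A tree is written as nested tuples; a string is a tip (a present-day species).
# Assumed topology, inferred from the clades listed in the text:
# ((Species 7, Species 8), (Species 9, Species 10)).
TREE = (("Species 7", "Species 8"), ("Species 9", "Species 10"))


def tips(node):
    """Return the set of tip names found at or below this node."""
    if isinstance(node, str):
        return frozenset([node])
    result = frozenset()
    for child in node:
        result |= tips(child)
    return result


def clades(node):
    """Return every clade in the tree: one set of tips per node, tips included."""
    found = {tips(node)}
    if not isinstance(node, str):
        for child in node:
            found |= clades(child)
    return found


def is_clade(group, tree):
    """True if the group is exactly one ancestor plus all of its descendants."""
    return frozenset(group) in clades(tree)


if __name__ == "__main__":
    # List the clades from smallest to largest.
    for clade in sorted(clades(TREE), key=lambda c: (len(c), sorted(c))):
        print(sorted(clade))
    print(is_clade({"Species 7", "Species 8"}, TREE))  # a monophyletic group
    print(is_clade({"Species 8", "Species 9"}, TREE))  # leaves out relatives
```

A group such as Species 8 and 9 fails the check because their most recent common ancestor also gave rise to Species 7 and 10, so the group leaves out some descendants.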
Branch length can be proportional to the number of state changes (a different morphological character or a genetic mutation) in that lineage, or it can represent chronological time. Researchers use multiple methods to construct trees, but for genetic data, all of these methods are based on the number of mutations (e.g., Single Nucleotide Polymorphisms, or SNPs; see Lab 8 for further information) among the taxa. Methods can be broadly categorized as distance-based (e.g., Neighbor Joining) or character-based (e.g., Maximum Likelihood). Distance-based methods effectively use percent similarity among sequences to construct trees, whereas character-based methods use advanced statistics that calculate the probability of one base changing to another in order to find the most likely tree, given the data. In one of today's exercises, you will be asked to construct a simple tree (effectively a distance-based method) based on the number of SNPs among different sequences. Then you will compare this tree to a Neighbor Joining tree published on the BOLD website.

Trees are not ladders. Aristotle is credited with the concept of The Great Chain of Being, also referred to as The Ladder of Life, to explain the hierarchy of matter, life, and God (Fig. 6). Aristotle proposed that both insentient things and sentient beings were consecutively ordered from the non-being to the highest being, with chains linking everything in between. This concept influenced much of Western thought and religion for centuries. Modern biologists, however, are careful to note that even though trends can be observed over evolutionary time (such as in the fossil record), evolution is not directed towards some perfect organism. Evolution does not have a goal. Phylogenies are specifically drawn as a branching tree, rather than a ladder with rungs leading to a pre-determined state.
So, even though trees are read in one direction, the direction does not correlate with advancement from an imperfect "lower" organism to a perfect "higher" organism.

Figure 6. Depiction of the Great Chain of Being, a concept envisioned by Aristotle.

PREPARATION FOR LAB
1. Read through the Introduction.
2. Research topics and terms that you are not familiar with or do not fully understand, particularly the italicized text in this lab manual.
3. Read through the Laboratory Procedure.
LABORATORY PROCEDURE

Exercise 1. The Great Clade Race
Goldsmith, D.W., 2003. The great clade race. The American Biology Teacher, 65(9):679-682

Overview: You will be given cards to organize into distinct groups. The process that you go through in order to develop a classification scheme mimics how evolutionary biologists use the method of cladistics to reconstruct cladograms.

Part 1. Preparing for the race.
1. Work as a group with other students at your bench.
2. You will be given a set of eight index cards.
3. Devise a way to organize the eight cards into groups. You may develop as many groups as you want. Each card must fit the criteria for only one group. To do this, you will need to consider the following: how is each card similar to the other cards? How is it different?

Part 2. The race.
Imagine that there are eight participants in a race. Runners start at the same starting point. At a few points along the race, the racecourse splits, and runners must choose either of two forks. Each fork leads to a different finish line. In order to keep track of the path chosen by each runner, there are stations along these pathways where each runner's card is stamped as the runner passes. Now imagine that the cards you were given in Part 1 were the cards carried by the eight runners in this race. The cards have enough information for you to recreate the racecourse.
1. Draw a map of the racecourse based on the stamps present on each card. You will need to draw the racecourse like a cladogram, with branches labeled with the symbols from the cards. This cladogram should look similar to the ones depicted in Figures 3 and 5.
2. Use the following rules of the race to create your cladogram:
   a. The race must be completed by each runner.
   b. One pathway can only branch into two new pathways.
   c. A branched pathway stays branched.
   d. The stations for stamping cards are located along straight lines between the points that branch.
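The logic of the race can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four cards and the stamp names below are invented (your set has eight cards with its own symbols), and the function names are hypothetical. The sketch relies on one consequence of the rules: because a branched pathway stays branched, the runners who share any one stamp must form a group that sits entirely inside, or entirely apart from, the group for any other stamp.

```python
from itertools import combinations

# Hypothetical cards: each runner's card lists the stamps collected on the way.
CARDS = {
    "A": {"circle", "star"},
    "B": {"circle", "moon"},
    "C": {"square", "triangle"},
    "D": {"square", "heart"},
}


def runners_by_stamp(cards):
    """Map each stamp to the set of runners whose cards carry it."""
    groups = {}
    for runner, stamps in cards.items():
        for stamp in stamps:
            groups.setdefault(stamp, set()).add(runner)
    return {stamp: frozenset(members) for stamp, members in groups.items()}


def is_consistent(groups):
    """True if every pair of groups is nested or completely separate."""
    for a, b in combinations(list(groups), 2):
        if not (a <= b or b <= a or a.isdisjoint(b)):
            return False
    return True


def racecourse_branches(cards):
    """Return (stamp, runners) pairs, largest groups (earliest forks) first."""
    groups = runners_by_stamp(cards)
    if not is_consistent(groups.values()):
        raise ValueError("stamps conflict: no single branching racecourse fits")
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    for stamp, runners in racecourse_branches(CARDS):
        print(stamp, sorted(runners))
```

Stamps shared by many runners mark stations near the start, and stamps carried by a single runner mark stations near that runner's finish line. This is the same reasoning you apply by hand when you draw the map.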
Exercise 2. Evolution of the Chocolate Bar
Burks, R.L. and Boles, L.C., 2007. Evolution of the chocolate bar: A creative approach to teaching phylogenetic relationships within evolutionary biology. The American Biology Teacher, 69(4):229-237

Overview. Candies will be used to model the phylogenetic relationships occurring in living organisms over time. The different characteristics of each candy are analogous to the traits of an organism. In this exercise, we can assume that consumer preference is the driving force ("selective pressure") for the evolution of the chocolate bar.

Part 1. Build a sample phylogeny
1. Obtain one package of candies. Remove each of the following candies: HB, MW, SB.
2. Among your group, discuss the characteristics of each candy. Decide what characters are useful in building cladograms.
3. Using these characters, propose relationships between each candy. What are some shared characteristics between the candies?
4. Construct one possible cladogram to explain the evolution of each candy bar. Can you infer the character of the common ancestor of these three candies?

Table 1. Possible candies used in Exercise 2, and candy abbreviations used in this lab manual.

CANDY            ABBREVIATION
Milky Way        MW
Twix             TB
Crunch Bar       CB
M&M's plain      MMPL
M&M's peanut     MMPE
Hershey Bar      HB
Snickers         SN
Kit Kat Bar      KKB

Part 2. Evolution of the Chocolate Bar
1. Use the remainder of the candies from your bag. Also include the three candies from Part 1 in this exercise. You should have six to eight candies to work with.
2. As a group, identify independent characters (characters that do not include other characters) for each candy. In the data matrix (Table 2), label each column with the character (the first one has been done for you). A data matrix is similar to a character table (Fig. 4a).
3. Score each of the candies in the data matrix. Enter 1 if the candy has the characteristic, or 0 if the candy does not have the characteristic (the first characteristic, "Bar," has been partially completed).
4. Physically arrange the candies into phylogenetic trees on the surface of your bench. Go through a few arrangements that are logical and do not require many evolutionary steps. What character represents the common ancestor of the candies?
5. Decide on 2-4 arrangements. Then, diagram each arrangement as a corresponding phylogenetic tree.
6. Complete the following for each tree:
   a. Identify the characters that were either gained or lost along each lineage shown on your tree (each gain or loss represents an evolutionary step). Draw a small line perpendicular to the branch, and label it with the corresponding step.
   b. Record how many steps it took to arrive at the completed tree.
   c. Biologists generally treat the simplest tree as the most likely one. Identify the most parsimonious tree (the tree with the fewest steps) constructed by your group.

Exercise 3. Analyzing Sequences
1. Each table has been provided with 2 copies of sequence data.
2. Work in groups of two to count the number of SNPs in a sequence, using the sequence on top (the first sequence) as your reference.
   o Count the SNPs for each sequence on your data sheet.
3. Compare your answers with the other group at your table. Did you get the same number of SNPs as the other group? If not, make sure you check your data; it is critical to have an accurate count of the number of SNPs.
4. Draw a phylogenetic tree that illustrates the relationship among the species for which you have sequences. Consider the sequence at the top to be the oldest species. Remember that species separated by fewer SNPs are considered more closely related. Consider how two species from the same genus might cluster together.
5. Compare your tree with the other group at your table. Did you get the same tree? If not, how are your trees different? Discuss your reasoning for your grouping with your tablemates. What are the pros and cons of each version? Remember, there are multiple ways to draw a tree; your trees don't have to be identical to be correct.
6. Download the sequence from Laulima. Copy and paste this sequence onto the BOLD website (http://www.boldsystems.org/index.php/ids_openidengine). Be sure to copy your sequence into the correct tab (hint: do you have a plant or an animal sequence?). On the results page for your sequence, click on the link that says "Tree Based Identification." On the next page, click on the link that says "View Tree."
7. How does the tree that you drew compare to the tree on the BOLD website? Correct any errors that you made on your tree.
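If you would like to double-check your hand counts, the SNP counting in Exercise 3 can be sketched in Python. The sequences below are invented for illustration and are not the lab data, and the function names are hypothetical. The sketch assumes that the sequences are already aligned and of equal length, so that each position can be compared directly. It only counts differences; it does not implement Neighbor Joining.

```python
from itertools import combinations

# Hypothetical aligned sequences of equal length; the first is the reference.
SEQUENCES = {
    "reference": "ATGCCGTAAC",
    "species_1": "ATGCCGTAAT",
    "species_2": "ATGACGTTAT",
    "species_3": "ATGACGTTGT",
}


def count_snps(seq_a, seq_b):
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def snps_vs_reference(sequences, reference_name):
    """Count SNPs in every other sequence relative to the reference."""
    reference = sequences[reference_name]
    return {
        name: count_snps(reference, seq)
        for name, seq in sequences.items()
        if name != reference_name
    }


def pairwise_snps(sequences):
    """Count SNPs between every pair of sequences (a simple distance table)."""
    return {
        (name_a, name_b): count_snps(sequences[name_a], sequences[name_b])
        for name_a, name_b in combinations(sequences, 2)
    }


if __name__ == "__main__":
    print(snps_vs_reference(SEQUENCES, "reference"))
    for pair, distance in pairwise_snps(SEQUENCES).items():
        print(pair, distance)
```

Pairs with the smallest counts are the ones you would join first when drawing your tree. In this made-up data set, species_2 and species_3 differ at a single position, so they would cluster together.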
Table 2. Data matrix for candies. See the Introduction (Building phylogenies) for the meaning of 0 and 1. The columns after "Bar" are left blank for the characters identified by your group.

CANDY    Bar
MW       1
CB
MMPL     0
MMPE
HB
SN
PBC
KKB
TB
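Counting the steps on a tree (Exercise 2, Part 2, step 6) can also be sketched in Python. The example below is an illustration only. It uses Fitch's method, a standard way to find the minimum number of character changes on a given branching tree, which is a swapped-in technique rather than the hand procedure described in this lab. Only the "Bar" scores for MW and MMPL come from Table 2; every other score, the "Caramel" and "Peanut" characters, and both trees are assumptions made for the example.

```python
# Hypothetical data matrix (1 = present, 0 = absent). Only the "Bar" scores
# for MW and MMPL come from Table 2; all other values are assumptions.
MATRIX = {
    "HB":   {"Bar": 1, "Caramel": 0, "Peanut": 0},
    "MW":   {"Bar": 1, "Caramel": 1, "Peanut": 0},
    "SN":   {"Bar": 1, "Caramel": 1, "Peanut": 1},
    "MMPL": {"Bar": 0, "Caramel": 0, "Peanut": 0},
}

# Two candidate arrangements written as nested pairs (each fork has two arms).
TREE_1 = ("MMPL", ("HB", ("MW", "SN")))
TREE_2 = (("MMPL", "SN"), ("HB", "MW"))


def fitch(node, character, matrix):
    """Return (possible ancestral states, minimum steps) for one character."""
    if isinstance(node, str):
        return {matrix[node][character]}, 0
    left, right = node
    left_states, left_steps = fitch(left, character, matrix)
    right_states, right_steps = fitch(right, character, matrix)
    steps = left_steps + right_steps
    common = left_states & right_states
    if common:
        # The two arms can agree on a state, so no change is needed here.
        return common, steps
    # The arms disagree, so one gain or loss must have happened.
    return left_states | right_states, steps + 1


def tree_steps(tree, matrix):
    """Total number of gains and losses needed across all characters."""
    characters = next(iter(matrix.values())).keys()
    return sum(fitch(tree, character, matrix)[1] for character in characters)


if __name__ == "__main__":
    for name, tree in (("Tree 1", TREE_1), ("Tree 2", TREE_2)):
        print(name, tree_steps(tree, MATRIX), "steps")
```

With these made-up scores, Tree 1 needs fewer steps than Tree 2, so it would be the more parsimonious of the two. Your own matrix and arrangements will give different counts.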
Lab 11 Homework
Due Week of April 16, 2018

Part 1: Mastering Biology (47 points). Answer the questions entitled "Lab 12 Describing Species Lab 12 Pre-lab" on the Mastering Biology site. You have until 11:59 PM the night before your lab to complete these questions. This is your final Mastering Biology pre-lab assignment.

Parts 2 & 3: Science Communication & Data Analysis (29 points). Answer the questions entitled "Lab 11A. Phylogenetics HOMEWORK" on the Mastering Biology site. You have until 6:00 PM on the day of your lab (week of April 16, 2018) to complete these questions.

Final Lab Report
Due in class, Week of April 23, 2018

Your final lab report is due in class the week of April 23, 2018. To be eligible for full points, you must turn in all previous drafts, including your peer and TA comments. Be sure to ask your TA for help if you have any questions. As a reminder, your final lab report is worth 15% of your final grade. Seventy points are assigned to the lab report (see the grading template posted in Lab 7).