Phylogeny Estimation and Hypothesis Testing using Maximum Likelihood


 Linette Arnold
For: Prof. Partensky
Group: Jimin Zhu, Rama Sharma, Sravanthi Polsani, Xin Gong, Shlomit Klopman
April
Table of Contents
Introduction
Understanding Maximum Likelihood through Coin Tossing
The Phylogenetic Tree
Methods for constructing phylogenetic trees
DNA the basis of Molecular Phylogenetics
Maximum Likelihood methods in DNA substitution
Likelihood Ratio tests
Conclusion
Bibliography
Introduction

Phylogeny is defined as the study of the relationships between life forms. It is part of a larger field called systematics, which also includes taxonomy. The field of phylogenetics is rather recent and has received a huge push forward from the development of stronger and faster computers. Phylogenetics helps paleontologists place organisms in their proper context throughout evolution. Using molecular phylogenetics, scientists trace the relationships between organisms by studying the similarities and differences in DNA and protein sequences. Today phylogenetics addresses questions in biological fields such as ecology, developmental biology, population genetics and epidemiology.

A major practical obstacle for the field is reaching a single, generally accepted account of the process of evolution. The evolutionary biologist is often uncertain which method of analysis should be used to explain the data. Moreover, when the same data are examined by different phylogenetic methods, the outcomes may differ and even contradict one another. Evolutionary biologists are left asking: which conclusion do I believe? Today more than ever, researchers agree that arriving at a correct species lineage is a matter of statistical relationships.

In this paper we review a strong statistical method, the Maximum Likelihood method, for establishing the most likely phylogenetic tree for a given data set. The Maximum Likelihood method was first described in 1922 by the English statistician R. A. Fisher. The method depends on a complete and specified data set and a probabilistic model that describes the data. By using this method the researcher ends up with the most likely explanation
(phylogenetic tree) for the data, and can also learn about the evolutionary process through hypothesis testing. Recent advances in DNA substitution models have made Maximum Likelihood a practical and dependable tool for phylogenetic analysis.

Understanding Maximum Likelihood through Coin Tossing

The aim of the Maximum Likelihood method is to find the parameter value(s) that make the observed data most likely; that is, to choose the value of the parameter that maximizes the probability of observing the data.

Probability: knowing the parameters → prediction of the outcome
Likelihood: observation of the data → estimation of the parameters

We toss a coin n times and record the number of heads. The probability distribution that describes this kind of scenario is the binomial distribution. The probability of observing h heads out of n tosses is:

$$\Pr[h \mid p, n] = \binom{n}{h} p^{h} (1-p)^{n-h}$$

This equation can be considered in two parts. The second part, $p^{h}(1-p)^{n-h}$, is the joint probability of obtaining h heads (and therefore n − h tails) if a coin is tossed n times and has probability p of landing heads on any one toss (and therefore probability 1 − p of landing tails). Because we have assumed that each of the n trials is independent and has constant probability, the joint probability of obtaining h heads and n − h tails is simply the product of all the individual probabilities. For example, if we obtained 2 heads and 3 tails in 5 coin tosses, this part is simply $p^{2}(1-p)^{3}$.
The first part of the binomial distribution function reflects the fact that there is more than one way to get h heads and n − h tails if a coin is tossed n times. Every one of these orderings is assumed to have an equal probability of occurring, and their number is the binomial coefficient

$$\binom{n}{h} = \frac{n!}{h!\,(n-h)!}$$

The likelihood function for the coin toss experiment, in which a binomial distribution is assumed, is:

$$L[p \mid h, n] = \binom{n}{h} p^{h} (1-p)^{n-h}$$

The likelihood function is obtained by multiplying the probability function for each toss of the coin. In the coin toss experiment we assume a Bernoulli distribution for each coin flip. Writing $x_i = 1$ if toss i lands heads and $x_i = 0$ otherwise, the likelihood function becomes:

$$L(p \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i \mid p) = p^{x_1}(1-p)^{1-x_1}\, p^{x_2}(1-p)^{1-x_2} \cdots p^{x_n}(1-p)^{1-x_n} = p^{\sum_{i=1}^{n} x_i}\,(1-p)^{\,n-\sum_{i=1}^{n} x_i}$$

Taking the natural log of the likelihood function does not change the value of p for which the likelihood is maximized. Applying the log to both sides of the equation gives:

$$\log L(p \mid x_1, \ldots, x_n) = \Big(\sum_{i=1}^{n} x_i\Big) \log p + \Big(n - \sum_{i=1}^{n} x_i\Big) \log(1-p)$$
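As a sketch of why the maximum of this log-likelihood falls at the observed proportion of heads, the derivation below sets its derivative to zero. The notation $x_i$ (1 for heads, 0 for tails) and $h = \sum_i x_i$ is an assumption made for the illustration.

```latex
\begin{align*}
\log L(p) &= h \log p + (n-h)\log(1-p), \qquad h = \sum_{i=1}^{n} x_i \\
\frac{d \log L}{dp} &= \frac{h}{p} - \frac{n-h}{1-p} = 0 \\
h\,(1-p) &= (n-h)\,p \\
\hat{p} &= \frac{h}{n}
\end{align*}
```

The second derivative, $-h/p^{2} - (n-h)/(1-p)^{2}$, is negative for $0 < p < 1$, so this stationary point is a maximum.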
The following graphs plot the likelihood (L) as a function of the probability (p) for four different possible outcomes when tossing a coin ten times. For the case in which 3 heads and 7 tails were the outcome of the experiment, the likelihood appears to be maximized at p = 0.3. Similarly, the likelihood appears to be maximized at p = 0.5 for 5 heads and 5 tails, at p = 0.8 for 8 heads and 2 tails, and at p = 0.9 for 9 heads and 1 tail. In each case the likelihood appears to be maximized when p equals the proportion of tosses in which heads appeared in our experiment. This illustrates the brute-force way to find the maximum likelihood estimate of p.
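The brute-force search described above can be sketched in a few lines of Python. This is only an illustration: the grid spacing of 0.001 and the function names `log_likelihood` and `grid_mle` are assumptions, not part of the original analysis.

```python
import math

def log_likelihood(p, heads, n):
    """Log-likelihood of p for `heads` heads in `n` tosses (binomial coefficient dropped)."""
    return heads * math.log(p) + (n - heads) * math.log(1 - p)

def grid_mle(heads, n, steps=999):
    """Evaluate the log-likelihood on an evenly spaced grid in (0, 1) and return the best p."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return max(grid, key=lambda p: log_likelihood(p, heads, n))

# The four ten-toss outcomes discussed in the text.
for heads in (3, 5, 8, 9):
    print(heads, "heads:", grid_mle(heads, 10))
```

The binomial coefficient is left out because it does not depend on p and so does not move the maximum.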
Say we toss a coin 100 times and observe 56 heads and 44 tails. Instead of assuming that p is 0.5, we want to find the Maximum Likelihood estimate of p. Then we want to ask whether or not this value differs significantly from 0.5. How do we do this? We find the value of p that makes the observed data most likely. As mentioned, the observed data are now fixed. They are constants that are plugged into our binomial probability model:

n = 100 (total number of tosses)
h = 56 (total number of heads)

Imagine that p was 0.5. Plugging this value into our probability model gives:

$$L(p = 0.5 \mid \text{data}) = \frac{100!}{56!\,44!}\; 0.5^{56}\, 0.5^{44} \approx 0.039$$

But what if p was 0.52 instead?

$$L(p = 0.52 \mid \text{data}) = \frac{100!}{56!\,44!}\; 0.52^{56}\, 0.48^{44} \approx 0.058$$

From this we can conclude that p is more likely to be 0.52 than 0.5. We can tabulate the likelihood for different parameter values to find the maximum likelihood estimate of p.

(Table: likelihood L for a range of values of p.)
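The comparison above is easy to reproduce numerically. The following Python sketch evaluates the binomial likelihood for the 56-heads-in-100 data at p = 0.5 and p = 0.52 and then tabulates it over a coarse grid; the helper name `binomial_likelihood` and the grid of 99 candidate values are assumptions made for the example.

```python
from math import comb

def binomial_likelihood(p, heads, n):
    """Binomial probability of `heads` heads in `n` tosses, read as a likelihood in p."""
    return comb(n, heads) * p**heads * (1 - p) ** (n - heads)

n, heads = 100, 56
l_050 = binomial_likelihood(0.50, heads, n)
l_052 = binomial_likelihood(0.52, heads, n)
print(f"L(0.50) = {l_050:.4f}, L(0.52) = {l_052:.4f}")

# Tabulate the likelihood over p = 0.01, 0.02, ..., 0.99 and keep the best value.
candidates = [k / 100 for k in range(1, 100)]
p_hat = max(candidates, key=lambda p: binomial_likelihood(p, heads, n))
print("grid estimate of p:", p_hat)
```

On this grid the largest likelihood is found at the observed proportion of heads, 56/100.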
If we graph the likelihood across the full range of possible values for p, we see that the maximum likelihood estimate for p seems to be 0.56. It is easy to see why this makes sense in this trivial example: the best estimate for p from any one sample is clearly going to be the proportion of heads observed in that sample. In such a simple example as this, nobody would use maximum likelihood estimation to evaluate p. But not all problems are this simple! As we shall see, the more complex the model and the greater the number of parameters, the harder it becomes to make even reasonable guesses at the Maximum Likelihood estimate.
The Phylogenetic Tree

Molecular phylogeny methods take a given set of aligned sequences and reconstruct phylogenetic trees, which aim to show the history of successive divergences that took place during evolution between the sequences considered and their common ancestor. There are two types of phylogenetic trees, rooted and unrooted. Biologically, a rooted tree has a node that represents the (known or inferred) common ancestor of all taxa considered; an unrooted tree specifies no common ancestor and only conveys how similar the taxa are to one another. Most phylogenetic methods construct unrooted trees.

(Figure: A. Unrooted tree; B. Rooted tree.)

A phylogenetic tree is a data structure, characterized by its topology (form) and its length (the sum of its branch lengths), that stores information regarding the relationship of several species, which is a measure of homology. Computationally, however, this relationship is usually expressed as an identity or similarity score $d_{i,j}$ between two entities (taxa, sequences, etc.).
Therefore, a phylogenetic tree is simply an arrangement of the data inherent in a multiple sequence alignment into a tree. This arrangement is useful to biologists because it organizes the species into their projected evolutionary history. Because of the way a phylogenetic tree is constructed (see the figure below), the node C represents a common ancestor. The distances a and b from the leaves to this common ancestor measure the evolutionary distance (time) between the leaves A and B.

(Figure: leaves A and B joined by branches a and b at node C, the common ancestor of A and B; t marks the time axis.)

The evolutionary distance between A and B is p= a b n, (n total number of species). The tree structure shows that two sequences are related, how they are related in the context of other sequences, and how distantly they are related.

Methods for constructing phylogenetic trees
There are three major methods for constructing phylogenetic trees: parsimony, Maximum Likelihood and distance matrices. Parsimony and Maximum Likelihood work directly on the sequences, while distance matrix methods work on them indirectly. If the sequences carry a strong phylogenetic signal, different methods are expected to produce the same phylogenetic tree. However, this is not always true. There is at present no statistical method that allows comparison of trees obtained from different phylogenetic methods; nevertheless, many studies have compared the relative consistency of the existing methods. A node is said to be "consistent" if the elements it contains are found 95% to 100% of the time. The consistency depends on many factors, among them the topology and branch lengths of the tree.

Distance matrix methods convert sequence data into a set of discrete pairwise distance values, arranged into a matrix. A tree is fitted to this matrix using a cluster analysis method, which estimates the distance for each pair as the sum of the branch lengths on the path from one species to the other. This method is easy to perform, quick to calculate and suitable for similar species.

Parsimony determines the minimum number of changes (substitutions) required to transform a species into its nearest neighbor.

The Maximum Likelihood method was first used by Edwards & Cavalli-Sforza in 1964, but since computers were not as fast as they are today, they found the problem too computationally difficult and ended up using parsimony. Neyman (1971) applied the Maximum Likelihood method to molecular sequences (amino acids and nucleotides) using a simple model that assumed that substitutions occur independently among sites. It was Felsenstein who, in 1981, implemented a general Maximum Likelihood approach for nucleotide sequence data.
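To make the first step of a distance matrix method concrete, the following Python sketch computes pairwise distances as the proportion of differing sites between aligned sequences. The choice of this simple proportion as the distance, the function names, and the third toy sequence are assumptions for illustration; the first two sequences are the short C/G example used later in this paper.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)

def distance_matrix(sequences):
    """Square matrix of pairwise distances, in the order the sequences are given."""
    return [[p_distance(a, b) for b in sequences] for a in sequences]

# Two sequences from the paper plus one hypothetical sequence for illustration.
toy_alignment = ["CCGGCCGCGCG", "CGGGCCGGCCG", "CCGGCCGCGCC"]
matrix = distance_matrix(toy_alignment)
for row in matrix:
    print(["%.3f" % d for d in row])
```

A clustering step would then fit a tree to this matrix; that step is not shown here.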
The data for molecular phylogenetic problems are the DNA nucleotides A, C, T and G. At every site one of four possible nucleotides could be present, so if we have four different sequences there are $4^{4} = 256$ possible site patterns. These data can be described using a multinomial distribution:

$$\Pr[n_1, n_2, \ldots, n_r \mid p_1, \ldots, p_r] = \binom{n}{n_1, n_2, \ldots, n_r} \prod_{i=1}^{r} p_i^{\,n_i}$$

where $\binom{n}{n_1, n_2, \ldots, n_r}$ is the number of ways that n objects can be grouped into r classes, $n_i$ is the number of observations of the ith site pattern, and $p_i$ is the probability that site pattern i occurs. A Maximum Likelihood estimate of $p_i$ is:

$$\hat{p}_i = \frac{n_i}{n}$$

that is, the estimated probability of the ith site pattern is the number of times it was observed divided by the total number of observations. While this equation provides a model for the question that Edwards & Cavalli-Sforza raised in 1964, it is of limited practical use. First of all, the equation cannot estimate tree topology, and it ignores the other biologically interesting parameters. To make the general model applicable to molecular phylogenetic estimation, especially for a tree structure, the following likelihood is used, assuming independence among sites:

$$L(\tau, v, \theta \mid x_1, \ldots, x_r) = \prod_{i=1}^{r} \Pr[x_i \mid \tau, v, \theta]^{\,n_i}$$

where $x_i$ is the ith site pattern, observed $n_i$ times; $\tau$ is a tree; v is a vector containing the lengths of the branches, either $v = (v_1, \ldots, v_{2n-2})$ for rooted trees or $v = (v_1, \ldots, v_{2n-3})$ for unrooted trees (n is the number of sequences); $\theta$ stands for the remaining parameters of the substitution model; and r is the total number of site patterns possible for s sequences. The multinomial coefficient $\binom{n}{n_1, \ldots, n_r}$ is a constant and is usually disregarded when calculating
the likelihood of a tree. Also, to speed up computation of the likelihood, the product is taken only over observed site patterns. In applying Maximum Likelihood, one has to check each of the $(2n-5)!/\big(2^{n-3}(n-3)!\big)$ possible unrooted bifurcating trees in turn, or each of the $(2n-3)!/\big(2^{n-2}(n-2)!\big)$ rooted trees, in order to find the Maximum Likelihood tree. Table 1 shows the number of unrooted and rooted trees that have to be visited depending on the number of sequences. The procedure of visiting all possible trees and calculating the likelihood for each is computationally expensive. However, there are many shortcuts that can substantially speed up the procedure. Felsenstein introduced an efficient method that calculates the likelihood by taking advantage of the tree topology when summing over all possible assignments of nucleotides to the internal nodes.

Basically, with the Maximum Likelihood method, the bases (nucleotides or amino acids) of all sequences at each site are considered separately (as independent), and the log-likelihood of having these bases is computed for a given topology using a particular probability model. This log-likelihood is summed over all sites, and the sum is maximized to estimate the branch lengths of the tree. The procedure is repeated for all possible topologies, and the topology with the highest likelihood is chosen as the final tree. Maximum Likelihood is usually consistent and can be extended to allow different rates for transitions and transversions.

DNA the basis of Molecular Phylogenetics

The DNA molecule is a polymer. The monomer units of DNA are called nucleotides, and the polymer is known as a "polynucleotide." Each nucleotide consists of a 5-carbon sugar (deoxyribose), a nitrogen-containing base attached to the sugar, and a phosphate group. There are four different types of nucleotides found in DNA, differing only in the nitrogenous base. The four nucleotides are given one-letter abbreviations (the first letter of their names): Adenine, Guanine, Cytosine, Thymine. Adenine and guanine are purines. Purines are the larger of the two types of bases found in DNA. Structures are shown below:
The 9 atoms that make up the fused rings (5 carbon, 4 nitrogen) are numbered 1 to 9. All ring atoms lie in the same plane. Cytosine and thymine are pyrimidines. The 6 ring atoms (4 carbon, 2 nitrogen) are numbered 1 to 6. As in purines, all pyrimidine ring atoms lie in the same plane. The deoxyribose sugar of the DNA backbone has 5 carbons and 3 oxygens. The carbon atoms are numbered 1', 2', 3', 4' and 5' to distinguish them from the numbering of the atoms of the purine and pyrimidine rings. The hydroxyl groups on the 5' and 3' carbons link to the phosphate groups to form the DNA backbone. Deoxyribose lacks a hydroxyl group at the 2' position when compared to ribose, the sugar component of RNA.
A nucleotide is a nucleoside with one or more phosphate groups covalently attached to the 3' and/or 5' hydroxyl group(s). The DNA backbone is a polymer with an alternating sugar-phosphate sequence. The deoxyribose sugars are joined at both the 3' hydroxyl and 5' hydroxyl groups to phosphate groups in ester links, also known as "phosphodiester" bonds. Below is an example of a DNA backbone with the following sequence: 5'-d(CGAAT). Here are some features of the 5'-d(CGAAT) structure:

- Alternating backbone of deoxyribose and phosphodiester groups
- Chain has a direction (known as polarity), 5' to 3' from top to bottom
- Oxygens (in red) of phosphates are polar and negatively charged
- A, G, C and T bases can extend away from the chain and stack atop each other
- Bases are hydrophobic
DNA is normally a double-stranded macromolecule: two polynucleotide chains held together by weak thermodynamic forces. Shown below is the structure of a double-stranded DNA molecule, also called a double helix. The features of the double helix are:

- Two DNA strands form a helical spiral, winding around a helix axis in a right-handed spiral
- The two polynucleotide chains run in opposite directions
- The sugar-phosphate backbones of the two DNA strands wind around the helix axis like the railing of a spiral staircase
- The bases of the individual nucleotides are on the inside of the helix, stacked on top of each other like the steps of a spiral staircase
Within the DNA double helix, A forms 2 hydrogen bonds with T on the opposite strand, and G forms 3 hydrogen bonds with C on the opposite strand. Below are examples of base pairs as found within the DNA double helix (dA-dT, dG-dC):
The dA-dT and dG-dC base pairs are the same length and occupy the same space within a DNA double helix. Therefore the DNA molecule has a uniform diameter. The dA-dT and dG-dC base pairs can be found in any order within a DNA molecule. The DNA helix axis is most apparent from a view directly down the axis. The sugar-phosphate backbone is on the outside of the helix, where the polar phosphate groups (red and yellow atoms) can interact with the polar environment. The nitrogen-containing bases (blue atoms) are inside, stacking perpendicular to the helix axis. Below is a view down the helix axis. Now that we are familiar with the structure of DNA, we can see how Maximum Likelihood is used to learn about the process of evolution.

Maximum Likelihood methods in DNA substitution

Sequences diverge from a common ancestor because mutations occur. A fraction of those mutations is fixed in the evolving population by selection and by chance, resulting in the substitution of one nucleotide for another at various sites. In order to reconstruct an evolutionary tree, we must make some assumptions about that substitution process. DNA substitution consists of two basic elements, composition and process. Composition (or equilibrium frequency) is defined as the proportion of the four nucleotides in a sequence when the substitution process approaches the equilibrium state. The process can be described by a matrix of numbers describing how the nucleotides change from one to another. This process can be described by a Binomial distribution. There are a total of $r = 4^{s}$ site patterns possible for s species. The probability of a nucleotide changing from one to
another is very small and the number of total site patterns could be very large. The Poisson distribution, which is a limiting case of the Binomial distribution, can describe this process. Therefore, all current implementations of likelihood estimation assume a time-homogeneous Poisson process to describe DNA substitutions. The Poisson process is time-homogeneous because the following assumptions are made:

- The occurrence of any nucleotide substitution in the time interval (a, b) is independent of the occurrence of any nucleotide substitution in the time interval (c, d), where (a, b) and (c, d) do not overlap.
- The probability of a nucleotide substitution in the time interval (t, t+h) is independent of t.
- The probability of a nucleotide substitution occurring in a small time interval is proportional to the length of the interval.

Suppose the number of substitutions s is a Poisson random variable with mean $\lambda t$ during t units of time, where $\lambda$ is the rate of substitution (relative to the unit of time) at a given site. The probability of s ≥ 0 substitutions at a site in a time period t is

$$\Pr[s \mid \lambda t] = \frac{e^{-\lambda t} (\lambda t)^{s}}{s!}$$

Thus, the probability of no changes occurring at a site is $e^{-\lambda t}$, and the probability of at least one substitution is $1 - e^{-\lambda t}$. Therefore, the transition probabilities with two character states can be described as follows (3):

$$P = \begin{pmatrix} \pi_i + (1-\pi_i)\,e^{-\lambda t} & \pi_j - \pi_j\,e^{-\lambda t} \\ \pi_i - \pi_i\,e^{-\lambda t} & \pi_j + (1-\pi_j)\,e^{-\lambda t} \end{pmatrix}$$
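A small Python sketch can be used to check the behaviour of such a two-state transition matrix. It assumes the standard form in which the probability of staying in state i is $\pi_i + (1-\pi_i)e^{-\lambda t}$ and the probability of moving to state j is $\pi_j - \pi_j e^{-\lambda t}$; the function names and the example values are illustrative only.

```python
import math

def poisson_no_change(lam, t):
    """Probability of zero substitutions in time t for a Poisson process with rate lam."""
    return math.exp(-lam * t)

def two_state_transition(pi_i, lam, t):
    """2x2 transition matrix for two states with equilibrium frequencies pi_i and 1 - pi_i."""
    pi_j = 1.0 - pi_i
    decay = poisson_no_change(lam, t)
    return [
        [pi_i + (1 - pi_i) * decay, pi_j - pi_j * decay],
        [pi_i - pi_i * decay, pi_j + (1 - pi_j) * decay],
    ]

# Hypothetical values: pi_i = 0.3, rate 0.5, at three time points.
for t in (0.0, 2.0, 50.0):
    print(t, two_state_transition(0.3, 0.5, t))
```

Each row sums to one, the matrix is the identity at t = 0, and for large t both rows approach the equilibrium frequencies.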
where $\pi_i$ and $\pi_j$ are the equilibrium frequencies of states i and j (i and j are nucleotides), $\lambda$ is the rate of change from i to j, and t is an arbitrary interval of time. A simplified model assumes that the probability of any nucleotide changing to any other nucleotide is equal. Let the nucleotide at some site be i at time t = 0, and let every equilibrium frequency equal ¼. The probability that at time t the site still shows i, P(ii), and the probability that it shows a particular different nucleotide j, P(ij), are $P(ii) = \tfrac{1}{4} + \tfrac{3}{4} e^{-\lambda t}$ and $P(ij) = \tfrac{1}{4} - \tfrac{1}{4} e^{-\lambda t}$, respectively. Fig. 1 is a plot of this simple case (with $\lambda$ = 0.5), produced with Mathematica:

(* no change vs. time t *)
Pii = Plot[1/4 + (3/4)*E^(-0.5*t), {t, 0.001, 10}, Frame -> True, PlotRange -> All]
(* change from i to j vs. time t *)
Pij = Plot[1/4 - (1/4)*E^(-0.5*t), {t, 0.001, 10}, Frame -> True, PlotRange -> All]
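The same two curves can be tabulated without plotting. The Python sketch below assumes the probabilities have the form $P(ii) = 1/4 + (3/4)e^{-\lambda t}$ and $P(ij) = 1/4 - (1/4)e^{-\lambda t}$ with the rate 0.5 used for the plot; the function name `jc_probabilities` is made up for the example.

```python
import math

def jc_probabilities(lam, t):
    """Return (P_ii, P_ij) for the equal-rates model at time t with rate lam."""
    decay = math.exp(-lam * t)
    p_same = 0.25 + 0.75 * decay   # site still shows the starting nucleotide
    p_diff = 0.25 - 0.25 * decay   # site shows one particular other nucleotide
    return p_same, p_diff

for t in (0.001, 1.0, 5.0, 10.0):
    p_same, p_diff = jc_probabilities(0.5, t)
    print(f"t = {t:6.3f}   P(ii) = {p_same:.4f}   P(ij) = {p_diff:.4f}")
```

Since there are three possible "other" nucleotides, P(ii) + 3 P(ij) equals one at every t, which is a quick sanity check on the two formulas.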
When t is very close to zero, the probability that the site has not changed, Pii, is very close to 1, while Pij, the probability that the nucleotide at that site has changed from i to j, is very close to 0. As time goes on, both probabilities approach ¼ (the equilibrium frequency). The time required for that to happen depends on $\lambda$. With the assumption above, and assuming independence among sites, the likelihood of two DNA sequences, L(A, B | t1, t2), can be represented as follows:

$$L = \left(\frac{1}{16}\right)^{n_1 + n_2} \left(1 + 3e^{-4\lambda t}\right)^{n_1} \left(1 - e^{-4\lambda t}\right)^{n_2}$$

where $\lambda$ is the rate of change, n1 sites remain the same and n2 sites change, and t is the sum of t1 and t2, which can also be considered the branch length from node A to node B. Suppose we have two nucleotide sequences in which, for simplicity, only the nucleotides C and G are present. For instance, the sequences might be:

Sequence A: CCGGCCGCGCG
Sequence B: CGGGCCGGCCG

The likelihood L(A, B | t1, t2) of these two sequences can be calculated (using Mathematica) as follows:

(* Likelihood vs. time interval t, assuming λ is known *)
Clear[t1, t2]; λ = 0.007; n1 = 8; n2 = 3;
Plot[(1/16)^(n1 + n2) (1 + 3 E^(-4 λ t))^n1 (1 - E^(-4 λ t))^n2, {t, 0, 100}, Frame -> True, PlotRange -> All]
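The maximum of this curve can also be located without a plot. The Python sketch below assumes the likelihood has the form $(1/16)^{n_1+n_2}(1+3e^{-4\lambda t})^{n_1}(1-e^{-4\lambda t})^{n_2}$; setting the derivative of its logarithm to zero gives $e^{-4\lambda t} = (3n_1 - n_2)/(3(n_1+n_2))$, which is the closed form used in `ml_branch_length`. The function names are illustrative.

```python
import math

def count_sites(seq_a, seq_b):
    """Return (number of identical sites, number of differing sites)."""
    same = sum(a == b for a, b in zip(seq_a, seq_b))
    return same, len(seq_a) - same

def pair_likelihood(t, n_same, n_diff, lam):
    """Likelihood of two sequences separated by total branch length t."""
    decay = math.exp(-4.0 * lam * t)
    return (1 / 16) ** (n_same + n_diff) * (1 + 3 * decay) ** n_same * (1 - decay) ** n_diff

def ml_branch_length(n_same, n_diff, lam):
    """Value of t that maximizes pair_likelihood (infinite if the sequences look saturated)."""
    x = (3 * n_same - n_diff) / (3 * (n_same + n_diff))
    if x <= 0:
        return math.inf
    return -math.log(x) / (4.0 * lam)

n1, n2 = count_sites("CCGGCCGCGCG", "CGGGCCGGCCG")
for lam in (0.007, 0.002):
    t_hat = ml_branch_length(n1, n2, lam)
    print(lam, n1, n2, round(t_hat, 2), pair_likelihood(t_hat, n1, n2, lam))
```

For the two sequences above (8 identical sites, 3 differing) and a rate of 0.007, this gives a branch length of roughly 16 units, close to the value of about 17 read off the graph. With the counts swapped to 3 and 8, the maximum moves beyond t = 100, outside the plotted range.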
The graph above shows that the likelihood reaches its maximum of about $1.4 \times 10^{-11}$ at a t (that is, t1 + t2) of about 17, so the distance between sequence A and sequence B is 17 arbitrary distance units. If n1 and n2 are changed to 3 and 8, respectively, the graph above becomes the graph below, and the curve is not as sharp as before. The maximum shifts dramatically to the right, and the t that gives the Maximum Likelihood is unclear (probably because the sequence is too short). It makes sense that the more identical sites there are, the shorter the distance between the sequences. It also makes sense that when $\lambda$ is smaller the distance between two sequences is larger. The comparison
of the Maximum Likelihood with small and large $\lambda$ is shown in the following two graphs. With a small $\lambda$ (0.002), the curve shifts to the right compared to the curve with a large $\lambda$ (0.007).

(Figures: likelihood curves for $\lambda$ = 0.007 and $\lambda$ = 0.002.)
The algorithm used to find the Maximum Likelihood for an arbitrary number of sequences is more complicated. A primary tree-building program that uses the Maximum Likelihood method is PAUP*, which is used for reconstruction of phylogenetic trees based on nucleic acid alignments. The simplest model of DNA substitution is the Jukes-Cantor (JC69) model (4). The JC69 model assumes that the base frequencies are equal ($\pi_A = \pi_C = \pi_G = \pi_T$) and that the rate of change from one nucleotide to another is the same for all possible changes. However, the JC69 model, like several other models, is simply a special case of a general model of DNA substitution for which the instantaneous rate matrix Q has the following form:

$$Q = \begin{pmatrix} \cdot & r_2\pi_C & r_4\pi_G & r_6\pi_T \\ r_1\pi_A & \cdot & r_8\pi_G & r_{10}\pi_T \\ r_3\pi_A & r_7\pi_C & \cdot & r_{12}\pi_T \\ r_5\pi_A & r_9\pi_C & r_{11}\pi_G & \cdot \end{pmatrix}$$

The rows and columns are ordered A, C, G and T. The matrix gives the rate of change from nucleotide i (arranged along the rows) to nucleotide j (along the columns). The r stands for the rate of change and $\pi$ stands for the base frequencies. The commonly used models of DNA substitution are based on this general model. They differ in the settings of two kinds of parameters: the nucleotide frequencies and the rates of change between two nucleotides.
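As an illustration of how the special cases arise from the general matrix, the Python sketch below builds a rate matrix from a set of relative rates and base frequencies and then fills it in with JC69 settings. It assumes each off-diagonal entry is the rate for i to j multiplied by the frequency of j, and that each diagonal entry is chosen so the row sums to zero (the usual convention for a rate matrix); the function name and the dictionary layout are illustrative choices.

```python
NUCLEOTIDES = "ACGT"

def rate_matrix(rates, freqs):
    """Build Q as nested dicts: q[i][j] = rates[(i, j)] * freqs[j], rows summing to zero."""
    q = {}
    for i in NUCLEOTIDES:
        row = {j: rates[(i, j)] * freqs[j] for j in NUCLEOTIDES if j != i}
        row[i] = -sum(row.values())  # diagonal entry balances the row
        q[i] = row
    return q

# JC69 as a special case: all twelve rates equal, all four frequencies equal.
jc_rates = {(i, j): 1.0 for i in NUCLEOTIDES for j in NUCLEOTIDES if i != j}
jc_freqs = {n: 0.25 for n in NUCLEOTIDES}
q_jc = rate_matrix(jc_rates, jc_freqs)

for i in NUCLEOTIDES:
    print(i, [q_jc[i][j] for j in NUCLEOTIDES])
```

Other models would be obtained by passing unequal frequencies, or rates that differ between transitions and transversions, to the same function.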
The nucleotide frequency is rather simple to understand. The models can be divided into two groups: those that assume equal nucleotide frequencies (JC69, K80, K3ST, SYM) and those that assume unequal nucleotide frequencies (F81, HKY85, TrN, GTR). The rate of change parameter is more complex and assigns equal or different probabilities to the changes (mutations) between two nucleotides. There can be two kinds of mutations between the four DNA nucleotides: transitions and transversions. A transition is a mutation between two nucleotides from the same chemical/structural group. For example:

purine transition: G ↔ A
pyrimidine transition: C ↔ T

A transversion is a mutation between nucleotides from different groups, that is, between a purine and a pyrimidine. The list below describes in greater detail the different mutations (r1 to r12) that are used in the above models. The nucleotide pairs are taken from the general Q matrix discussed previously.
r1: C>A transversion
r2: A>C transversion
r3: G>A purine transition
r4: A>G purine transition
r5: T>A transversion
r6: A>T transversion
r7: G>C transversion
r8: C>G transversion
r9: T>C pyrimidine transition
r10: C>T pyrimidine transition
r11: T>G transversion
r12: G>T transversion

Here is a simplified table of the eight models used in DNA substitution, listing their nucleotide frequencies and rates of change:

Model | Nucleotide frequencies | Rate of nucleotide change (transitions; transversions)
JC69 | Equal | Equal rates
K80 | Equal | Different rates
K3ST | Equal | Different rates
F81 | Different | Equal rates
HKY85 | Different | Different rates
TrN | Different | Transitions: rate for G<>A, rate for T<>C; transversions: same rate for all
SYM | Equal | Transitions: rate for G<>A, rate for T<>C; transversions: rate for C<>A, rate for T<>A, rate for G<>C, rate for T<>G
GTR | Different | Transitions: rate for G<>A, rate for T<>C; transversions: rate for C<>A, rate for T<>A, rate for G<>C, rate for T<>G

Using these models and the Maximum Likelihood method, it is possible to learn about the evolutionary process. This is done through hypothesis testing, which is explained in the following section.
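The transition and transversion labels in the list of rates r1 to r12 follow directly from the purine/pyrimidine grouping, as the following Python sketch shows. The function name `substitution_type` and the set constants are assumptions for the example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def substitution_type(src, dst):
    """Classify a change between two different nucleotides."""
    if src == dst:
        raise ValueError("no substitution: both nucleotides are the same")
    pair = {src, dst}
    same_group = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_group else "transversion"

# The twelve ordered changes, in the order r1 ... r12 used in the text.
changes = ["CA", "AC", "GA", "AG", "TA", "AT", "GC", "CG", "TC", "CT", "TG", "GT"]
for index, (src, dst) in enumerate(changes, start=1):
    print(f"r{index}: {src}>{dst} {substitution_type(src, dst)}")
```

Of the twelve possible changes, four are transitions and eight are transversions, which is why models often give the two classes separate rates.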
Likelihood Ratio tests

All methods in the field of phylogenetics make assumptions about the process of evolution. A common assumption, for example, is that a bifurcating tree describes the relationship between species. Another common assumption, used in DNA substitution, is that the nucleotide sites in a DNA sequence are independent of each other. The Maximum Likelihood method makes explicit assumptions about the evolutionary process, which allows the scientist not only to estimate the relationship between species but also to learn about the process of evolution through hypothesis testing.

In hypothesis testing we establish a hypothesis, referred to as the null hypothesis, and an alternative hypothesis, which usually contradicts the null hypothesis. Next, we use two competing models to try to explain our data (one model fits the null hypothesis and the other fits the alternative hypothesis). We compute the Maximum Likelihood given our data for each model and find their ratio as follows:

$$\text{Ratio} = \frac{\text{ML of null hypothesis}}{\text{ML of alternative hypothesis}}$$

A ratio of less than one means the data are more likely under the alternative hypothesis, which counts against the null hypothesis; a ratio greater than one counts against the alternative hypothesis. When the null model is a special case of the alternative model, the ratio cannot exceed one, so significance is usually judged by comparing $-2 \log(\text{Ratio})$ with a chi-square distribution whose degrees of freedom equal the difference in the number of free parameters.

To help us understand species divergence through DNA substitution, we can subject different models of DNA substitution to hypothesis testing. For this purpose, we introduce 8 different DNA
substitution models. These models are special cases of the general substitution model (with the Q matrix), and some are special cases of one another. This means that we can arrange the different models in a hierarchy, starting with the model that has only one free parameter and going all the way through to the most complex model. Using likelihood ratio tests, we can determine whether a particular parameter (each of which corresponds to a hypothesis) provides a significant increase in the likelihood. For example, the first hypothesis questions equal base frequencies. The null hypothesis states that the 4 nucleotides are found in equal frequencies in a given sequence (model JC69). The alternative hypothesis states the opposite, that the nucleotide frequencies are not the same in a given sequence (model F81). To reach a decision, we subject the two models to a likelihood ratio test. In this case the ratio was less than one, so the null hypothesis is rejected and we conclude that the four nucleotides appear in different frequencies in a given sequence.

Conclusion

The Maximum Likelihood method has become a practical tool in phylogenetics because of recent advances in DNA substitution models, computer programs and faster computers. One of the strengths of the Maximum Likelihood method of phylogenetic estimation is the ease with which hypotheses can be formulated and tested. Since the assumptions used in this method are explicit and clear, statistical tests of phylogenetic hypotheses can be formulated. The Maximum Likelihood method provides a uniform framework for the evaluation of alternative hypotheses. Likelihood ratio tests can be applied to questions for which the null distribution is difficult to determine analytically.
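Returning to the likelihood ratio test between a simpler and a more complex model, such as JC69 against F81: the calculation can be sketched in Python as follows. The sketch uses the common chi-square approximation for nested models, which is an assumption added here; the log-likelihood values are hypothetical and the degrees of freedom (3, for the three extra free base frequencies in F81) are stated as an assumption as well.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df (closed-form series)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = math.exp(-half)
        total = term
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

def likelihood_ratio_test(log_l_null, log_l_alt, df):
    """Return (-2 log ratio, approximate p-value) for nested models."""
    statistic = 2.0 * (log_l_alt - log_l_null)
    return statistic, chi2_sf(statistic, df)

# Hypothetical maximized log-likelihoods for a null and an alternative model.
statistic, p_value = likelihood_ratio_test(-1000.0, -990.0, df=3)
print(statistic, p_value)
```

A small p-value indicates that the extra parameters of the alternative model improve the likelihood by more than chance alone would explain.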
Bibliography

1. Ewens, W.J. & Grant, G.R. Statistical Methods in Bioinformatics: An Introduction. New York: Springer.
2. Huelsenbeck, J.P. & Crandall, K.A. Phylogeny estimation and hypothesis testing using maximum likelihood. Annu. Rev. Ecol. Syst. 28.
3. Jukes, T.H. & Cantor, C.R. Evolution of protein molecules. In Mammalian Protein Metabolism, ed. H.M. Munro. New York: Academic.
4. Kimura, M. A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16.
5. Kimura, M. Estimation of evolutionary distances between homologous nucleotide sequences. Proc. Natl. Acad. Sci. USA 78.
6. Felsenstein, J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17.
7. Hasegawa, M., Kishino, K. & Yano, T. Dating the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22.
8. Tamura, K. & Nei, M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10.
9. Zharkikh, A. Estimation of evolutionary distances between nucleotide sequences. J. Mol. Evol. 39.
10. Lanave, C., Preparata, G., Saccone, C. & Serio, G. A new method for calculating evolutionary substitution rates. J. Mol. Evol. 20:86-93.
More informationConstructing Evolutionary/Phylogenetic Trees
Constructing Evolutionary/Phylogenetic Trees 2 broad categories: istancebased methods Ultrametric Additive: UPGMA Transformed istance NeighborJoining Characterbased Maximum Parsimony Maximum Likelihood
More informationPhylogenetics: Building Phylogenetic Trees
1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should
More informationThere are two types of polysaccharides in cell: glycogen and starch Starch and glycogen are polysaccharides that function to store energy Glycogen Glucose obtained from primary sources either remains soluble
More informationSubstitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A
GAGATC 3:G A 6:C T Common Ancestor ACGATC 1:A G 2:C A Substitution = Mutation followed 5:T C by Fixation GAAATT 4:A C 1:G A AAAATT GAAATT GAGCTC ACGACC Chimp Human Gorilla Gibbon AAAATT GAAATT GAGCTC ACGACC
More informationBerg Tymoczko Stryer Biochemistry Sixth Edition Chapter 1:
Berg Tymoczko Stryer Biochemistry Sixth Edition Chapter 1: Biochemistry: An Evolving Science Tips on note taking... Remember copies of my lectures are available on my webpage If you forget to print them
More informationPOPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics
POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics  in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa.  before we review the
More informationMaximum Likelihood Until recently the newest method. Popularized by Joseph Felsenstein, Seattle, Washington.
Maximum Likelihood This presentation is based almost entirely on Peter G. Fosters  "The Idiot s Guide to the Zen of Likelihood in a Nutshell in Seven Days for Dummies, Unleashed. http://www.bioinf.org/molsys/data/idiots.pdf
More informationMassachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution
Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral
More informationPHYLOGENY ESTIMATION AND HYPOTHESIS TESTING USING MAXIMUM LIKELIHOOD
Annu. Rev. Ecol. Syst. 1997. 28:437 66 Copyright c 1997 by Annual Reviews Inc. All rights reserved PHYLOGENY ESTIMATION AND HYPOTHESIS TESTING USING MAXIMUM LIKELIHOOD John P. Huelsenbeck Department of
More informationPhylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University
Phylogenetics: Building Phylogenetic Trees COMP 571  Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary
More information"Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky
MOLECULAR PHYLOGENY "Nothing in biology makes sense except in the light of evolution Theodosius Dobzhansky EVOLUTION  theory that groups of organisms change over time so that descendeants differ structurally
More informationMaximum Likelihood Tree Estimation. Carrie Tribble IB Feb 2018
Maximum Likelihood Tree Estimation Carrie Tribble IB 200 9 Feb 2018 Outline 1. Tree building process under maximum likelihood 2. Key differences between maximum likelihood and parsimony 3. Some fancy extras
More informationCh 3: Chemistry of Life. Chemistry Water Macromolecules Enzymes
Ch 3: Chemistry of Life Chemistry Water Macromolecules Enzymes Chemistry Atom = smallest unit of matter that cannot be broken down by chemical means Element = substances that have similar properties and
More informationPhylogenetics: Distance Methods. COMP Spring 2015 Luay Nakhleh, Rice University
Phylogenetics: Distance Methods COMP 571  Spring 2015 Luay Nakhleh, Rice University Outline Evolutionary models and distance corrections Distancebased methods Evolutionary Models and Distance Correction
More informationMutation models I: basic nucleotide sequence mutation models
Mutation models I: basic nucleotide sequence mutation models Peter Beerli September 3, 009 Mutations are irreversible changes in the DNA. This changes may be introduced by chance, by chemical agents, or
More informationPhylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz
Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels
More information"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200 Spring 2018 University of California, Berkeley
"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200 Spring 2018 University of California, Berkeley D.D. Ackerly Feb. 26, 2018 Maximum Likelihood Principles, and Applications to
More information2: CHEMICAL COMPOSITION OF THE BODY
1 2: CHEMICAL COMPOSITION OF THE BODY Although most students of human physiology have had at least some chemistry, this chapter serves very well as a review and as a glossary of chemical terms. In particular,
More informationSome of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!
Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis
More informationConstructing Evolutionary/Phylogenetic Trees
Constructing Evolutionary/Phylogenetic Trees 2 broad categories: Distancebased methods Ultrametric Additive: UPGMA Transformed Distance NeighborJoining Characterbased Maximum Parsimony Maximum Likelihood
More informationBioinformatics 1. Sepp Hochreiter. Biology, Sequences, Phylogenetics Part 4. Bioinformatics 1: Biology, Sequences, Phylogenetics
Bioinformatics 1 Biology, Sequences, Phylogenetics Part 4 Sepp Hochreiter Klausur Mo. 30.01.2011 Zeit: 15:30 17:00 Raum: HS14 Anmeldung Kusss Contents Methods and Bootstrapping of Maximum Methods Methods
More informationThe biomolecules of terrestrial life
Functional groups in biomolecules Groups of atoms that are responsible for the chemical properties of biomolecules The biomolecules of terrestrial life Planets and Astrobiology (20172018) G. Vladilo 1
More informationBIOCHEMISTRY GUIDED NOTES  AP BIOLOGY
BIOCHEMISTRY GUIDED NOTES  AP BIOLOGY ELEMENTS AND COMPOUNDS  anything that has mass and takes up space.  cannot be broken down to other substances.  substance containing two or more different elements
More informationTree of Life iological Sequence nalysis Chapter http://tolweb.org/tree/ Phylogenetic Prediction ll organisms on Earth have a common ancestor. ll species are related. The relationship is called a phylogeny
More informationSome of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks!
Some of these slides have been borrowed from Dr. Paul Lewis, Dr. Joe Felsenstein. Thanks! Paul has many great tools for teaching phylogenetics at his web site: http://hydrodictyon.eeb.uconn.edu/people/plewis
More informationPhylogenetic Tree Reconstruction
I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven
More informationMATHEMATICAL MODELS  Vol. III  Mathematical Modeling and the Human Genome  Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME
MATHEMATICAL MODELING AND THE HUMAN GENOME Hilary S. Booth Australian National University, Australia Keywords: Human genome, DNA, bioinformatics, sequence analysis, evolution. Contents 1. Introduction:
More informationLecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM).
1 Bioinformatics: Indepth PROBABILITY & STATISTICS Spring Semester 2011 University of Zürich and ETH Zürich Lecture 4: Evolutionary models and substitution matrices (PAM and BLOSUM). Dr. Stefanie Muff
More informationPhylogenetic inference
Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis) advantages of different information types
More informationInDel 35. InDel 89. InDel 35. InDel 89. InDel InDel 89
Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic
More informationWeek 5: Distance methods, DNA and protein models
Week 5: Distance methods, DNA and protein models Genome 570 February, 2016 Week 5: Distance methods, DNA and protein models p.1/69 A tree and the expected distances it predicts E A 0.08 0.05 0.06 0.03
More informationEVOLUTIONARY DISTANCES
EVOLUTIONARY DISTANCES FROM STRINGS TO TREES Luca Bortolussi 1 1 Dipartimento di Matematica ed Informatica Università degli studi di Trieste luca@dmi.units.it Trieste, 14 th November 2007 OUTLINE 1 STRINGS:
More informationBase pairing in DNA.
TFY4215 Kjemisk fysikk og kvantemekanikk Våren 2007 Chemical physics Exercise 3 To be delivered by: Tuesday 08.05. Base pairing in DNA. Introduction DNA, deoxyribonucleic acid are the molecules that contain
More informationMicrobiology with Diseases by Taxonomy, 5e (Bauman) Chapter 2 The Chemistry of Microbiology. 2.1 Multiple Choice Questions
Microbiology with Diseases by Taxonomy, 5e (Bauman) Chapter 2 The Chemistry of Microbiology 2.1 Multiple Choice Questions 1) Which of the following does not contribute significantly to the mass of an atom?
More informationChapter 7: Models of discrete character evolution
Chapter 7: Models of discrete character evolution pdf version R markdown to recreate analyses Biological motivation: Limblessness as a discrete trait Squamates, the clade that includes all living species
More informationEfficiencies of maximum likelihood methods of phylogenetic inferences when different substitution models are used
Molecular Phylogenetics and Evolution 31 (2004) 865 873 MOLECULAR PHYLOGENETICS AND EVOLUTION www.elsevier.com/locate/ympev Efficiencies of maximum likelihood methods of phylogenetic inferences when different
More informationLecture 4. Models of DNA and protein change. Likelihood methods
Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/39
More informationProbabilistic modeling and molecular phylogeny
Probabilistic modeling and molecular phylogeny Anders Gorm Pedersen Molecular Evolution Group Center for Biological Sequence Analysis Technical University of Denmark (DTU) What is a model? Mathematical
More informationEvolutionary Models. Evolutionary Models
Edit Operators In standard pairwise alignment, what are the allowed edit operators that transform one sequence into the other? Describe how each of these edit operations are represented on a sequence alignment
More informationImproving divergence time estimation in phylogenetics: more taxa vs. longer sequences
Mathematical Statistics Stockholm University Improving divergence time estimation in phylogenetics: more taxa vs. longer sequences Bodil Svennblad Tom Britton Research Report 2007:2 ISSN 6500377 Postal
More informationModel Worksheet Teacher Key
Introduction Despite the complexity of life on Earth, the most important large molecules found in all living things (biomolecules) can be classified into only four main categories: carbohydrates, lipids,
More informationLie Markov models. Jeremy Sumner. School of Physical Sciences University of Tasmania, Australia
Lie Markov models Jeremy Sumner School of Physical Sciences University of Tasmania, Australia Stochastic Modelling Meets Phylogenetics, UTAS, November 2015 Jeremy Sumner Lie Markov models 1 / 23 The theory
More informationDNA Phylogeny. Signals and Systems in Biology Kushal EE, IIT Delhi
DNA Phylogeny Signals and Systems in Biology Kushal Shah @ EE, IIT Delhi Phylogenetics Grouping and Division of organisms Keeps changing with time Splitting, hybridization and termination Cladistics :
More informationC3020 Molecular Evolution. Exercises #3: Phylogenetics
C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 15 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from
More informationCHAPTERS 2425: Evidence for Evolution and Phylogeny
CHAPTERS 2425: Evidence for Evolution and Phylogeny 1. For each of the following, indicate how it is used as evidence of evolution by natural selection or shown as an evolutionary trend: a. Paleontology
More informationClass 10 Heredity and Evolution CBSE Solved Test paper1
Class 10 Heredity and Evolution CBSE Solved Test paper1 Q.1.What is heredity? Ans : Heredity refers to the transmission of characters or traits from the parents to their offspring. Q.2. Name the plant
More informationReading for Lecture 13 Release v10
Reading for Lecture 13 Release v10 Christopher Lee November 15, 2011 Contents 1 Evolutionary Trees i 1.1 Evolution as a Markov Process...................................... ii 1.2 Rooted vs. Unrooted Trees........................................
More informationName: Date: Period: Biology Notes: Biochemistry Directions: Fill this out as we cover the following topics in class
Name: Date: Period: Biology Notes: Biochemistry Directions: Fill this out as we cover the following topics in class Part I. Water Water Basics Polar: part of a molecule is slightly, while another part
More informationKaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging
Method KaKs Calculator: Calculating Ka and Ks Through Model Selection and Model Averaging Zhang Zhang 1,2,3#, Jun Li 2#, XiaoQian Zhao 2,3, Jun Wang 1,2,4, Gane KaShu Wong 2,4,5, and Jun Yu 1,2,4 * 1
More informationAdditive distances. w(e), where P ij is the path in T from i to j. Then the matrix [D ij ] is said to be additive.
Additive distances Let T be a tree on leaf set S and let w : E R + be an edgeweighting of T, and assume T has no nodes of degree two. Let D ij = e P ij w(e), where P ij is the path in T from i to j. Then
More informationThe body has three primary lines of defense against changes in hydrogen ion concentration in the body fluids.
ph and Nucleic acids Hydrogen Ion (H+) concentration is precisely regulated. The H+ concentration in the extracellular fluid is maintained at a very low level, averaging 0.00000004Eq/L. normal variations
More information7. Tests for selection
Sequence analysis and genomics 7. Tests for selection Dr. Katja Nowick Group leader TFome and Transcriptome Evolution Bioinformatics group PaulFlechsigInstitute for Brain Research www. nowicklab.info
More informationHow Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building
How Molecules Evolve Guest Lecture: Principles and Methods of Systematic Biology 11 November 2013 Chris Simon Approaching phylogenetics from the point of view of the data Understanding how sequences evolve
More information1. (5) Draw a diagram of an isomeric molecule to demonstrate a structural, geometric, and an enantiomer organization.
Organic Chemistry Assignment Score. Name Sec.. Date. Working by yourself or in a group, answer the following questions about the Organic Chemistry material. This assignment is worth 35 points with the
More informationFull file at
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) Which of the following is an uncharged particle found in the nucleus of 1) an atom and which has
More informationLab 9: Maximum Likelihood and Modeltest
Integrative Biology 200A University of California, Berkeley "PRINCIPLES OF PHYLOGENETICS" Spring 2010 Updated by Nick Matzke Lab 9: Maximum Likelihood and Modeltest In this lab we re going to use PAUP*
More informationConcepts and Methods in Molecular Divergence Time Estimation
Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks
More information"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2011 University of California, Berkeley
"PRINCIPLES OF PHYLOGENETICS: ECOLOGY AND EVOLUTION" Integrative Biology 200B Spring 2011 University of California, Berkeley B.D. Mishler Feb. 1, 2011. Qualitative character evolution (cont.)  comparing
More informationSequence Analysis 17: lecture 5. Substitution matrices Multiple sequence alignment
Sequence Analysis 17: lecture 5 Substitution matrices Multiple sequence alignment Substitution matrices Used to score aligned positions, usually of amino acids. Expressed as the loglikelihood ratio of
More informationBINF6201/8201. Molecular phylogenetic methods
BINF60/80 Molecular phylogenetic methods 0706 Phylogenetics Ø According to the evolutionary theory, all life forms on this planet are related to one another by descent. Ø Traditionally, phylogenetics
More informationCh. 2 BASIC CHEMISTRY. Copyright 2010 Pearson Education, Inc.
Ch. 2 BASIC CHEMISTRY Matter and Composition of Matter Definition: Anything that has mass and occupies space Matter is made up of elements An element cannot be broken down by ordinary chemical means Atoms
More informationIntegrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley. Parsimony & Likelihood [draft]
Integrative Biology 200 "PRINCIPLES OF PHYLOGENETICS" Spring 2016 University of California, Berkeley K.W. Will Parsimony & Likelihood [draft] 1. Hennig and Parsimony: Hennig was not concerned with parsimony
More informationAlgorithmic Methods Welldefined methodology Tree reconstruction those that are welldefined enough to be carried out by a computer. Felsenstein 2004,
Tracing the Evolution of Numerical Phylogenetics: History, Philosophy, and Significance Adam W. Ferguson Phylogenetic Systematics 26 January 2009 Inferring Phylogenies Historical endeavor Darwin 1837
More informationTheory of Evolution Charles Darwin
Theory of Evolution Charles arwin 85859: Origin of Species 5 year voyage of H.M.S. eagle (8336) Populations have variations. Natural Selection & Survival of the fittest: nature selects best adapted varieties
More informationMolecular Evolution, course # Final Exam, May 3, 2006
Molecular Evolution, course #27615 Final Exam, May 3, 2006 This exam includes a total of 12 problems on 7 pages (including this cover page). The maximum number of points obtainable is 150, and at least
More informationPHYLOGENY AND SYSTEMATICS
AP BIOLOGY EVOLUTION/HEREDITY UNIT Unit 1 Part 11 Chapter 26 Activity #15 NAME DATE PERIOD PHYLOGENY AND SYSTEMATICS PHYLOGENY Evolutionary history of species or group of related species SYSTEMATICS Study
More informationBayesian Models for Phylogenetic Trees
Bayesian Models for Phylogenetic Trees Clarence Leung* 1 1 McGill Centre for Bioinformatics, McGill University, Montreal, Quebec, Canada ABSTRACT Introduction: Inferring genetic ancestry of different species
More informationTHEORY. Based on sequence Length According to the length of sequence being compared it is of following two types
Exp 11 THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between
More informationBayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies
Bayesian Inference using Markov Chain Monte Carlo in Phylogenetic Studies 1 What is phylogeny? Essay written for the course in Markov Chains 2004 Torbjörn Karfunkel Phylogeny is the evolutionary development
More informationA Statistical Test of Phylogenies Estimated from Sequence Data
A Statistical Test of Phylogenies Estimated from Sequence Data WenHsiung Li Center for Demographic and Population Genetics, University of Texas A simple approach to testing the significance of the branching
More informationChapter Two: The Chemistry of Biology. The molecules of life make up the structure of cells Chemistry of biological molecule
Chapter Two: The Chemistry of Biology The molecules of life make up the structure of cells Chemistry of biological molecule Atoms and Elements: Atoms: The basic units of all matter, containing three major
More informationPhylogenetics. BIOL 7711 Computational Bioscience
Consortium for Comparative Genomics! University of Colorado School of Medicine Phylogenetics BIOL 7711 Computational Bioscience Biochemistry and Molecular Genetics Computational Bioscience Program Consortium
More information9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)
I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by
More informationChapter 002 The Chemistry of Biology
Chapter 002 The Chemistry of Biology Multiple Choice Questions 1. Anything that occupies space and has mass is called A. Atomic B. Living C. Matter D. Energy E. Space 2. The electrons of an atom are A.
More informationA (short) introduction to phylogenetics
A (short) introduction to phylogenetics Thibaut Jombart, MariePauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field
More informationEVOLUTIONARY DISTANCE MODEL BASED ON DIFFERENTIAL EQUATION AND MARKOV PROCESS
August 0 Vol 4 No 0050 JATIT & LLS All rights reserved ISSN: 998645 wwwjatitorg EISSN: 8795 EVOLUTIONAY DISTANCE MODEL BASED ON DIFFEENTIAL EUATION AND MAKOV OCESS XIAOFENG WANG College of Mathematical
More informationPhylogenetic methods in molecular systematics
Phylogenetic methods in molecular systematics Niklas Wahlberg Stockholm University Acknowledgement Many of the slides in this lecture series modified from slides by others www.dbbm.fiocruz.br/james/lectures.html
More informationBiological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor
Biological Networks:,, and via Relative Description Length By: Tamir Tuller & Benny Chor Presented by: Noga Grebla Content of the presentation Presenting the goals of the research Reviewing basic terms
More informationAlgorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment
Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot
More informationEstimating Divergence Dates from Molecular Sequences
Estimating Divergence Dates from Molecular Sequences Andrew Rambaut and Lindell Bromham Department of Zoology, University of Oxford The ability to date the time of divergence between lineages using molecular
More information2) Matter composed of a single type of atom is known as a(n) 2) A) element. B) mineral. C) electron. D) compound. E) molecule.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) Which of the following is a particle found in the nucleus of an atom and that has no electrical
More informationIn: P. Lemey, M. Salemi and A.M. Vandamme (eds.). To appear in: The. Chapter 4. Nucleotide Substitution Models
In: P. Lemey, M. Salemi and A.M. Vandamme (eds.). To appear in: The Phylogenetic Handbook. 2 nd Edition. Cambridge University Press, UK. (final version 21. 9. 2006) Chapter 4. Nucleotide Substitution
More informationEvolutionary Analysis of Viral Genomes
University of Oxford, Department of Zoology Evolutionary Biology Group Department of Zoology University of Oxford South Parks Road Oxford OX1 3PS, U.K. Fax: +44 1865 271249 Evolutionary Analysis of Viral
More informationThe Molecules of Life Chapter 2
The Molecules of Life Chapter 2 Core concepts 1.The atom is the fundamental unit of matter. 2.Atoms can combine to form molecules linked by chemical bonds. 3.Water is essential for life. 4.Carbon is the
More informationChapter 2 The Chemistry of Biology. Dr. Ramos BIO 370
Chapter 2 The Chemistry of Biology Dr. Ramos BIO 370 2 Atoms, Bonds, and Molecules Matter  all materials that occupy space and have mass Matter is composed of atoms. Atom simplest form of matter not divisible
More informationBioinformatics. Scoring Matrices. David Gilbert Bioinformatics Research Centre
Bioinformatics Scoring Matrices David Gilbert Bioinformatics Research Centre www.brc.dcs.gla.ac.uk Department of Computing Science, University of Glasgow Learning Objectives To explain the requirement
More informationBiology I Fall Semester Exam Review 2014
Biology I Fall Semester Exam Review 2014 Biomolecules and Enzymes (Chapter 2) 8 questions Macromolecules, Biomolecules, Organic Compunds Elements *From the Periodic Table of Elements Subunits Monomers,
More informationName: Date: Hour: Unit Four: Cell Cycle, Mitosis and Meiosis. Monomer Polymer Example Drawing Function in a cell DNA
Unit Four: Cell Cycle, Mitosis and Meiosis I. Concept Review A. Why is carbon often called the building block of life? B. List the four major macromolecules. C. Complete the chart below. Monomer Polymer
More information