The nonsynonymous/synonymous substitution rate ratio versus the radical/conservative replacement rate ratio in the evolution of mammalian genes


MBE Advance Access published July,

Kousuke Hanada, Shin-Han Shiu and Wen-Hsiung Li*

Department of Ecology and Evolution, University of Chicago, Chicago, IL
Department of Plant Biology, Michigan State University, East Lansing, MI

Running head: Ka/Ks ratio vs radical/conservative replacement ratio

Key words: positive selection, radical substitution, conservative substitution, classification of amino acids, development.

*Corresponding author. Wen-Hsiung Li, Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA. whli@uchicago.edu

The Author 00. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oxfordjournals.org

Abstract

There are two ways to infer selection pressures in the evolution of protein-coding genes: the ratio of the nonsynonymous to the synonymous substitution rate (KA/KS) and the ratio of the radical to the conservative amino acid replacement rate (KR/KC). Because the KR/KC ratio depends on how radical and conservative changes are defined by the classification of amino acids, we developed an amino acid classification that maximizes the correlation between KA/KS and KR/KC. An analysis of orthologous gene groups among five mammalian species shows that our classification gives a significantly higher correlation coefficient between the two ratios than existing classifications do. However, many orthologous gene groups have a low KA/KS but a high KR/KC ratio. Examining the functions of these genes, we found an overrepresentation of functional categories related to development. To determine whether the overrepresentation is stage specific, we examined the expression patterns of these genes at different developmental stages of the mouse. Interestingly, these genes are highly expressed in the early middle stage of development (Blastocyst to Amnion). It is commonly thought that developmental genes tend to be conservative in evolution, but some molecular changes in developmental stages should have contributed to morphological divergence in adult mammals. Therefore, we propose that the relaxed pressures indicated by the KR/KC ratio but not by KA/KS in the early middle stage of development may be important for the morphological divergence of mammals at the adult stage, while purifying selection detected by KA/KS occurs in the early middle developmental stage.

Introduction

Selection pressure on protein-coding sequences is commonly estimated by the ratio of the nonsynonymous substitution rate (KA) to the synonymous substitution rate (KS) (Li and Gojobori 1; Hughes and Nei 1). If the KA/KS ratio is higher than 1, positive selection is assumed to have occurred during the evolution of the sequence. The ratio of the radical replacement rate (KR) to the conservative replacement rate (KC) has also been used to detect positive selection (Hughes, Ota, and Nei ). The KR/KC ratio is useful for examining selection pressure in distantly related protein-coding sequences, because in this case the KA/KS ratio cannot be accurately estimated owing to saturation of KS (Gojobori 1; Smith and Smith 1).

Since there are two ways of inferring selection pressure on a sequence, an open question is whether the two approaches give the same conclusion. Zhang (000) and Smith (00) found, using mammalian and Drosophila genes, that KA/KS is correlated with KR/KC when the amino acid classification is based on polarity and volume. However, there are several types of amino acid classification, and it is not known which one gives a KR/KC measure that best correlates with the KA/KS ratio. Therefore, the degree of correlation between the two ratios in general is unknown. In the present study, we searched for an amino acid classification that gives the best correlation between the two ratios. Such a classification is useful because the KR/KC ratio based on it can identify, between distant protein-coding sequences, genes undergoing selection pressures similar to those inferred by the KA/KS ratio.

Another issue is that the two ratios are unlikely to be completely correlated even under the classification that gives the maximum correlation between them. To address the differences between the selection pressures inferred by KA/KS and KR/KC in the evolution of mammalian genes, we examined the functions of genes for which the two ratios indicated different selection pressures, using Gene Ontology (GO) categories and expression data of a representative mammal, the mouse.

Materials & Methods

Construction of orthologous groups

The orthologous gene groups in the five mammalian species were determined as follows. cDNA data of five mammalian species were retrieved from the Ensembl database: Homo sapiens (NCBI.may), Pan troglodytes (CHIMP1.may), Mus musculus (NCBIM.may), Rattus norvegicus (RGSC..may) and Canis familiaris (BROADD1.may). Reciprocal best hits between every combination of two species were identified with Blastp (Altschul et al. 1). Sequences that were reciprocal best hits among all species combinations (Fig. 1A) were considered an orthologous group among the five species. Putative orthologous groups were constructed according to this procedure. To further verify the orthologous groups, phylogenetic trees were constructed from the protein sequence alignments of the members of each orthologous group by the neighbor-joining (NJ) method (Saitou and Nei 1; Thompson, Higgins, and Gibson 1). When the topology differed from the species tree, the data set was removed from the orthologous data (Fig. 1B). The total number of orthologous groups was thereby reduced. For the numbers of nucleotide sites used in these orthologous groups, the interquartile range (25%-75%) and the median number of nucleotide sites are.0-.0 and.0, respectively. The orthologous gene data were carefully constructed to reduce errors in estimating nucleotide and amino acid substitutions. Only segments aligned among the five species without any gaps were used for the calculation of the KA/KS and KR/KC ratios.

Estimation of KA/KS and KR/KC in each orthologous gene set

A phylogenetic tree was reconstructed for each orthologous gene group by the NJ method (Saitou and Nei 1). The ancestral sequence was inferred at each node of the phylogenetic tree using the maximum likelihood method (Yang, Kumar, and Nei 1). The transition/transversion ratio was estimated for each orthologous group and was then used to estimate KA and KS for all branches of the phylogenetic tree by the modified Nei-Gojobori method (Zhang, Rosenberg, and Nei 1).
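The reciprocal-best-hit criterion used to construct the orthologous groups can be illustrated with a minimal Python sketch. It assumes that the best Blastp hit of every gene in every other species has already been tabulated; the function names, the `best_hit` dictionary layout, and the toy gene identifiers are hypothetical and are not taken from the original study.

```python
from itertools import combinations


def reciprocal_best_hits(best_ab, best_ba):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}


def orthologous_groups(species, best_hit):
    """Build groups with one gene per species in which every pair of
    members is a reciprocal best hit.

    species  : list of species names (the order fixes the tuple layout)
    best_hit : dict mapping (query_species, target_species) to a dict
               {query_gene: best_target_gene}  (hypothetical layout)
    """
    # Reciprocal-best-hit pairs for every unordered pair of species.
    rbh = {}
    for s, t in combinations(species, 2):
        pairs = reciprocal_best_hits(best_hit[(s, t)], best_hit[(t, s)])
        rbh[(s, t)] = pairs
        rbh[(t, s)] = {(b, a) for a, b in pairs}

    groups = []
    first = species[0]
    # Seed each candidate group from a gene of the first species.
    for gene in sorted(best_hit[(first, species[1])]):
        members = {first: gene}
        for other in species[1:]:
            partner = best_hit[(first, other)].get(gene)
            if partner is None or (gene, partner) not in rbh[(first, other)]:
                break
            members[other] = partner
        else:
            # Require the reciprocal relation for all remaining pairs too.
            if all((members[s], members[t]) in rbh[(s, t)]
                   for s, t in combinations(species, 2)):
                groups.append(tuple(members[s] for s in species))
    return groups


# Toy example with three species instead of five (made-up identifiers).
species = ["human", "mouse", "dog"]
best_hit = {
    ("human", "mouse"): {"h1": "m1", "h2": "m2"},
    ("mouse", "human"): {"m1": "h1", "m2": "h2"},
    ("human", "dog"): {"h1": "d1", "h2": "d2"},
    ("dog", "human"): {"d1": "h1", "d2": "h1"},  # d2 does not point back to h2
    ("mouse", "dog"): {"m1": "d1", "m2": "d2"},
    ("dog", "mouse"): {"d1": "m1", "d2": "m2"},
}
groups = orthologous_groups(species, best_hit)
print(groups)  # -> [('h1', 'm1', 'd1')]
```

Only the group in which every pairwise relation is reciprocal survives; the subsequent tree-topology check against the species tree is not part of this sketch.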
The sums of KA and KS over all branches were used to determine the KA/KS ratio of each orthologous gene group. Radical and conservative changes were defined by a classification (A) that gave the best correlation between KR/KC and KA/KS, and also by three previous classifications based on chemical properties: (B) polarity and volume, (C) charge and aromaticity, and (D) charge and polarity (Zhang 000; Hanada, Gojobori, and Li 00) (Table 1). These physicochemical properties (aromaticity, charge, polarity, and volume) are thought to be relevant for the evolution of proteins (Grantham 1; Miyata, Miyazawa, and Yasunaga 1). Based on the ancestral sequences inferred at all nodes of the phylogenetic tree of each orthologous group, KR and KC were estimated for all branches of the tree by the Zhang method (Zhang 000). The sums of the branch lengths reflecting KR and KC were used to determine the KR/KC ratio of each orthologous group. The average KA, KS, KR and KC for each branch of the species tree across the orthologous groups are given in Supplement A.

Construction of a new amino acid classification

To estimate the average KA/KS ratio for each type of amino acid replacement, we collected from the orthologous gene groups the amino acid replacements that had occurred. The average KA/KS ratio for a type of amino acid replacement is defined as the average KA/KS ratio of the orthologous gene groups in which that replacement was collected. The average KA/KS ratios were estimated for each of the kinds of amino acid replacement that can occur by a single nucleotide substitution. Since an amino acid replacement with a low (high) KA/KS ratio should tend to be a conservative (radical) change in the highly associated classification, radical and conservative scores were numbered for the types of amino acid replacement in descending (ascending) order of KA/KS (Supplement B). Using these scores, we calculated the totals of the radical and conservative scores for each amino acid classification. To find an amino acid classification that would give the maximum correlation between KR/KC and KA/KS, amino acids were classified into two to five groups in all possible combinations, and we identified the classification with the highest score. The new classification is regarded as the amino acid classification that more adequately characterizes the relationship between KA/KS and KR/KC.

Functional categories by Gene Ontology

Orthologous gene groups with the top and bottom % of KA/KS or KR/KC values were considered relaxed selection groups and purifying selection groups, respectively.
Under this classification, there are four possible combinations for the orthologous gene groups: (1) relaxed selection inferred by both KA/KS and KR/KC (a high KA/KS and a high KR/KC), (2) purifying selection inferred by both KA/KS and KR/KC (a low KA/KS and a low KR/KC), (3) relaxed selection inferred by KA/KS and purifying selection inferred by KR/KC (a high KA/KS and a low KR/KC), and (4) purifying selection inferred by KA/KS and relaxed selection inferred by KR/KC (a low KA/KS and a high KR/KC). Gene Ontology (GO) assignments for the mouse genes were obtained from the mouse genome database (Hill et al. 00). To simplify functional interpretation, we used the GO categories of biological processes from the top to the th depth in the hierarchy. The expected proportion of each GO category, based on all mouse genes, was compared by the chi-square test with the observed proportion of that GO category among the mouse genes of orthologous gene groups undergoing different selection pressures. When the observed proportion was significantly higher than the expected proportion in a given GO category (P<0.0), the hierarchical pathways from the root to the overrepresented GO category were drawn with the Graphviz software.

The expression pattern at a developmental stage

The mouse expression dataset covering various stages of mouse development (Ringwald et al. 001) was used to determine the relationship between gene expression and the nature of selection pressure as determined by the KA/KS and KR/KC measures. Among the different selection pressures, we compared the expression bias of genes at a developmental stage by the following equation:

$$R = \frac{N_{ob.}}{N_{ex.}} = \frac{N_{ob.}}{P_{all} \times N_{selected}}$$

For a particular developmental stage, Nob. and Nex. are the observed and expected numbers of expressed genes that experienced purifying or relaxed selection pressure, Pall is the proportion of all mouse genes expressed at that developmental stage, and Nselected is the total number of genes undergoing each of the four types of selection pressure. Nex. was calculated by multiplying Pall by Nselected.

Results

A new classification of amino acids

To find a new classification that yields the maximum correlation between KA/KS and KR/KC, we first constructed all possible combinations in which the 20 amino acids can be classified into two to five groups. Second, a table of the average KA/KS ratio for each type of amino acid replacement was constructed to see which kinds of amino acid replacement more adequately characterize the KA/KS ratio (Supplement B).
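As a small illustration of the expression-bias ratio R defined in Materials & Methods (the observed over the expected number of expressed genes, with the expected number given by Pall multiplied by Nselected), the calculation can be sketched in Python. The function name and the numbers in the example are hypothetical and are not data from the study.

```python
def expression_bias(n_observed, p_all, n_selected):
    """Return R = N_ob / N_ex, where N_ex = P_all * N_selected.

    n_observed : genes in the selection class expressed at the stage (N_ob)
    p_all      : proportion of all mouse genes expressed at the stage (P_all)
    n_selected : total number of genes in the selection class (N_selected)
    """
    if not 0.0 < p_all <= 1.0:
        raise ValueError("p_all must be a proportion in (0, 1]")
    if n_selected <= 0:
        raise ValueError("n_selected must be positive")
    n_expected = p_all * n_selected
    return n_observed / n_expected


# Made-up numbers: 25% of all genes are expressed at the stage, the
# selection class holds 200 genes, and 75 of them are expressed there.
r = expression_bias(n_observed=75, p_all=0.25, n_selected=200)
print(r)  # -> 1.5
```

A value above 1 means that genes of the selection class are expressed at that stage more often than expected from all mouse genes, and a value below 1 means less often.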
Based on the table, a new classification of amino acids with a higher correlation between the KA/KS ratio and radical or conservative change was constructed (Classification A in Table 1). In the new classification, amino acids are classified by charge into basic, acidic and neutral. The aromatic amino acids belong to the basic group because one of the aromatic amino acids has a basic charge. The amino acids with neutral charge are divided into distinct groups of small and large volume. Consequently, the new classification appears to reflect the chemical properties of charge, aromaticity and volume.

Correlation between KR/KC and KA/KS

Using the three existing amino acid classifications and our new classification, we estimated four KR/KC ratios for each orthologous gene group. The four KR/KC ratios were significantly positively correlated with each other (P < 0.01) (Table ). The correlation coefficient between KR/KC and KA/KS under the new classification (A, r=0. Table ) was expected to be the highest among the four chemical classifications, because the new classification (A) was constructed from the chemical properties associated with the KA/KS ratio. In fact, the correlation coefficient between KA/KS and KR/KC based on the new classification is significantly higher than those based on the other three classifications (P < 0.01), though the other three KR/KC ratios are also each positively correlated with the KA/KS ratio (P < 0.01) (Fig.). However, even under the new classification, which gives the highest correlation between the two ratios, the correlation coefficient is less than 0., indicating that the selective pressures inferred by the KR/KC ratio and by the KA/KS ratio differ substantially. In particular, there are many orthologous gene groups with a low KA/KS and a high KR/KC ratio (Fig. ). These orthologous gene groups have likely undergone relaxed selection on radical amino acid substitutions, as indicated by the KR/KC ratio, but purifying selection on nonsynonymous changes, as indicated by the KA/KS ratio.
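The comparison of classifications by their correlation with KA/KS can be sketched as follows. This is a minimal illustration that assumes the reported coefficient r is a Pearson correlation coefficient, which the text does not state explicitly; the per-group ratio values are invented for the example and the significance test for the difference between coefficients is not covered.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical KA/KS values for four orthologous groups, and KR/KC values
# for the same groups under two different amino acid classifications.
ka_ks = [0.1, 0.2, 0.3, 0.4]
kr_kc_by_classification = {
    "A": [0.2, 0.4, 0.6, 0.8],
    "B": [0.4, 0.2, 0.8, 0.6],
}

correlations = {name: pearson_r(ka_ks, values)
                for name, values in kr_kc_by_classification.items()}
best = max(correlations, key=correlations.get)
print(best, correlations)
```

In this toy input, classification "A" tracks KA/KS exactly and "B" only partly, so "A" is selected as the classification with the highest correlation.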
Overrepresented functional categories undergoing opposite selection pressures inferred by the two ratios

The orthologous gene groups experienced four types of selection pressure. The numbers of orthologous gene groups that experienced relaxed or purifying selection according to the two ratios are shown in Table and the gene lists are given in Supplement C. Since KA/KS was on the whole positively correlated with KR/KC in mammals, more groups underwent the same selection pressure according to both ratios than underwent opposite selection pressures. Groups with opposite selection pressures were found only for the combination of a high KR/KC and a low KA/KS ratio.
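The assignment of orthologous gene groups to the four combinations of selection pressure can be sketched as below. The percentage cutoff used in the study is not legible in this text, so `fraction` is left as a parameter and the value 0.2 in the example is purely an assumption; the group identifiers and ratio values are likewise invented.

```python
def quantile_cutoffs(values, fraction):
    """Return (low_cut, high_cut): values <= low_cut form the bottom
    fraction and values >= high_cut form the top fraction."""
    if not 0.0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    ordered = sorted(values)
    k = max(1, round(len(ordered) * fraction))
    return ordered[k - 1], ordered[-k]


def selection_label(value, low_cut, high_cut):
    """Label a ratio as 'relaxed' (top), 'purifying' (bottom) or None."""
    if value >= high_cut:
        return "relaxed"
    if value <= low_cut:
        return "purifying"
    return None


def classify_groups(ratios, fraction):
    """ratios maps group id -> (KA/KS, KR/KC). Returns a dict keyed by
    (KA/KS label, KR/KC label) listing the groups in each combination."""
    ka_low, ka_high = quantile_cutoffs([ka for ka, _ in ratios.values()], fraction)
    kr_low, kr_high = quantile_cutoffs([kr for _, kr in ratios.values()], fraction)
    labels = ("relaxed", "purifying")
    combos = {(a, r): [] for a in labels for r in labels}
    for group_id, (ka_ks, kr_kc) in ratios.items():
        ka_label = selection_label(ka_ks, ka_low, ka_high)
        kr_label = selection_label(kr_kc, kr_low, kr_high)
        if ka_label and kr_label:  # skip groups outside both tails
            combos[(ka_label, kr_label)].append(group_id)
    return combos


# Ten made-up orthologous groups: (KA/KS, KR/KC).
ratios = {
    "g1": (0.05, 0.90), "g2": (0.06, 0.10), "g3": (0.20, 0.40),
    "g4": (0.25, 0.45), "g5": (0.30, 0.50), "g6": (0.35, 0.55),
    "g7": (0.40, 0.60), "g8": (0.45, 0.15), "g9": (0.80, 0.95),
    "g10": (0.90, 0.65),
}
combos = classify_groups(ratios, fraction=0.2)
for combination, members in combos.items():
    print(combination, members)
```

With these toy values, g1 falls in the combination of purifying selection by KA/KS and relaxed selection by KR/KC, which is the combination examined further in the following sections.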

To assess the functions of the groups that underwent different selection pressures, we examined significantly overrepresented Gene Ontology (GO) categories of mouse genes in the orthologous gene groups subject to each type of selection pressure (Fig., Supplement D). The overrepresented functions of genes with a high KR/KC and a high KA/KS ratio are related to "response to stimulus" and "physiological process". In particular, several functions related to defense response are clearly found in these genes. Since genes related to defense response are generally accepted to be under positive selection, these results seem biologically reasonable. On the other hand, the overrepresented functions of genes with a low KA/KS ratio are related to development. This result is also reasonable, because most genes related to development are subject to purifying selection based on the KA/KS ratio between distantly related species (Powell et al. 1; Slack, Holland, and Graham 1). However, it is unclear whether this holds true if the KR/KC ratio is used to evaluate the selection pressure on genes related to development. Among genes with a low KA/KS ratio, sex determination is overrepresented in genes with a high KR/KC ratio and cell differentiation in genes with a low KR/KC ratio (Fig. ). Sex determination is likely conserved among mammals but cell differentiation may be required to be somewhat different among mammals for the divergent evolution seen in mammals. Thus, it is possible that the relaxed selection pressures indicated by the KR/KC ratio are one of the important factors in the evolution of mammals. To further examine the different gene functions between the high and low KR/KC ratios in mammalian development, we examined the expression of mouse genes with different selection pressures using the mouse expression dataset covering various stages of development (Fig. A, B).
Genes subject to purifying selection based on both ratios are expressed at high levels at the early developmental stages (One cell egg to Blastocyst). On the other hand, genes subject to purifying selection indicated by KA/KS but relaxed selection indicated by KR/KC were expressed predominantly in the early middle stage of development (Blastocyst to Amnion). The relaxed pressures indicated solely by the KR/KC ratio in the early middle stage of development may be important for the divergent evolution of mammals.

Discussion

The key finding of the present study is that a positive correlation between KA/KS and KR/KC at a genomic scale is observed under all amino acid classifications, indicating that the two tests of selection pressure give similar conclusions in mammalian evolution. In particular, the KR/KC ratio of the new classification is useful for estimating selection pressure between distantly related sequences (Gojobori 1; Smith and Smith 1). Since the evolutionary rate of synonymous substitution is much faster than that of nonsynonymous substitution, KS is often saturated between distant sequences. The KR/KC ratio, by contrast, is estimated from amino acid replacements only, and the evolutionary rate of amino acid replacement is much slower than that of synonymous substitution, so the KR/KC ratio can be estimated for distant sequences. Thus, the new classification (A) can produce a useful KR/KC ratio for estimating the selection pressure in distant sequences.

It should be noted that several reports classified amino acid replacements into radical and conservative changes by the likelihood of amino acid replacements and estimated selection pressures from such radical and conservative changes (Tang et al. 00; Gojobori et al. 00). In the present study, by contrast, we defined radical and conservative changes by the likelihoods of nonsynonymous and synonymous substitutions. Therefore, the selection pressures inferred from radical and conservative changes under our definition should be more similar to the selection pressures inferred by the KA/KS ratio.

However, a major limitation in substituting KR/KC for KA/KS is that, even when we used the new classification aimed at maximizing the correlation between KR/KC and KA/KS, the correlation between the two ratios is still less than 0.. There are potentially two reasons why the two ratios are not highly correlated. One reason is biological. For some genes, KR/KC may not be related to the type of natural selection identified by KA/KS.
The other reason is technical. In the computation of the KR/KC ratio, radical and conservative changes were defined as amino acid replacements between groups and within groups, respectively. Because each replacement is scored as either radical or conservative (always 0 or 1), the KR/KC ratio may not fully represent the selection pressure on amino acid replacements.

We note that there are many orthologous gene groups with a low KA/KS and a high KR/KC as outliers. To address these opposite selection pressures, we examined the functions of the mouse genes and found that functional categories related to development were overrepresented in these genes. We then examined the expression patterns of these genes at different developmental stages. The mouse genes that underwent such selection pressures tend to be over-expressed in the early middle developmental stages. Richardson (1) proposed that the early middle developmental stages were important for the speciation of mammals, because these are the stages when many adult traits are specified, even though these stages are conservative at the morphological level. Therefore, we propose that the relaxed selection pressures indicated by KR/KC but not by KA/KS in the early middle developmental stages may be important for the morphological divergence of mammals at the adult stage, while purifying selection detected by KA/KS tends to occur in the early middle developmental stages. The differences in the selection pressures assessed by KA/KS and KR/KC indicate that, although genes involved in development are under strong constraints on amino acid substitutions, radical changes among the substitutions that are permitted are likely important for the developmental divergence of adult mammals. Thus, opposite selection pressures indicated by the two measures might play an important role in the evolution of genes related to development in mammals.

In summary, we inferred orthologous gene groups in mammalian species in a stringent manner. KR/KC is positively correlated with KA/KS. The correlation was observed under each of four chemical classifications taking account of aromaticity, charge, polarity or volume. In particular, the chemical classification based on aromaticity, charge and volume led to the highest correlation between the two ratios. Moreover, genes expressed at a high level in the early middle developmental stages were over-represented among the genes with a high KR/KC but a low KA/KS. The selection pressures at these developmental stages may be important for the morphological diversification of mammals.

Acknowledgements

We thank the members of our laboratories for valuable comments and discussion. This study was supported by NIH grant (GM0) to W.-H. L. and an NSF grant (DBI-01) to S.-H. S.

Table 1. Four classifications of amino acids.

Classification A: by the maximum correlation with the KA/KS ratio
    Neutral & small (MW*: -1)                                      A N C G P S T
    Neutral & large (MW*: 1-0)                                     I L M V
    Basic acid, Aromaticity & Relatively small (MW*: -1)           R Q H K F W Y
    Acidic charge & Relatively large (MW*: 1-1)                    D E

Classification B: by polarity & volume
    Special                                                        C
    Neutral and Small                                              A G P S T
    Polar & relatively small                                       N D Q E
    Polar & relatively large                                       R H K
    Nonpolar & relatively small                                    I L M V
    Nonpolar & relatively large                                    F W Y

Classification C: by charge & aromaticity
    Acidic                                                         D E
    Neutral & No aromaticity                                       Q A V L I C S T N G P M
    Neutral & Aromaticity                                          F Y W
    Basic                                                          K R H

Classification D: by charge & polarity
    Neutral & Polarity                                             S T Y C N Q
    Acidic & Polarity                                              D E
    Basic & Polarity                                               K R H
    No polarity                                                    G A V L I F P M W

*MW: Molecular weight
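To make the use of Table 1 concrete, the following Python sketch labels each amino acid replacement between an ancestral and a descendant sequence as radical (between groups) or conservative (within a group) under Classification A. It only counts changes on toy sequences; the rate estimation by the Zhang method, which also requires the numbers of radical and conservative sites, is not reproduced here. The group keys and function names are hypothetical labels chosen for this example.

```python
# Classification A from Table 1, one-letter amino acid codes.
CLASSIFICATION_A = {
    "neutral_small": "ANCGPST",
    "neutral_large": "ILMV",
    "basic_aromatic": "RQHKFWY",
    "acidic": "DE",
}


def build_lookup(classification):
    """Map each amino acid to the name of its group."""
    lookup = {}
    for group, residues in classification.items():
        for residue in residues:
            if residue in lookup:
                raise ValueError(f"{residue} is assigned to two groups")
            lookup[residue] = group
    return lookup


def count_changes(ancestor, descendant, lookup):
    """Count (radical, conservative) replacements between two aligned,
    gap-free protein sequences of equal length."""
    if len(ancestor) != len(descendant):
        raise ValueError("sequences must be aligned and of equal length")
    radical = conservative = 0
    for a, b in zip(ancestor, descendant):
        if a == b:
            continue  # no replacement at this site
        if lookup[a] != lookup[b]:
            radical += 1       # replacement between groups
        else:
            conservative += 1  # replacement within a group
    return radical, conservative


lookup_a = build_lookup(CLASSIFICATION_A)

# Toy sequences: A->G is within a group, D->K and L->A are between groups.
radical, conservative = count_changes("ADKLF", "GKKAF", lookup_a)
print(radical, conservative)  # -> 2 1
```

Swapping in the group definitions of Classification B, C or D changes only the dictionary, which is why the same replacements can be radical under one classification and conservative under another.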

Table. Correlation coefficient between KR/KC and KA/KS.

Columns: KR/KC (Classification B); KR/KC (Classification C); KR/KC (Classification D); KA/KS
Rows: KR/KC (Classification A); KR/KC (Classification B); KR/KC (Classification C); KR/KC (Classification D)

Table. The number of orthologous groups undergoing different selection pressures.

Columns: Orthologous groups under relaxed selection indicated by KR/KC ( % top of KR/KC ratio); Orthologous groups under purifying selection indicated by KR/KC ( % bottom of KR/KC ratio)
Rows: Orthologous groups under relaxed selection indicated by KA/KS ( % top of KA/KS ratio); Orthologous groups under purifying selection indicated by KA/KS ( % bottom of KA/KS ratio)

Literature Cited

Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res :-0.
Gojobori, J., H. Tang, J. M. Akey, and C. I. Wu. 00. Adaptive evolution in humans revealed by the negative correlation between the polymorphism and fixation phases of evolution. Proc Natl Acad Sci U S A :0-1.
Gojobori, T. 1. Codon substitution in evolution and the "saturation" of synonymous changes. Genetics :-.
Grantham, R. 1. Amino acid difference formula to help explain protein evolution. Science 1:-.
Hanada, K., T. Gojobori, and W. H. Li. 00. Radical amino acid change versus positive selection in the evolution of viral envelope proteins. Gene :-.
Hill, D. P., J. A. Blake, J. E. Richardson, and M. Ringwald. 00. Extension and integration of the gene ontology (GO): combining GO vocabularies with external vocabularies. Genome Res 1:1-11.
Hughes, A. L., and M. Nei. 1. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature :1-.
Hughes, A. L., T. Ota, and M. Nei. Positive Darwinian selection promotes charge profile diversity in the antigen-binding cleft of class I major-histocompatibility-complex molecules. Mol Biol Evol :1-.
Li, W. H., and T. Gojobori. 1. Rapid evolution of goat and sheep globin genes following gene duplication. Mol Biol Evol 1:-.
Miyata, T., S. Miyazawa, and T. Yasunaga. 1. Two types of amino acid substitutions in protein evolution. J Mol Evol 1:1-.
Powell, J. R., A. Caccone, J. M. Gleason, and L. Nigro. 1. Rates of DNA evolution in Drosophila depend on function and developmental stage of expression. Genetics 1:1-.
Ringwald, M., J. T. Eppig, D. A. Begley, J. P. Corradi, I. J. McCright, T. F. Hayamizu, D. P. Hill, J. A. Kadin, and J. E. Richardson. The Mouse Gene Expression Database (GXD). Nucleic Acids Res :-1.
Saitou, N., and M. Nei. 1. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol :0-.
Slack, J. M., P. W. Holland, and C. F. Graham. 1. The zootype and the phylotypic stage. Nature 1:0-.
Smith, J. M., and N. H. Smith. 1. Synonymous nucleotide divergence: what is "saturation"? Genetics 1:-.
Smith, N. G. 00. Are radical and conservative substitution rates useful statistics in molecular evolution? J Mol Evol :-.
Tang, H., G. J. Wyckoff, J. Lu, and C. I. Wu. 00. A universal evolutionary index for amino acid changes. Mol Biol Evol 1:1-1.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res :-0.
Yang, Z., S. Kumar, and M. Nei. 1. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics :-.
Zhang, J. Rates of conservative and radical nonsynonymous nucleotide substitutions in mammalian nuclear genes. J Mol Evol 0:-.
Zhang, J., H. F. Rosenberg, and M. Nei. 1. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci U S A :0-1.

Figure legends

Fig. 1. Construction of ortholog data. The similarity search was conducted by Blastp as in Fig. 1A. Reciprocal best hits were identified between every pair of species. The number of reciprocal best hits is shown between each pair of species. When sequences were reciprocal best hits among the five species, they were considered an orthologous gene among the five species. A phylogeny was then generated for each orthologous gene group. When the phylogeny of the orthologs from the five species differed from the topology of the species phylogeny, the putative ortholog was removed from the ortholog data. The species phylogeny is shown in Fig. 1B.

Fig.. Correlation between KA/KS and KR/KC. The X-axis is the KA/KS ratio and the Y-axis is the KR/KC ratio. The ratios were computed based on classification A (r=0.) (A); classification B (r=0.) (B); classification C (0.) (C); and classification D (r=0.) (D).

Fig.. Overrepresented functions in genes with a low KA/KS and a high KR/KC ratio and in genes with a low KA/KS and a low KR/KC ratio. The arrowheads point to subcategories. (A) Categories overrepresented in genes with a low KA/KS and a high KR/KC are in black circles (P < 0.0). (B) Categories overrepresented in genes with a low KA/KS and a low KR/KC are in black circles (P < 0.0).

Fig.. Expression levels of genes with different selection pressures in each developmental stage. (A) The X-axis indicates the developmental stage. The stages, in order, are as follows: One cell egg; Beginning of cell division; Morula; Advanced division/segmentation; Blastocyst; Implantation; Formation of egg cylinder; Differentiation of egg cylinder; Advanced endometrial reaction, prestreak; Amnion, midstreak; Neural plate, presomite, no allantoic bud; First somites, late head fold; Turning; Formation & closure anterior neuropore; Formation of posterior neuropore, forelimb bud; Closure post. neuropore, hindlimb & tail bud; Deep lens indentation; Closure lens vesicle; Complete separation of lens vesicle; Earliest sign of fingers; Anterior footplate indented, marked pinna; Fingers separate distally; Toes separate; Reposition of umbilical hernia; Fingers and toes joined together; Long whiskers; and Postnatal development. The Y-axis indicates the normalized difference of expressed genes between genes undergoing a selection pressure and all genes. (B) The sliding window analysis was conducted based on (A). The X-axis is the mean of normalized difference in five developmental stages. The Y-axis indicates the average normalized difference in each window.

[FIG 1. Panel A: reciprocal best hits between pairs of species (Human, Chimpanzee, Mouse, Rat, Dog). Panel B: species phylogeny of Human, Chimpanzee, Mouse, Rat and Dog.]

[FIG. Four scatter plots, panels A to D: KR/KC ratio (Classifications A, B, C and D) on the Y-axis against the KA/KS ratio on the X-axis.]

[FIG. Two GO hierarchy graphs. Node labels of the first graph: biological_process, development, embryonic_development, embryonic_development (sensu_metazoa), pattern_specification, axis_specification, anterior/posterior pattern_formation, cellular_process, cell_differentiation, epidermal_cell_differentiation, regulation_of_biological process, regulation_of_development, regulation_of_epidermis development, regulation_of_binding. Node labels of the second graph: biological_process, development, sex_determination, male_sex_determination, pattern_specification, axis_specification, growth, developmental_growth, blastocyst_growth, response_to_stimulus, behavior, visual_behavior.]

[FIG. Panels A and B. Curves: genes under purifying selection indicated by both KA/KS and KR/KC; genes under purifying selection indicated by KA/KS but relaxed selection indicated by KR/KC; genes under relaxed selection indicated by both KA/KS and KR/KC.]

Supplementary Information for Hurst et al.: Causes of trends of amino acid gain and loss Supplementary Information for Hurst et al.: Causes of trends of amino acid gain and loss Methods Identification of orthologues, alignment and evolutionary distances A preliminary set of orthologues was

More information

Bio 1B Lecture Outline (please print and bring along) Fall, 2007

Bio 1B Lecture Outline (please print and bring along) Fall, 2007 Bio 1B Lecture Outline (please print and bring along) Fall, 2007 B.D. Mishler, Dept. of Integrative Biology 2-6810, bmishler@berkeley.edu Evolution lecture #5 -- Molecular genetics and molecular evolution

More information

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

More information

POPULATION GENETICS Biology 107/207L Winter 2005 Lab 5. Testing for positive Darwinian selection

POPULATION GENETICS Biology 107/207L Winter 2005 Lab 5. Testing for positive Darwinian selection POPULATION GENETICS Biology 107/207L Winter 2005 Lab 5. Testing for positive Darwinian selection A growing number of statistical approaches have been developed to detect natural selection at the DNA sequence

More information

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot

More information

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly Comparative Genomics: Human versus chimpanzee 1. Introduction The chimpanzee is the closest living relative to humans. The two species are nearly identical in DNA sequence (>98% identity), yet vastly different

More information

7. Tests for selection

7. Tests for selection Sequence analysis and genomics 7. Tests for selection Dr. Katja Nowick Group leader TFome and Transcriptome Evolution Bioinformatics group Paul-Flechsig-Institute for Brain Research www. nowicklab.info

More information

C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution. Exercises #3: Phylogenetics C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

More information

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega BLAST Multiple Sequence Alignments: Clustal Omega What does basic BLAST do (e.g. what is input sequence and how does BLAST look for matches?) Susan Parrish McDaniel College Multiple Sequence Alignments

More information

Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution

Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution Likelihood Ratio Tests for Detecting Positive Selection and Application to Primate Lysozyme Evolution Ziheng Yang Department of Biology, University College, London An excess of nonsynonymous substitutions

More information

Processes of Evolution

Processes of Evolution 15 Processes of Evolution Forces of Evolution Concept 15.4 Selection Can Be Stabilizing, Directional, or Disruptive Natural selection can act on quantitative traits in three ways: Stabilizing selection

More information

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

CHAPTERS 24-25: Evidence for Evolution and Phylogeny CHAPTERS 24-25: Evidence for Evolution and Phylogeny 1. For each of the following, indicate how it is used as evidence of evolution by natural selection or shown as an evolutionary trend: a. Paleontology

More information

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types

THEORY. Based on sequence Length According to the length of sequence being compared it is of following two types Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between

More information

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics

POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the

More information

Sequence Database Search Techniques I: Blast and PatternHunter tools

Sequence Database Search Techniques I: Blast and PatternHunter tools Sequence Database Search Techniques I: Blast and PatternHunter tools Zhang Louxin National University of Singapore Outline. Database search 2. BLAST (and filtration technique) 3. PatternHunter (empowered

More information

Application of new distance matrix to phylogenetic tree construction

Application of new distance matrix to phylogenetic tree construction Application of new distance matrix to phylogenetic tree construction P.V.Lakshmi Computer Science & Engg Dept GITAM Institute of Technology GITAM University Andhra Pradesh India Allam Appa Rao Jawaharlal

More information

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

More information

Letter to the Editor. Department of Biology, Arizona State University

Letter to the Editor. Department of Biology, Arizona State University Letter to the Editor Traditional Phylogenetic Reconstruction Methods Reconstruct Shallow and Deep Evolutionary Relationships Equally Well Michael S. Rosenberg and Sudhir Kumar Department of Biology, Arizona

More information

Variances of the Average Numbers of Nucleotide Substitutions Within and Between Populations

Variances of the Average Numbers of Nucleotide Substitutions Within and Between Populations Variances of the Average Numbers of Nucleotide Substitutions Within and Between Populations Masatoshi Nei and Li Jin Center for Demographic and Population Genetics, Graduate School of Biomedical Sciences,

More information

Molecular Coevolution of the Vertebrate Cytochrome c 1 and Rieske Iron Sulfur Protein in the Cytochrome bc 1 Complex

Molecular Coevolution of the Vertebrate Cytochrome c 1 and Rieske Iron Sulfur Protein in the Cytochrome bc 1 Complex Molecular Coevolution of the Vertebrate Cytochrome c 1 and Rieske Iron Sulfur Protein in the Cytochrome bc 1 Complex Kimberly Baer *, David McClellan Department of Integrative Biology, Brigham Young University,

More information

Phylogeny and the Tree of Life

Phylogeny and the Tree of Life Chapter 26 Phylogeny and the Tree of Life PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Information #

Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Information # Bustamante et al., Supplementary Nature Manuscript # 1 out of 9 Details of PRF Methodology In the Poisson Random Field PRF) model, it is assumed that non-synonymous mutations at a given gene are either

More information

MOLECULAR PHYLOGENY AND GENETIC DIVERSITY ANALYSIS. Masatoshi Nei"

MOLECULAR PHYLOGENY AND GENETIC DIVERSITY ANALYSIS. Masatoshi Nei MOLECULAR PHYLOGENY AND GENETIC DIVERSITY ANALYSIS Masatoshi Nei" Abstract: Phylogenetic trees: Recent advances in statistical methods for phylogenetic reconstruction and genetic diversity analysis were

More information

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9

InDel 3-5. InDel 8-9. InDel 3-5. InDel 8-9. InDel InDel 8-9 Lecture 5 Alignment I. Introduction. For sequence data, the process of generating an alignment establishes positional homologies; that is, alignment provides the identification of homologous phylogenetic

More information

BLAST. Varieties of BLAST

BLAST. Varieties of BLAST BLAST Basic Local Alignment Search Tool (1990) Altschul, Gish, Miller, Myers, & Lipman Uses short-cuts or heuristics to improve search speed Like speed-reading, does not examine every nucleotide of database

More information

Sequence Alignment Techniques and Their Uses

Sequence Alignment Techniques and Their Uses Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this

More information

Taming the Beast Workshop

Taming the Beast Workshop Workshop and Chi Zhang June 28, 2016 1 / 19 Species tree Species tree the phylogeny representing the relationships among a group of species Figure adapted from [Rogers and Gibbs, 2014] Gene tree the phylogeny

More information

Chapter 16: Reconstructing and Using Phylogenies

Chapter 16: Reconstructing and Using Phylogenies Chapter Review 1. Use the phylogenetic tree shown at the right to complete the following. a. Explain how many clades are indicated: Three: (1) chimpanzee/human, (2) chimpanzee/ human/gorilla, and (3)chimpanzee/human/

More information

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF EVOLUTION Evolution is a process through which variation in individuals makes it more likely for them to survive and reproduce There are principles to the theory

More information

Graph Alignment and Biological Networks

Graph Alignment and Biological Networks Graph Alignment and Biological Networks Johannes Berg http://www.uni-koeln.de/ berg Institute for Theoretical Physics University of Cologne Germany p.1/12 Networks in molecular biology New large-scale

More information

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task.

METHODS FOR DETERMINING PHYLOGENY. In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Chapter 12 (Strikberger) Molecular Phylogenies and Evolution METHODS FOR DETERMINING PHYLOGENY In Chapter 11, we discovered that classifying organisms into groups was, and still is, a difficult task. Modern

More information

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

More information

Proceedings of the SMBE Tri-National Young Investigators Workshop 2005

Proceedings of the SMBE Tri-National Young Investigators Workshop 2005 Proceedings of the SMBE Tri-National Young Investigators Workshop 25 Control of the False Discovery Rate Applied to the Detection of Positively Selected Amino Acid Sites Stéphane Guindon,* Mik Black,*à

More information

Letter to the Editor. Temperature Hypotheses. David P. Mindell, Alec Knight,? Christine Baer,$ and Christopher J. Huddlestons

Letter to the Editor. Temperature Hypotheses. David P. Mindell, Alec Knight,? Christine Baer,$ and Christopher J. Huddlestons Letter to the Editor Slow Rates of Molecular Evolution Temperature Hypotheses in Birds and the Metabolic Rate and Body David P. Mindell, Alec Knight,? Christine Baer,$ and Christopher J. Huddlestons *Department

More information

Basic Local Alignment Search Tool

Basic Local Alignment Search Tool Basic Local Alignment Search Tool Alignments used to uncover homologies between sequences combined with phylogenetic studies o can determine orthologous and paralogous relationships Local Alignment uses

More information

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other?

Phylogeny and systematics. Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Why are these disciplines important in evolutionary biology and how are they related to each other? Phylogeny and systematics Phylogeny: the evolutionary history of a species

More information

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26

Phylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26 Phylogeny Chapter 26 Taxonomy Taxonomy: ordered division of organisms into categories based on a set of characteristics used to assess similarities and differences Carolus Linnaeus developed binomial nomenclature,

More information

PHYLOGENY AND SYSTEMATICS

PHYLOGENY AND SYSTEMATICS AP BIOLOGY EVOLUTION/HEREDITY UNIT Unit 1 Part 11 Chapter 26 Activity #15 NAME DATE PERIOD PHYLOGENY AND SYSTEMATICS PHYLOGENY Evolutionary history of species or group of related species SYSTEMATICS Study

More information

SEQUENCE DIVERGENCE,FUNCTIONAL CONSTRAINT, AND SELECTION IN PROTEIN EVOLUTION

SEQUENCE DIVERGENCE,FUNCTIONAL CONSTRAINT, AND SELECTION IN PROTEIN EVOLUTION Annu. Rev. Genomics Hum. Genet. 2003. 4:213 35 doi: 10.1146/annurev.genom.4.020303.162528 Copyright c 2003 by Annual Reviews. All rights reserved First published online as a Review in Advance on June 4,

More information

Single alignment: Substitution Matrix. 16 march 2017

Single alignment: Substitution Matrix. 16 march 2017 Single alignment: Substitution Matrix 16 march 2017 BLOSUM Matrix BLOSUM Matrix [2] (Blocks Amino Acid Substitution Matrices ) It is based on the amino acids substitutions observed in ~2000 conserved block

More information

Tools and Algorithms in Bioinformatics

Tools and Algorithms in Bioinformatics Tools and Algorithms in Bioinformatics GCBA815, Fall 2015 Week-4 BLAST Algorithm Continued Multiple Sequence Alignment Babu Guda, Ph.D. Department of Genetics, Cell Biology & Anatomy Bioinformatics and

More information

BLAST Database Searching. BME 110: CompBio Tools Todd Lowe April 8, 2010

BLAST Database Searching. BME 110: CompBio Tools Todd Lowe April 8, 2010 BLAST Database Searching BME 110: CompBio Tools Todd Lowe April 8, 2010 Admin Reading: Read chapter 7, and the NCBI Blast Guide and tutorial http://www.ncbi.nlm.nih.gov/blast/why.shtml Read Chapter 8 for

More information

Comparing Genomes! Homologies and Families! Sequence Alignments!

Comparing Genomes! Homologies and Families! Sequence Alignments! Comparing Genomes! Homologies and Families! Sequence Alignments! Allows us to achieve a greater understanding of vertebrate evolution! Tells us what is common and what is unique between different species

More information

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME MATHEMATICAL MODELING AND THE HUMAN GENOME Hilary S. Booth Australian National University, Australia Keywords: Human genome, DNA, bioinformatics, sequence analysis, evolution. Contents 1. Introduction:

More information

Phylogenetic Tree Reconstruction

Phylogenetic Tree Reconstruction I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven

More information

From DNA to Diversity

From DNA to Diversity From DNA to Diversity Molecular Genetics and the Evolution of Animal Design Sean B. Carroll Jennifer K. Grenier Scott D. Weatherbee Howard Hughes Medical Institute and University of Wisconsin Madison,

More information

Temporal Trails of Natural Selection in Human Mitogenomes. Author. Published. Journal Title DOI. Copyright Statement.

Temporal Trails of Natural Selection in Human Mitogenomes. Author. Published. Journal Title DOI. Copyright Statement. Temporal Trails of Natural Selection in Human Mitogenomes Author Sankarasubramanian, Sankar Published 2009 Journal Title Molecular Biology and Evolution DOI https://doi.org/10.1093/molbev/msp005 Copyright

More information

FUNDAMENTALS OF MOLECULAR EVOLUTION

FUNDAMENTALS OF MOLECULAR EVOLUTION FUNDAMENTALS OF MOLECULAR EVOLUTION Second Edition Dan Graur TELAVIV UNIVERSITY Wen-Hsiung Li UNIVERSITY OF CHICAGO SINAUER ASSOCIATES, INC., Publishers Sunderland, Massachusetts Contents Preface xiii

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S1 (box). Supplementary Methods description. Prokaryotic Genome Database Archaeal and bacterial genome sequences were downloaded from the NCBI FTP site (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/)

More information

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor

Biological Networks: Comparison, Conservation, and Evolution via Relative Description Length By: Tamir Tuller & Benny Chor Biological Networks:,, and via Relative Description Length By: Tamir Tuller & Benny Chor Presented by: Noga Grebla Content of the presentation Presenting the goals of the research Reviewing basic terms

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Phylogeny: building the tree of life

Phylogeny: building the tree of life Phylogeny: building the tree of life Dr. Fayyaz ul Amir Afsar Minhas Department of Computer and Information Sciences Pakistan Institute of Engineering & Applied Sciences PO Nilore, Islamabad, Pakistan

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

BIOINFORMATICS LAB AP BIOLOGY

BIOINFORMATICS LAB AP BIOLOGY BIOINFORMATICS LAB AP BIOLOGY Bioinformatics is the science of collecting and analyzing complex biological data. Bioinformatics combines computer science, statistics and biology to allow scientists to

More information

BINF6201/8201. Molecular phylogenetic methods

BINF6201/8201. Molecular phylogenetic methods BINF60/80 Molecular phylogenetic methods 0-7-06 Phylogenetics Ø According to the evolutionary theory, all life forms on this planet are related to one another by descent. Ø Traditionally, phylogenetics

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S3 (box) Methods Methods Genome weighting The currently available collection of archaeal and bacterial genomes has a highly biased distribution of isolates across taxa. For example,

More information

Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law

Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law Ze Zhang,* Z. W. Luo,* Hirohisa Kishino,à and Mike J. Kearsey *School of Biosciences, University of Birmingham,

More information

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline

Phylogenetics. Applications of phylogenetics. Unrooted networks vs. rooted trees. Outline Phylogenetics Todd Vision iology 522 March 26, 2007 pplications of phylogenetics Studying organismal or biogeographic history Systematics ating events in the fossil record onservation biology Studying

More information

Concepts and Methods in Molecular Divergence Time Estimation

Concepts and Methods in Molecular Divergence Time Estimation Concepts and Methods in Molecular Divergence Time Estimation 26 November 2012 Prashant P. Sharma American Museum of Natural History Overview 1. Why do we date trees? 2. The molecular clock 3. Local clocks

More information

Classification and Phylogeny

Classification and Phylogeny Classification and Phylogeny The diversity of life is great. To communicate about it, there must be a scheme for organization. There are many species that would be difficult to organize without a scheme

More information

Understanding relationship between homologous sequences

Understanding relationship between homologous sequences Molecular Evolution Molecular Evolution How and when were genes and proteins created? How old is a gene? How can we calculate the age of a gene? How did the gene evolve to the present form? What selective

More information

Phylogenetics: Building Phylogenetic Trees

Phylogenetics: Building Phylogenetic Trees 1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

More information

Chapter 19: Taxonomy, Systematics, and Phylogeny

Chapter 19: Taxonomy, Systematics, and Phylogeny Chapter 19: Taxonomy, Systematics, and Phylogeny AP Curriculum Alignment Chapter 19 expands on the topics of phylogenies and cladograms, which are important to Big Idea 1. In order for students to understand

More information

GATA family of transcription factors of vertebrates: phylogenetics and chromosomal synteny

GATA family of transcription factors of vertebrates: phylogenetics and chromosomal synteny Phylogenetics and chromosomal synteny of the GATAs 1273 GATA family of transcription factors of vertebrates: phylogenetics and chromosomal synteny CHUNJIANG HE, HANHUA CHENG* and RONGJIA ZHOU* Department

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Phylogenies Scores for Exhaustive Maximum Likelihood and Parsimony Scores Searches

Phylogenies Scores for Exhaustive Maximum Likelihood and Parsimony Scores Searches Int. J. Bioinformatics Research and Applications, Vol. x, No. x, xxxx Phylogenies Scores for Exhaustive Maximum Likelihood and s Searches Hyrum D. Carroll, Perry G. Ridge, Mark J. Clement, Quinn O. Snell

More information

Classification and Phylogeny

Classification and Phylogeny Classification and Phylogeny The diversity it of life is great. To communicate about it, there must be a scheme for organization. There are many species that would be difficult to organize without a scheme

More information

Homology Modeling. Roberto Lins EPFL - summer semester 2005

Homology Modeling. Roberto Lins EPFL - summer semester 2005 Homology Modeling Roberto Lins EPFL - summer semester 2005 Disclaimer: course material is mainly taken from: P.E. Bourne & H Weissig, Structural Bioinformatics; C.A. Orengo, D.T. Jones & J.M. Thornton,

More information

Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment

Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment Introduction to Bioinformatics online course : IBT Jonathan Kayondo Learning Objectives Understand

More information

Piecing It Together. 1) The envelope contains puzzle pieces for 5 vertebrate embryos in 3 different stages of

Piecing It Together. 1) The envelope contains puzzle pieces for 5 vertebrate embryos in 3 different stages of Piecing It Together 1) The envelope contains puzzle pieces for 5 vertebrate embryos in 3 different stages of development. Lay out the pieces so that you have matched up each animal name card with its 3

More information

The Phylogenetic Handbook

The Phylogenetic Handbook The Phylogenetic Handbook A Practical Approach to DNA and Protein Phylogeny Edited by Marco Salemi University of California, Irvine and Katholieke Universiteit Leuven, Belgium and Anne-Mieke Vandamme Rega

More information

Tiffany Samaroo MB&B 452a December 8, Take Home Final. Topic 1

Tiffany Samaroo MB&B 452a December 8, Take Home Final. Topic 1 Tiffany Samaroo MB&B 452a December 8, 2003 Take Home Final Topic 1 Prior to 1970, protein and DNA sequence alignment was limited to visual comparison. This was a very tedious process; even proteins with

More information

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree) I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by

More information

Orthology Part I concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona

Orthology Part I concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona Orthology Part I concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona Toni Gabaldón Contact: tgabaldon@crg.es Group website: http://gabaldonlab.crg.es Science blog: http://treevolution.blogspot.com

More information

Accuracy and Power of the Likelihood Ratio Test in Detecting Adaptive Molecular Evolution

Accuracy and Power of the Likelihood Ratio Test in Detecting Adaptive Molecular Evolution Accuracy and Power of the Likelihood Ratio Test in Detecting Adaptive Molecular Evolution Maria Anisimova, Joseph P. Bielawski, and Ziheng Yang Department of Biology, Galton Laboratory, University College

More information

A profile-based protein sequence alignment algorithm for a domain clustering database

A profile-based protein sequence alignment algorithm for a domain clustering database A profile-based protein sequence alignment algorithm for a domain clustering database Lin Xu,2 Fa Zhang and Zhiyong Liu 3, Key Laboratory of Computer System and architecture, the Institute of Computing

More information

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University

Phylogenetics: Building Phylogenetic Trees. COMP Fall 2010 Luay Nakhleh, Rice University Phylogenetics: Building Phylogenetic Trees COMP 571 - Fall 2010 Luay Nakhleh, Rice University Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary

More information

BIOINFORMATICS: An Introduction

BIOINFORMATICS: An Introduction BIOINFORMATICS: An Introduction What is Bioinformatics? The term was first coined in 1988 by Dr. Hwa Lim The original definition was : a collective term for data compilation, organisation, analysis and

More information

Homology and Information Gathering and Domain Annotation for Proteins

Homology and Information Gathering and Domain Annotation for Proteins Homology and Information Gathering and Domain Annotation for Proteins Outline Homology Information Gathering for Proteins Domain Annotation for Proteins Examples and exercises The concept of homology The

More information

Lecture Notes: BIOL2007 Molecular Evolution

Lecture Notes: BIOL2007 Molecular Evolution Lecture Notes: BIOL2007 Molecular Evolution Kanchon Dasmahapatra (k.dasmahapatra@ucl.ac.uk) Introduction By now we all are familiar and understand, or think we understand, how evolution works on traits

More information

Bioinformatics Exercises

Bioinformatics Exercises Bioinformatics Exercises AP Biology Teachers Workshop Susan Cates, Ph.D. Evolution of Species Phylogenetic Trees show the relatedness of organisms Common Ancestor (Root of the tree) 1 Rooted vs. Unrooted

More information

Phylogenetic inference

Phylogenetic inference Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis-) advantages of different information types

More information

Cubic Spline Interpolation Reveals Different Evolutionary Trends of Various Species

Cubic Spline Interpolation Reveals Different Evolutionary Trends of Various Species Cubic Spline Interpolation Reveals Different Evolutionary Trends of Various Species Zhiqiang Li 1 and Peter Z. Revesz 1,a 1 Department of Computer Science, University of Nebraska-Lincoln, Lincoln, NE,

More information
