Research. Yupeng Wang 1,2, Xiyin Wang 1,3, Tae-Ho Lee 1, Shahid Mansoor 1 and Andrew H. Paterson 1. Summary. Introduction


Research

Gene body methylation shows distinct patterns associated with different gene origins and duplication modes and has a heterogeneous relationship with gene expression in Oryza sativa (rice)

Yupeng Wang 1,2, Xiyin Wang 1,3, Tae-Ho Lee 1, Shahid Mansoor 1 and Andrew H. Paterson 1

1 Plant Genome Mapping Laboratory, University of Georgia, Athens, GA 30602, USA; 2 Computational Biology Service Unit, Cornell University, Ithaca, NY 14853, USA; 3 Center for Genomics and Computational Biology, School of Life Sciences, School of Sciences, Hebei United University, Tangshan, Hebei, China

Author for correspondence: Andrew H. Paterson. Email: paterson@plantbio.uga.edu

Received: 31 October 2012. Accepted: 6 December 2012. doi: /nph

Key words: correlation analysis, DNA methylation, gene body, gene duplication, gene origin, Ks, rice (Oryza sativa).

Summary

Whole-genome duplication (WGD) has occurred recurrently, and single-gene duplication is also widespread, in angiosperms. Recent whole-genome DNA methylation maps indicate that gene body methylation (i.e. of coding regions) has a functional role. However, whether gene body methylation is related to gene origins and duplication modes has yet to be reported. In rice (Oryza sativa), we computed a body methylation level (the proportion of methylated CpG dinucleotides within coding regions) for each gene in five tissues. Body methylation levels follow a bimodal distribution, but show distinct patterns associated with transposable element-related genes; WGD, tandem, proximal and transposed duplicates; and singleton genes. For pairs of duplicated genes, divergence in body methylation levels increases with physical distance and synonymous (Ks) substitution rates, and WGDs show lower divergence than single-gene duplications of similar Ks levels. Intermediate body methylation tends to be associated with high levels of gene expression, whereas heavy body methylation is associated with lower levels of gene expression.
The biological trends revealed here are consistent across five rice tissues, indicating that genes of different origins and duplication modes have distinct body methylation patterns, and that body methylation has a heterogeneous relationship with gene expression and may be related to the survivorship of duplicated genes.

Introduction

Gene duplication is a primary mechanism for the evolution of novelty and complexity in higher organisms (Ohno, 1970; Flagel & Wendel, 2009; Innan & Kondrashov, 2010). It is now known that genes may be duplicated by various modes, generally referred to as large-scale and small-scale duplications (Maere et al., 2005; Casneuf et al., 2006; Ganko et al., 2007; Freeling, 2009; Wang et al., 2012). The most frequent consequence of gene duplication is reversion to single-copy (singleton) status (Freeling & Thomas, 2006; Freeling, 2009); however, genes retained in duplicate offer the potential for the evolution of novelty (Ohno, 1970; Flagel & Wendel, 2009; Innan & Kondrashov, 2010). Thus, it is important to study the mechanisms of gene retention and evolution in the light of different gene duplication modes (Wang et al., 2012).

Oryza sativa (rice) is a good model to elucidate the genetic mechanisms and evolutionary features of different gene duplication modes (Wang et al., 2007, 2011; Li et al., 2009). Rice has experienced at least two whole-genome duplications (WGDs), one shared with most if not all cereals (ρ), and another more ancient event (σ) (Paterson et al., 2004; Tang et al., 2010). In angiosperm species, most duplicated chromosomal segments are thought to arise from WGDs (Tang et al., 2008a,b). Small-scale gene duplications, often referred to as single-gene duplications, are also widespread in rice (Wang et al., 2007, 2011; Li et al., 2009). According to the physical distance between duplicates, single-gene duplications can be further classified into local and transposed gene duplications (Ganko et al., 2007; Wang et al., 2011, 2012).
Local duplications may occur as tandem duplications (i.e. duplicated genes are consecutive in the genome), which may be caused by illegitimate chromosomal recombination (Freeling, 2009), or proximal duplications (i.e. separated by one or more genes), which may be caused by localized transposon activities (Zhao et al., 1998; Wang et al., 2011, 2012).

Transposable element (TE)-related genes comprise a significant portion of rice protein-coding genes (Yuan et al., 2005; Jiao & Deng, 2007). TE-related genes have normal gene structures with coding capacity and transcriptional activity, but share significant sequence similarity with known TEs (Jiao & Deng, 2007). Transposed duplications that create two gene copies far away from each other are widespread in plants (Freeling et al., 2008; Freeling, 2009; Woodhouse et al., 2010, 2011; Wang et al., 2011, 2012), suggesting that many non-TE-related genes are also mobile, via either DNA- or RNA-mediated transposition (Cusack & Wolfe, 2007). Transposed duplicates may also occur by intrachromosomal recombination (Woodhouse et al., 2011).

Divergence between duplicated genes increases with time, but the rate and extent of divergence are affected by gene duplication modes (Casneuf et al., 2006; Arabidopsis Interactome Mapping Consortium, 2011; Wang et al., 2011). Generally, WGD duplicates are less divergent than other duplicates (Casneuf et al., 2006; Ganko et al., 2007; Li et al., 2009; Wang et al., 2011). Moreover, singletons show higher interspecies conservation than duplicates based on cross-species comparison of genomic and expression data (Ha et al., 2009; Wang et al., 2011). Indeed, the distinct evolutionary effects of gene duplication modes may, in turn, affect the rates of gene retention, depending on functional category-specific selection pressures on neo-functionalization, functional buffering or high expression (Freeling, 2009; Innan & Kondrashov, 2010; Wang et al., 2012).

The roles of epigenetic marks in gene duplication, evolution and retention are under-explored and controversial in the current literature. DNA methylation is one of the most important epigenetic marks, and high-resolution whole-genome DNA methylation maps based on bisulfite sequencing have been made for rice (Feng et al., 2010; Zemach et al., 2010a,b). Previous analyses of whole-genome DNA methylation data have suggested that rice DNA methylation occurs predominantly at cytosine followed by guanine, that is, CpG dinucleotides (Feng et al., 2010; Zemach et al., 2010b). Gene body methylation (DNA methylation of coding regions) is conserved across eukaryotic lineages (Lee et al., 2010; Su et al., 2011).
Although it is broadly accepted that promoter methylation is generally associated with the repression of plant gene expression (Zhang et al., 2006; Su et al., 2011), the functional roles of gene body methylation are controversial (Lee et al., 2010; Su et al., 2011). To date, gene body methylation has been suggested to enhance accurate splicing of primary transcripts (Lorincz et al., 2004; Kolasinska-Zwierz et al., 2009; Schwartz et al., 2009; Luco et al., 2010) and/or prevent leaky expression from intragenic cryptic promoters (Zilberman et al., 2007; Maunakea et al., 2010). In Arabidopsis and rice, association of gene body methylation with active transcription has been proposed (Zhang et al., 2006; Zilberman et al., 2007; Zemach et al., 2010b; Takuno & Gaut, 2012). By contrast, several studies in rice have suggested that the major effect of body methylation on gene expression is repression (Li et al., 2008; He et al., 2010). From the point of view of evolution, body-methylated genes have been suggested to be functionally important and to evolve slowly (Sarda et al., 2012; Takuno & Gaut, 2012). However, the interplay between gene body methylation and gene duplication, as well as the evolution of duplicate genes, has been little explored. Study of the potential interplay between gene body methylation and gene origins and duplications may help us to understand the roles of epigenetic factors in shaping current genomes, as well as the mechanisms underlying gene duplications and evolution. In rice, we analyzed single-base resolution, whole-genome DNA methylation maps of five tissues (Zemach et al., 2010a,b). For each gene, we computed a body methylation level (proportion of methylated CpG dinucleotides within coding regions) in each tissue. 
We classified rice genes into different origins and duplication modes, including TE-related genes, singletons, and WGD, tandem, proximal and transposed duplicates, and compared the body methylation levels among different categories of genes. For duplicated genes, we examined divergence in body methylation levels and its relationship with coding sequence divergence. Furthermore, we studied the potential relationships between body methylation and duplicate gene retention. Finally, we investigated the complicated relationships between body methylation and gene expression levels.

Materials and Methods

Sequence sources

The rice gene set was retrieved from the Rice Genome Annotation Project (TIGR5). The gene sets of outgroups, including Sorghum bicolor, Brachypodium and Zea mays, were retrieved from Phytozome (phytozome.net/). For each gene, only the first transcript in the genome annotation (transcript name suffixed by .1) was used for analysis.

Identification of genes of different origins

Rice genes were first divided into TE-related and non-TE-related genes, according to TIGR5. The non-TE-related genes were further classified into WGD duplicates, singletons, tandem, proximal, transposed and dispersed duplicates. To this end, the population of potential gene duplications in rice was identified using BLASTP (Altschul et al., 1990) (TE-related genes were not considered for BLASTP). For each gene, only the top five nonself BLASTP matches that met a threshold of E < were considered as potential gene duplication relationships. The genes without any BLASTP hit were deemed singletons. WGD duplicates were obtained from a previous study (Tang et al., 2010). We then derived single-gene duplications by excluding pairs of WGD duplicates from the population of gene duplications. Tandem duplicates were adjacent homologs, and proximal duplicates were not adjacent, but were within 10 annotated genes of each other on the same chromosome and without any paralog between them.
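The tandem and proximal rules above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the `position` and `paralogs` inputs, and the reading of "within 10 annotated genes" as a difference in gene rank of at most 10 are assumptions made for illustration only.

```python
from typing import Dict, Optional, Set, Tuple


def classify_local_pair(
    gene_a: str,
    gene_b: str,
    position: Dict[str, Tuple[str, int]],
    paralogs: Dict[str, Set[str]],
    max_gap: int = 10,
) -> Optional[str]:
    """Return 'tandem', 'proximal' or None for a pair of homologous genes.

    position: gene -> (chromosome, rank of the gene along that chromosome).
    paralogs: gene -> set of homologs (hypothetical input, e.g. BLASTP hits).
    max_gap:  assumed reading of "within 10 annotated genes of each other".
    """
    chrom_a, rank_a = position[gene_a]
    chrom_b, rank_b = position[gene_b]
    if chrom_a != chrom_b:
        return None  # local duplicates must share a chromosome
    gap = abs(rank_a - rank_b)
    if gap == 1:
        return "tandem"  # adjacent homologs
    if gap == 0 or gap > max_gap:
        return None
    # Proximal: no paralog of either gene may lie strictly between the two.
    low, high = sorted((rank_a, rank_b))
    family = (paralogs.get(gene_a, set()) | paralogs.get(gene_b, set())) - {gene_a, gene_b}
    for other in family:
        if other not in position:
            continue
        chrom, rank = position[other]
        if chrom == chrom_a and low < rank < high:
            return None
    return "proximal"


# Toy data: g1..g4 on chr1 at ranks 1, 2, 6 and 30; g5 on chr2.
position = {"g1": ("chr1", 1), "g2": ("chr1", 2), "g3": ("chr1", 6),
            "g4": ("chr1", 30), "g5": ("chr2", 3)}
paralogs = {"g1": {"g2", "g3", "g4", "g5"}, "g2": {"g1", "g3"}, "g3": {"g2"}}
print(classify_local_pair("g1", "g2", position, paralogs))  # adjacent pair
print(classify_local_pair("g2", "g3", position, paralogs))  # 4 ranks apart
```

Pairs that return None here would remain candidates for the transposed or dispersed categories, which depend on synteny information that this sketch does not model.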
The remaining single-gene duplications, that is, after deduction of the tandem and proximal duplications, were searched for transposed duplications. To accomplish this aim, genes at ancestral (i.e. interspecies collinear) chromosomal positions were discerned by aligning syntenic blocks within rice and between rice and its outgroups, including Sorghum bicolor, Brachypodium and Zea mays. For a pair of transposed duplicates, we required that one duplicate was at its ancestral locus and the other was at a nonancestral locus, named the parental duplicate and transposed duplicate, respectively. For a transposed duplicate, there may be multiple ancestral paralogs, and we regarded the ancestral paralog with the highest sequence identity as its parental duplicate. The remaining duplicates, which do not belong to any of the WGD, tandem, proximal and transposed duplicates, were simply denoted as dispersed duplicates.

Rice whole-genome DNA methylation data

Rice single-base resolution DNA methylation data of embryo, endosperm, leaf, root and shoot tissues, generated by bisulfite sequencing technology, were obtained from two previous studies (Zemach et al., 2010a,b). We used the processed data provided by the authors, available at the Gene Expression Omnibus database (accession numbers: GSM497260, GSM560562, GSM560563, GSM and GSM560565). In the processed data, the likelihood of methylation was shown for each CpG, CHG and CHH site, whose chromosomal position was annotated according to TIGR5. Only CpG methylation was considered in this study. The likelihood of CpG methylation showed a strong bimodal distribution, and we regarded a value of > 0.5 as methylation of CpG dinucleotides.

Comparing the distributions of body methylation levels

As body methylation levels tend to be bimodally distributed, it is not reasonable to compute a single mean and standard deviation of body methylation levels for a gene group. To compare the distributions of body methylation levels of different gene groups, we used both parametric and nonparametric tests: (1) parametric test: we counted the gene numbers associated with low methylation (body methylation level < 0.1), intermediate methylation (0.1 ≤ body methylation level ≤ 0.9) and high methylation (body methylation level > 0.9) for each gene group, and then compared the gene numbers with different extents of methylation between different gene groups using a χ² test; and (2) nonparametric test: the comparison of the distributions of body methylation levels between two gene groups was modeled as testing whether one gene group had more outliers (highly body-methylated genes) than the other group. The Outlier-Sum statistic (Tibshirani & Hastie, 2007) was adopted.
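The body methylation level and the χ² comparison described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, the input is assumed to be a list of per-CpG methylation likelihoods for one gene, and the P value uses the closed form exp(-χ²/2), which is exact only for the 2 degrees of freedom of a 2 x 3 table (two gene groups, three methylation classes).

```python
import math
from typing import Iterable, List, Optional, Sequence, Tuple


def body_methylation_level(cpg_likelihoods: Iterable[float],
                           call_threshold: float = 0.5) -> Optional[float]:
    """Proportion of CpG sites in a coding region called methylated (> 0.5)."""
    values = list(cpg_likelihoods)
    if not values:
        return None  # no CpG site covered: level is undefined
    methylated = sum(1 for v in values if v > call_threshold)
    return methylated / len(values)


def methylation_class(level: float) -> str:
    """Low (< 0.1), intermediate (0.1 to 0.9 inclusive) or high (> 0.9)."""
    if level < 0.1:
        return "low"
    if level > 0.9:
        return "high"
    return "intermediate"


def class_counts(levels: Iterable[float]) -> List[int]:
    """Gene counts in the order [low, intermediate, high]."""
    order = ("low", "intermediate", "high")
    counts = {name: 0 for name in order}
    for level in levels:
        counts[methylation_class(level)] += 1
    return [counts[name] for name in order]


def chi_square_2x3(counts_a: Sequence[int],
                   counts_b: Sequence[int]) -> Tuple[float, float]:
    """Pearson chi-square statistic and P value for a 2 x 3 count table."""
    table = [list(counts_a), list(counts_b)]
    row_totals = [sum(row) for row in table]
    col_totals = [a + b for a, b in zip(counts_a, counts_b)]
    if min(col_totals) == 0 or min(row_totals) == 0:
        raise ValueError("every class and every group must be non-empty")
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With df = (2 - 1) * (3 - 1) = 2, the upper tail is exp(-chi2 / 2).
    return chi2, math.exp(-chi2 / 2.0)


level = body_methylation_level([0.9, 0.8, 0.1, 0.6])  # 3 of 4 sites methylated
chi2, p_value = chi_square_2x3([20, 10, 10], [10, 10, 20])
print(level, round(chi2, 3), round(p_value, 4))
```

The Outlier-Sum statistic is not reproduced here; comparing more than two gene groups at once would change the degrees of freedom, so the closed-form P value would no longer apply.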
P values for the Outlier-Sum statistic were assessed based on 10⁴ permutations of the pooled body methylation levels of the two gene groups under comparison.

Ks calculation

Protein sequences of duplicated genes were aligned using ClustalW (Thompson et al., 1994) with default parameters. Then, the protein alignment was converted to a coding sequence alignment using the Bio::Align::Utilities module in the BioPerl package. Ks was calculated using the methods of Nei & Gojobori (1986) and Yang & Nielsen (2000), via the Bio::Align::DNAStatistics and Bio::Tools::Run::Phylo::PAML::Yn00 modules, respectively, in the BioPerl package. It should be noted that extremely high levels of sequence divergence between duplicated genes may cause the Bio::Align::DNAStatistics module to generate invalid Ks values, which were excluded from the related analyses. Following a previous study in rice (Tang et al., 2010), we excluded Ks values for gene pairs with average third-codon-position GC content (GC3) > 75% from related statistical analyses because there are two distinct groups of genes with significantly different GC3. Ks values > 3.0 were also excluded because of saturated substitutions at synonymous positions.

Gene expression data

Processed rice expression data over 508 tissues and physiological conditions, generated by the Affymetrix GeneChip Rice Genome Array, were obtained from previous studies (Ficklin et al., 2010; Wang et al., 2011). In the data, the numbers of columns that sampled embryo, endosperm, leaf, root and shoot were 3, 4, 50, 99 and 84, respectively. For some genes, there are multiple probe sets on the array to measure their expression. Inclusion or exclusion of suboptimal probe sets with suffix _s_at or _x_at, which were suspected of potential cross-hybridization, has been shown previously to have only trivial effects (Wang et al., 2011).
In this study, all types of probe sets were considered and, for a gene with multiple probe sets, the first probe set according to alphabetic sorting was used to represent its expression profile.

Correlation analysis and smoothing spline regression

In this study, correlations were measured by Spearman's correlation coefficients. Smoothing spline regression was performed via the smooth.spline function of the R language. To avoid overfitting in smoothing spline regression, three settings for the degrees of freedom (2, 4 and 6) were tested.

Results

Gene origins in rice

Like many other eukaryotic species, the rice genome has been shaped and dynamically reconstructed by multiple evolutionary forces and events, which have given its genes different origins (International Rice Genome Sequencing Project, 2005). TE-related genes are classified on the basis of sharing significant sequence similarity with TEs (Jiao & Deng, 2007). Among non-TE-related genes, those present in only single copies were deemed to be singletons, whereas others were deemed to be duplicated. Duplicated genes were further classified in terms of duplication modes, with those at collinear positions of intraspecies syntenic blocks deemed to be WGD duplicates (Tang et al., 2010). All other duplicates were assumed to have occurred by single-gene duplications, further classified into tandem, proximal and dispersed, as described above. The mechanisms underlying dispersed duplications are very complicated (Wang et al., 2012). However, if one member of a pair of dispersed duplicates was at its ancestral locus and the other was at a nonancestral locus, such gene duplications were deemed to be transposed (Wang et al., 2011, 2012). Summary statistics on rice gene origins are shown in Table 1, and the classification of duplicated genes is shown in Supporting Information Table S1.

Table 1 Statistics on rice (Oryza sativa) genes of different origins and duplication modes. Columns: gene origin; number of gene pairs; number of distinct genes. Rows: non-TE-related (gene pairs N/A), comprising singletons (gene pairs N/A) and duplicates (gene pairs N/A); duplicates comprise WGD, tandem, proximal, transposed and dispersed (gene pairs N/A); TE-related (gene pairs N/A). N/A, not applicable; TE, transposable element.

Body methylation levels show different distributions associated with gene origins and duplication modes

To investigate the patterns of gene body methylation in view of different gene origins and duplication modes, we computed the body methylation level for each gene, defined as the proportion of methylated CpG dinucleotides relative to all CpG dinucleotides within its coding region, in embryo, endosperm, leaf, root and shoot. To test the consistency of body methylation levels across tissues, we visualized the body methylation levels of all genes between all pairs of tissues via scatter plots (Fig. S1). Although endosperm tissue shows higher variation than other tissues, body methylation levels are much more likely to be consistent (rather than different) across tissues, that is, points (genes) are densely distributed along the y = x diagonal line in the scatter plots. This analysis indicates that it is feasible to study the evolutionary characteristics of body methylation for large groups of genes, while acknowledging the existence of tissue-specific body methylation for specific genes.

A recent study has suggested that gene bodies cluster into two groups corresponding to high and low levels of DNA methylation, respectively, in honeybee, silkworm, sea squirt and sea anemone (Sarda et al., 2012). We plotted the distribution of body methylation levels for all rice genes (Fig. 1a), finding a clear bimodal distribution peaking at 0 or 1, suggesting that gene bodies tend to be either highly methylated or little methylated in rice.

We found that different gene origins differ in the distributions of body methylation levels.
First, we compared the distributions of body methylation levels between TE-related and non-TE-related genes, and found that the two distributions were significantly different (P < , χ² test; P < 10⁻⁴, Outlier-Sum statistic; see the Materials and Methods section) (Fig. 1b). Specifically, most TE-related genes are highly body-methylated (body methylation level > 0.9), consistent with previous studies (Zilberman et al., 2007; Li et al., 2008; Feng et al., 2010; He et al., 2010; Zemach et al., 2010b), whereas non-TE-related genes are bimodally distributed, with more genes little body-methylated (body methylation level < 0.1). As noted previously, TE-related genes exhibit much lower transcriptional activities than non-TE-related genes (Jiao & Deng, 2007), suggesting that high levels of body methylation may be associated with reduced transcription, and conflicting with the hypothesis that body methylation has only minor, but positive, effects on the levels of gene expression (Zhang et al., 2006; Zilberman et al., 2007; Zemach et al., 2010b; Takuno & Gaut, 2012).

We compared the distributions of body methylation levels between different origins within non-TE-related genes. Singletons show a higher frequency of high body methylation than do duplicates (Fig. 1c; P < , χ² test; P < 10⁻⁴, Outlier-Sum statistic; see the Materials and Methods section). Tandem, proximal and transposed duplicates show an obvious frequency peak of high body methylation (Fig. 1d), whereas WGD duplicates do not (P < , χ² test; P < 10⁻⁴, Outlier-Sum statistic; see the Materials and Methods section). Moreover, the likelihood of a duplicated gene being highly body-methylated follows the tendency: transposed > proximal > tandem > WGD (P < , χ² test; P < 10⁻⁴, Outlier-Sum statistic; see the Materials and Methods section).
In partial summary, body methylation levels show different distributions associated with gene origins and duplication modes, suggesting that genes of different origins tend to have distinct epigenetic features.

Divergence in body methylation levels between duplicated genes

Genes duplicated by different modes differ in the extent of expression divergence and the rewiring of protein-protein networks (De Smet & Van de Peer, 2012; Wang et al., 2012). Here, we examined whether duplicated genes of different modes also differ significantly in divergence in body methylation levels. Divergence in body methylation levels among gene pairs duplicated by different modes (Fig. 2a) showed the following trend: random gene pairs > transposed duplicates > proximal duplicates > tandem duplicates WGD duplicates (both an ANOVA model involving all duplication modes and Tukey's honestly significant difference (HSD) test between adjacent duplication modes were significant at α = 0.05), indicating that different modes of gene duplication tend to result in different extents of divergence in body methylation levels.

The physical distance between single-gene duplicates (in terms of the number of genes apart) also followed a trend: transposed duplicates > proximal duplicates > tandem duplicates. We hypothesized that there may be position effects on body methylation levels, for example, that genes closer to each other on chromosomes tend to have more similar body methylation levels. To test this, we randomly selected gene pairs on the same chromosomes and computed the correlations between divergence in body methylation levels and physical distance. These correlations ranged from to (P < ), indicating that there exist weak position effects that affect body methylation levels for all rice genes. For single-gene duplicates, these correlations ranged from to (P < ), indicating that the position effects increase slightly for single-gene duplicate pairs relative to random gene pairs.
At the same physical distance, single-gene duplicates diverge less in body methylation levels than do random gene pairs (Fig. 2b), suggesting that body methylation patterns are either copied or recapitulated following gene duplication.

Fig. 1 Gene body methylation shows different patterns associated with gene origins and duplication modes. Each column represents one tissue. (a) Distribution of body methylation levels for all rice genes. (b) Comparison of distributions of body methylation levels between transposable element (TE)-related and non-TE-related genes. (c) Comparison of distributions of body methylation levels between singleton and duplicate genes. (d) Comparison of distributions of body methylation levels among whole-genome duplication (WGD), tandem, proximal and transposed duplicates.

Relationship between body methylation patterns and Ks for pairs of duplicated genes

To understand how gene body methylation evolves following gene duplication, it may be helpful to relate patterns of body methylation of duplicated genes to the divergence of their coding sequence. Synonymous (Ks) substitution rates largely reflect the neutral mutation rates of coding sequences, and are suggested to increase approximately linearly with time for relatively low levels of sequence divergence (Li, 1997). We first related divergence in body methylation levels between duplicated genes to Ks using linear regression (Fig. 3a). Positive correlations were found for all duplication modes (0.113 ≤ r ≤ 0.175, P < ). For single-gene duplicates, these correlations ranged from to (P ). However, as we have shown that, for single-gene duplicates, there is a weak correlation between divergence in body methylation levels and physical distance, the position effects could be a nuisance factor for the correlation between divergence in body methylation levels and Ks. To remove the effect of physical distance on these correlations for single-gene duplicates, we computed the partial correlations between divergence in body methylation levels and Ks.
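As an illustration of the partial correlation step, the following Python sketch computes a first-order partial correlation from pairwise Spearman coefficients, controlling for a third variable such as physical distance. The exact procedure used in the study is not specified beyond the use of Spearman's coefficients, so the formula choice, function names and toy data are assumptions.

```python
import math
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def partial_spearman(x: Sequence[float], y: Sequence[float],
                     z: Sequence[float]) -> float:
    """Correlation of x and y after removing the linear effect of z on ranks.

    Undefined (division by zero) if z is perfectly correlated with x or y.
    """
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy values: methylation divergence, Ks and physical distance for 5 pairs.
divergence = [1, 2, 3, 4, 5]
ks = [2, 1, 4, 3, 5]
distance = [5, 3, 1, 4, 2]
print(round(spearman(divergence, ks), 3))
print(round(partial_spearman(divergence, ks, distance), 3))
```

A partial correlation that stays close to the raw correlation indicates that the controlled variable explains little of the association.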
These partial correlations ranged from to (P ), declining by from their corresponding correlations, indicating that physical distance has a very weak effect on the correlation between divergence in body methylation levels and Ks. Thus, divergence in body methylation levels between duplicated genes tends to increase with Ks. Moreover, at similar Ks levels, WGDs tend to have smaller divergence in body methylation levels between duplicates than do tandem, proximal or transposed duplications. The different extent of divergence in body methylation levels between gene duplication modes may be explained by the hypothesis that WGDs generate duplicated chromosomal segments in which collinear duplicates are more likely to have similar chromatin environments, whereas single-gene, especially transposed, duplications relocate to new chromosomal positions which often have different chromatin environments.

Fig. 2 Divergence in body methylation levels between duplicated genes. Each column represents one tissue. (a) Comparison of divergence in body methylation levels among different modes of gene duplication. Whiskers correspond to the minimum and maximum values in the data. (b) Linear regressions between divergence in body methylation levels and physical distance for random gene pairs and single-gene duplicate pairs.

Next, we related the body methylation levels of duplicated genes to Ks using linear regression (Fig. 3b). The direction of the correlations differs among different modes of gene duplication: body methylation of WGD duplicates is positively correlated with Ks (0.051 ≤ r ≤ 0.084, P < 0.05), whereas body methylation of single-gene duplicates decreases with Ks ( r 0.082, P < ). Some duplicated genes are highly methylated, particularly those generated by single-gene duplications. It is well known that single-gene duplicates have a shorter half-life than WGD-generated duplicates (Lynch & Conery, 2000). Different rates of nonrandom gene loss shortly after WGD and single-gene duplication may contribute to the contrasting directions of the correlations between body methylation levels and Ks. In the first few million years following single-gene duplication, many duplicates become nonfunctionalized and are lost (Innan & Kondrashov, 2010).
Biases among these genes may mitigate the long-term tendency towards increased body methylation, as in WGD duplicates, for example if highly bodymethylated duplicates are preferentially lost. Thus, there could be links between body methylation patterns and the probability of long-term survival of duplicated genes. Relationship between gene body methylation and gene expression The observation that TE-related genes are highly body-methylated, but little expressed, appears to conflict with the observation that body methylation has a positive effect on the levels of gene expression (Zhang et al., 2006; Zilberman et al., 2007; Zemach et al., 2010b; Takuno & Gaut, 2012). However, these two observations might be reconciled if gene body methylation has heterogeneous effects on gene expression, that is, gene body methylation affects gene expression in different ways under different conditions. We plotted the regression lines between gene expression levels and body methylation levels for all non- TE-related genes based on each tissue, using smooth splines with different degrees of freedom (Fig. 4); this showed that intermediate body methylation tends to be associated with higher gene expression levels than both low and high body methylation. To test this observation statistically, we computed the correlations between body methylation levels and expression levels for the genes with body methylation levels of < 0.5 and 0.5. These correlations ranged from to (P < ) when the body methylation level was < 0.5, and from to (P ) when the body methylation level was 0.5. This result suggests that intermediate body methylation may indeed have positive effects on transcription, possibly through the enhancement of accurate splicing of primary transcripts, whereas high body methylation is more likely to repress gene expression, which may lead to pseudofunctionalization or gene losses. We related gene expression to variances of body methylation levels across tissues. Based on Fig. 
S1, we inferred that TE-related

7 280 Research New Phytologist (a) (b) Fig. 3 Relationships between patterns of body methylation and Ks for duplicated genes. Each column represents one tissue. (a) Linear regressions between divergence in body methylation levels and Ks for different modes of gene duplication. (b) Linear regressions between body methylation levels and Ks for different modes of gene duplication. genes tend to have more uniform body methylation levels (closer to the y = x diagonal line) than do non-te-related genes, which was then proven statistically by two-sample t-test for variances of body methylation levels between TE-related and non-te-related genes (P < ). This observation indicates that the repressive TE-related body methylation tends to be uniform across tissues. For non-te-related genes, we found that there is a significant positive correlation (r = 0.173, P < ) between the average expression levels and variances of body methylation levels, indicating that non-te-related genes with high expression tend to vary in body methylation across tissues. Discussion We have related gene body methylation to gene origins and duplication modes in rice. Our results suggest that genes of different origins and duplication modes are associated with different patterns of gene body methylation, and highly body-methylated genes are preferentially lost following gene duplication. Although it is known that natural variations in DNA methylation exist among individuals of a species (Becker et al., 2011; Bell et al., 2011; Fraser et al., 2012) and that, within an individual, many cytosines may be differentially methylated among different tissues (Zemach et al., 2010a; Zhang et al., 2011; Vining et al., 2012) or

8 New Phytologist Research 281 Fig. 4 Gene body methylation has heterogeneous effects on gene expression. Smooth spline curves are fitted between gene expression levels and body methylation levels for all non-transposable element (TE)-related genes, based on different degrees of freedom. A body methylation level of 0.5 appears to be a point dividing the up- and down-regulation of gene expression levels. developmental stages (Alisch et al., 2012), or between normal and stress conditions (Chinnusamy & Zhu, 2009), our analyses of body methylation patterns based on five different tissues reveal highly consistent evolutionary trends. We summarized a body methylation level for each gene that may involve hundreds of CpG dinucleotides. Further, we compared body methylation levels among large groups of genes with each group consisting of several thousand genes. Thus, our computational procedure, through mitigation of the effect of dynamic changes of methylation status that may occur at some cytosine nucleotides, is reliable for large-scale evolutionary analyses. DNA methylation is an important epigenetic mark and can affect the nucleotide composition of DNA sequences. DNA methylation can trigger the spontaneous deamination of methylcytosine to thymine (Bird, 1980; Jones et al., 1987; Pfeifer, 2006), which makes DNA methylation levels and GC levels interdependent. The data of this study showed strong negative correlations ( r 0.458, P < ) between

body methylation levels and the GC content at the third codon position (GC3) for rice genes. The evolution of DNA methylation patterns and DNA sequences can be intermingled, and the study of DNA methylation evolution may facilitate the understanding of mechanisms for DNA sequence evolution.

In eukaryotic genomes, there are multiple epigenetic marks, including DNA methylation, histone modifications, nucleosome positioning and others, all of which may contribute to the regulation of gene expression (Henderson & Jacobsen, 2007). Among these epigenetic marks, DNA methylation has been studied extensively for its role in the regulation of gene expression. In rice, Li et al. (2008) showed an interplay between DNA methylation, histone methylation and gene expression, and that gene expression appeared to be repressed by DNA methylation, but to be rescued by the concurrence of DNA and H3K4 methylation. He et al. (2010) found a weak negative correlation between DNA methylation and transcript levels, and that TE-related genes are highly methylated and little transcribed. In Populus trichocarpa, gene body methylation is suggested to have a more repressive effect than promoter methylation on transcription (Vining et al., 2012). By contrast, in Arabidopsis, many studies have suggested that gene body methylation is associated with active transcription (Zhang et al., 2006; Zilberman et al., 2007; Takuno & Gaut, 2012). The conflicting conclusions on the direction of the relationship between body methylation and gene expression in previous studies may be because an overall correlation pattern has often been sought, overlooking the possibility that body methylation may have heterogeneous effects on gene expression.
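The point about overall correlations can be illustrated with a small sketch. It is not the authors' pipeline: the function names (body_methylation_level, pearson, binned_means) are hypothetical, the data are toy values constructed so that expression peaks at a body methylation level of 0.5, and simple binned means stand in for the smooth spline curves used in Fig. 4. Under those assumptions, a single Pearson coefficient is close to zero even though expression clearly depends on methylation level.

```python
import math

def body_methylation_level(cpg_calls):
    """Proportion of methylated CpGs among the CpGs scored in a coding region.

    cpg_calls: iterable of booleans, one per CpG (True = methylated).
    Returns None when no CpG was scored.
    """
    calls = list(cpg_calls)
    if not calls:
        return None
    return sum(calls) / len(calls)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def binned_means(levels, values, n_bins=5):
    """Mean of `values` within equal-width bins of `levels` on [0, 1]."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for level, value in zip(levels, values):
        idx = min(int(level * n_bins), n_bins - 1)
        sums[idx] += value
        counts[idx] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]

# Toy data (hypothetical, not from the study): expression peaks at level 0.5.
levels = [i / 100 for i in range(101)]
expression = [1.0 - 4.0 * (m - 0.5) ** 2 for m in levels]

overall_r = pearson(levels, expression)
bin_means = binned_means(levels, expression)
print(f"overall Pearson r = {overall_r:.3f}")
print("mean expression per methylation bin:", [round(v, 2) for v in bin_means])
```

Inspecting the per-bin means (or a fitted curve) rather than one coefficient is what exposes a rise-then-fall pattern of this kind; the choice of five bins here is arbitrary.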
In conclusion, in rice, using the proportion of methylated CpG dinucleotides within coding regions to measure the level of gene body methylation, we found that body methylation levels follow a bimodal distribution peaking at 0 or 1, and display distinct patterns associated with different gene origins and duplication modes. For pairs of duplicated genes, divergence in body methylation levels increases with physical distance and Ks, and WGDs show lower divergence than single-gene duplications at similar Ks levels. Body methylation of WGD duplicates tends to increase with Ks, whereas the body methylation levels of single-gene duplicates decrease with Ks, indicating that highly body-methylated genes are preferentially lost following gene duplication. Moderate body methylation tends to enhance gene expression, whereas light or heavy body methylation tends to repress gene expression. This study suggests that genes of different origins and duplication modes have distinct body methylation patterns, and body methylation evolves with DNA sequence evolution, has heterogeneous effects on gene expression and might be related to survivorship of duplicated genes.

Acknowledgements

We thank Barry Marler for IT support, Xinyu Liu for statistical consulting and Haibao Tang for providing Python scripts. A.H.P. appreciates funding from the National Science Foundation (NSF: DBI , MCB , MCB ). This study was supported in part by resources and technical expertise from the Georgia Advanced Computing Resource Center, a partnership between the Office of the Vice President for Research and the Office of the Chief Information Officer.

References

Alisch RS, Barwick BG, Chopra P, Myrick LK, Satten GA, Conneely KN, Warren ST Age-associated DNA methylation in pediatric populations. Genome Research 22: Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ Basic local alignment search tool. 
Journal of Molecular Biology 215: Arabidopsis Interactome Mapping Consortium Evidence for network evolution in an Arabidopsis interactome map. Science 333: Becker C, Hagmann J, Muller J, Koenig D, Stegle O, Borgwardt K, Weigel D Spontaneous epigenetic variation in the Arabidopsis thaliana methylome. Nature 480: Bell JT, Pai AA, Pickrell JK, Gaffney DJ, Pique-Regi R, Degner JF, Gilad Y, Pritchard JK DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines. Genome Biology 12: R10. Bird AP DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Research 8: Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana. Genome Biology 7: R13. Chinnusamy V, Zhu JK Epigenetic regulation of stress responses in plants. Current Opinion in Plant Biology 12: Cusack BP, Wolfe KH Not born equal: increased rate asymmetry in relocated and retrotransposed rodent gene duplicates. Molecular Biology and Evolution 24: De Smet R, Van de Peer Y Redundancy and rewiring of genetic networks following genome-wide duplication events. Current Opinion in Plant Biology 15: Feng S, Cokus SJ, Zhang X, Chen PY, Bostick M, Goll MG, Hetzel J, Jain J, Strauss SH, Halpern ME et al Conservation and divergence of methylation patterning in plants and animals. Proceedings of the National Academy of Sciences, USA 107: Ficklin SP, Luo F, Feltus FA The association of multiple interacting genes with specific phenotypes in rice using gene coexpression networks. Plant Physiology 154: Flagel LE, Wendel JF Gene duplication and evolutionary novelty in plants. New Phytologist 183: Fraser HB, Lam LL, Neumann SM, Kobor MS Population-specificity of human DNA methylation. Genome Biology 13: R8. Freeling M Bias in plant gene content following different sorts of duplication: tandem, whole-genome, segmental, or by transposition. 
Annual Review of Plant Biology 60: Freeling M, Lyons E, Pedersen B, Alam M, Ming R, Lisch D Many or most genes in Arabidopsis transposed after the origin of the order Brassicales. Genome Research 18: Freeling M, Thomas BC Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity. Genome Research 16: Ganko EW, Meyers BC, Vision TJ Divergence in expression between duplicated genes in Arabidopsis. Molecular Biology and Evolution 24: Ha M, Kim ED, Chen ZJ Duplicate genes increase expression diversity in closely related species and allopolyploids. Proceedings of the National Academy of Sciences, USA 106: He G, Zhu X, Elling AA, Chen L, Wang X, Guo L, Liang M, He H, Zhang H, Chen F et al Global epigenetic and transcriptional trends among two rice subspecies and their reciprocal hybrids. Plant Cell 22: Henderson IR, Jacobsen SE Epigenetic inheritance in plants. Nature 447: Innan H, Kondrashov F The evolution of gene duplications: classifying and distinguishing between models. Nature Reviews Genetics 11: International Rice Genome Sequencing Project The map-based sequence of the rice genome. Nature 436:

Jiao Y, Deng XW A genome-wide transcriptional activity survey of rice transposable element-related genes. Genome Biology 8: R28. Jones M, Wagner R, Radman M Mismatch repair of deaminated 5-methyl-cytosine. Journal of Molecular Biology 194: Kolasinska-Zwierz P, Down T, Latorre I, Liu T, Liu XS, Ahringer J Differential chromatin marking of introns and expressed exons by H3K36me3. Nature Genetics 41: Lee TF, Zhai J, Meyers BC Conservation and divergence in eukaryotic DNA methylation. Proceedings of the National Academy of Sciences, USA 107: Li WH Molecular evolution. Sunderland, MA, USA: Sinauer Associates. Li X, Wang X, He K, Ma Y, Su N, He H, Stolc V, Tongprasit W, Jin W, Jiang J et al High-resolution mapping of epigenetic modifications of the rice genome uncovers interplay between DNA methylation, histone methylation, and gene expression. Plant Cell 20: Li Z, Zhang H, Ge S, Gu X, Gao G, Luo J Expression pattern divergence of duplicated genes in rice. BMC Bioinformatics 10(Suppl 6): S8. Lorincz MC, Dickerson DR, Schmitt M, Groudine M Intragenic DNA methylation alters chromatin structure and elongation efficiency in mammalian cells. Nature Structural & Molecular Biology 11: Luco RF, Pan Q, Tominaga K, Blencowe BJ, Pereira-Smith OM, Misteli T Regulation of alternative splicing by histone modifications. Science 327: Lynch M, Conery JS The evolutionary fate and consequences of duplicate genes. Science 290: Maere S, De Bodt S, Raes J, Casneuf T, Van Montagu M, Kuiper M, Van de Peer Y Modeling gene and genome duplications in eukaryotes. Proceedings of the National Academy of Sciences, USA 102: Maunakea AK, Nagarajan RP, Bilenky M, Ballinger TJ, D'Souza C, Fouse SD, Johnson BE, Hong C, Nielsen C, Zhao Y et al Conserved role of intragenic DNA methylation in regulating alternative promoters. Nature 466: Nei M, Gojobori T Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. 
Molecular Biology and Evolution 3: Ohno S Evolution by gene duplication. New York, NY, USA: Springer. Paterson AH, Bowers JE, Chapman BA Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proceedings of the National Academy of Sciences, USA 101: Pfeifer GP Mutagenesis at methylated CpG sequences. DNA Methylation: Basic Mechanisms 301: Sarda S, Zeng J, Hunt BG, Yi SV The evolution of invertebrate gene body methylation. Molecular Biology and Evolution 29: Schwartz S, Meshorer E, Ast G Chromatin organization marks exon–intron structure. Nature Structural & Molecular Biology 16: Su Z, Han L, Zhao Z Conservation and divergence of DNA methylation in eukaryotes: new insights from single base-resolution DNA methylomes. Epigenetics 6: Takuno S, Gaut BS Body-methylated genes in Arabidopsis thaliana are functionally important and evolve slowly. Molecular Biology and Evolution 29: Tang H, Bowers JE, Wang X, Ming R, Alam M, Paterson AH. 2008a. Synteny and collinearity in plant genomes. Science 320: Tang H, Bowers JE, Wang X, Paterson AH Angiosperm genome comparisons reveal early polyploidy in the monocot lineage. Proceedings of the National Academy of Sciences, USA 107: Tang H, Wang X, Bowers JE, Ming R, Alam M, Paterson AH. 2008b. Unraveling ancient hexaploidy through multiply-aligned angiosperm gene maps. Genome Research 18: Thompson JD, Higgins DG, Gibson TJ CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22: Tibshirani R, Hastie T Outlier sums for differential gene expression analysis. Biostatistics 8: 2–8. Vining KJ, Pomraning KR, Wilhelm LJ, Priest HD, Pellegrini M, Mockler TC, Freitag M, Strauss SH Dynamic DNA cytosine methylation in the Populus trichocarpa genome: tissue-level variation and relationship to gene expression. BMC Genomics 13: 27. 
Wang X, Tang H, Bowers JE, Feltus FA, Paterson AH Extensive concerted evolution of rice paralogs and the road to regaining independence. Genetics 177: Wang Y, Wang X, Paterson AH Genome and gene duplications and gene expression divergence: a view from plants. Annals of the New York Academy of Sciences 1256: 1–14. Wang Y, Wang X, Tang H, Tan X, Ficklin SP, Feltus FA, Paterson AH Modes of gene duplication contribute differently to genetic novelty and redundancy, but show parallels across divergent angiosperms. PLoS ONE 6: e Woodhouse MR, Pedersen B, Freeling M Transposed genes in Arabidopsis are often associated with flanking repeats. PLoS Genetics 6: e Woodhouse MR, Tang H, Freeling M Different gene families in Arabidopsis thaliana transposed in different epochs and at different frequencies throughout the rosids. Plant Cell 23: Yang Z, Nielsen R Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Molecular Biology and Evolution 17: Yuan Q, Ouyang S, Wang A, Zhu W, Maiti R, Lin H, Hamilton J, Haas B, Sultana R, Cheung F et al The institute for genomic research Osa1 rice genome annotation database. Plant Physiology 138: Zemach A, Kim MY, Silva P, Rodrigues JA, Dotson B, Brooks MD, Zilberman D. 2010a. Local DNA hypomethylation activates genes in rice endosperm. Proceedings of the National Academy of Sciences, USA 107: Zemach A, McDaniel IE, Silva P, Zilberman D. 2010b. Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science 328: Zhang M, Xu C, von Wettstein D, Liu B Tissue-specific differences in cytosine methylation and their association with differential gene expression in sorghum. Plant Physiology 156: Zhang X, Yazaki J, Sundaresan A, Cokus S, Chan SW, Chen H, Henderson IR, Shinn P, Pellegrini M, Jacobsen SE et al Genome-wide high-resolution mapping and functional analysis of DNA methylation in Arabidopsis. 
Cell 126: Zhao XP, Si Y, Hanson RE, Crane CF, Price HJ, Stelly DM, Wendel JF, Paterson AH Dispersed repetitive DNA has spread to new genomes since polyploid formation in cotton. Genome Research 8: Zilberman D, Gehring M, Tran RK, Ballinger T, Henikoff S Genomewide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nature Genetics 39:

Supporting Information

Additional supporting information may be found in the online version of this article.

Fig. S1 Comparison of body methylation levels of all genes between all pairs of tissues.

Table S1 Classification of rice duplicated genes

How to detect paleoploidy?

How to detect paleoploidy? Genome duplications (polyploidy) / ancient genome duplications (paleopolyploidy) How to detect paleoploidy? e.g. a diploid cell undergoes failed meiosis, producing diploid gametes, which selffertilize

More information

Small RNA in rice genome

Small RNA in rice genome Vol. 45 No. 5 SCIENCE IN CHINA (Series C) October 2002 Small RNA in rice genome WANG Kai ( 1, ZHU Xiaopeng ( 2, ZHONG Lan ( 1,3 & CHEN Runsheng ( 1,2 1. Beijing Genomics Institute/Center of Genomics and

More information

Supplemental Data. Perea-Resa et al. Plant Cell. (2012) /tpc

Supplemental Data. Perea-Resa et al. Plant Cell. (2012) /tpc Supplemental Data. Perea-Resa et al. Plant Cell. (22)..5/tpc.2.3697 Sm Sm2 Supplemental Figure. Sequence alignment of Arabidopsis LSM proteins. Alignment of the eleven Arabidopsis LSM proteins. Sm and

More information

Divergence of Gene Body DNA Methylation and Evolution of Plant Duplicate Genes

Divergence of Gene Body DNA Methylation and Evolution of Plant Duplicate Genes Divergence of Gene Body DNA Methylation and Evolution of Plant Duplicate Genes Jun Wang, Nicholas C. Marowsky, Chuanzhu Fan* Department of Biological Sciences, Wayne State University, Detroit, Michigan,

More information

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/8/16

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/8/16 Genome Evolution Outline 1. What: Patterns of Genome Evolution Carol Eunmi Lee Evolution 410 University of Wisconsin 2. Why? Evolution of Genome Complexity and the interaction between Natural Selection

More information

Chapter 15 Active Reading Guide Regulation of Gene Expression

Chapter 15 Active Reading Guide Regulation of Gene Expression Name: AP Biology Mr. Croft Chapter 15 Active Reading Guide Regulation of Gene Expression The overview for Chapter 15 introduces the idea that while all cells of an organism have all genes in the genome,

More information

Evolutionary Genomics and Proteomics

Evolutionary Genomics and Proteomics Evolutionary Genomics and Proteomics Mark Pagel Andrew Pomiankowski Editors Sinauer Associates, Inc. Publishers Sunderland, Massachusetts 01375 Table of Contents Preface xiii Contributors xv CHAPTER 1

More information

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/1/18

Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/1/18 Genome Evolution Outline 1. What: Patterns of Genome Evolution Carol Eunmi Lee Evolution 410 University of Wisconsin 2. Why? Evolution of Genome Complexity and the interaction between Natural Selection

More information

Bio 1B Lecture Outline (please print and bring along) Fall, 2007

Bio 1B Lecture Outline (please print and bring along) Fall, 2007 Bio 1B Lecture Outline (please print and bring along) Fall, 2007 B.D. Mishler, Dept. of Integrative Biology 2-6810, bmishler@berkeley.edu Evolution lecture #5 -- Molecular genetics and molecular evolution

More information

Supplementary text for the section Interactions conserved across species: can one select the conserved interactions?

Supplementary text for the section Interactions conserved across species: can one select the conserved interactions? 1 Supporting Information: What Evidence is There for the Homology of Protein-Protein Interactions? Anna C. F. Lewis, Nick S. Jones, Mason A. Porter, Charlotte M. Deane Supplementary text for the section

More information

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly Comparative Genomics: Human versus chimpanzee 1. Introduction The chimpanzee is the closest living relative to humans. The two species are nearly identical in DNA sequence (>98% identity), yet vastly different

More information

Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law

Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law Divergence Pattern of Duplicate Genes in Protein-Protein Interactions Follows the Power Law Ze Zhang,* Z. W. Luo,* Hirohisa Kishino,à and Mike J. Kearsey *School of Biosciences, University of Birmingham,

More information

Research Article Expression Divergence of Tandemly Arrayed Genes in Human and Mouse

Research Article Expression Divergence of Tandemly Arrayed Genes in Human and Mouse Hindawi Publishing Corporation Comparative and Functional Genomics Volume 27, Article ID 6964, 8 pages doi:1.1155/27/6964 Research Article Expression Divergence of Tandemly Arrayed Genes in Human and Mouse

More information

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION Supplementary information S1 (box). Supplementary Methods description. Prokaryotic Genome Database Archaeal and bacterial genome sequences were downloaded from the NCBI FTP site (ftp://ftp.ncbi.nlm.nih.gov/genomes/all/)

More information

Lecture Notes: BIOL2007 Molecular Evolution

Lecture Notes: BIOL2007 Molecular Evolution Lecture Notes: BIOL2007 Molecular Evolution Kanchon Dasmahapatra (k.dasmahapatra@ucl.ac.uk) Introduction By now we all are familiar and understand, or think we understand, how evolution works on traits

More information

Comparative genomics: Overview & Tools + MUMmer algorithm

Comparative genomics: Overview & Tools + MUMmer algorithm Comparative genomics: Overview & Tools + MUMmer algorithm Urmila Kulkarni-Kale Bioinformatics Centre University of Pune, Pune 411 007. urmila@bioinfo.ernet.in Genome sequence: Fact file 1995: The first

More information

Supplementary Information for: The genome of the extremophile crucifer Thellungiella parvula

Supplementary Information for: The genome of the extremophile crucifer Thellungiella parvula Supplementary Information for: The genome of the extremophile crucifer Thellungiella parvula Maheshi Dassanayake 1,9, Dong-Ha Oh 1,9, Jeffrey S. Haas 1,2, Alvaro Hernandez 3, Hyewon Hong 1,4, Shahjahan

More information

Sequence Alignment Techniques and Their Uses

Sequence Alignment Techniques and Their Uses Sequence Alignment Techniques and Their Uses Sarah Fiorentino Since rapid sequencing technology and whole genomes sequencing, the amount of sequence information has grown exponentially. With all of this

More information

Evolutionary model for the statistical divergence of paralogous and orthologous gene pairs generated by whole genome duplication and speciation

Evolutionary model for the statistical divergence of paralogous and orthologous gene pairs generated by whole genome duplication and speciation Zhang et al. RESEARCH Evolutionary model for the statistical divergence of paralogous and orthologous gene pairs generated by whole genome duplication and speciation Yue Zhang, Chunfang Zheng and David

More information

Lineage specific conserved noncoding sequences in plants

Lineage specific conserved noncoding sequences in plants Lineage specific conserved noncoding sequences in plants Nilmini Hettiarachchi Department of Genetics, SOKENDAI National Institute of Genetics, Mishima, Japan 20 th June 2014 Conserved Noncoding Sequences

More information

Molecular evolution - Part 1. Pawan Dhar BII

Molecular evolution - Part 1. Pawan Dhar BII Molecular evolution - Part 1 Pawan Dhar BII Theodosius Dobzhansky Nothing in biology makes sense except in the light of evolution Age of life on earth: 3.85 billion years Formation of planet: 4.5 billion

More information

BLAST. Varieties of BLAST

BLAST. Varieties of BLAST BLAST Basic Local Alignment Search Tool (1990) Altschul, Gish, Miller, Myers, & Lipman Uses short-cuts or heuristics to improve search speed Like speed-reading, does not examine every nucleotide of database

More information

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11 The Eukaryotic Genome and Its Expression Lecture Series 11 The Eukaryotic Genome and Its Expression A. The Eukaryotic Genome B. Repetitive Sequences (rem: teleomeres) C. The Structures of Protein-Coding

More information

FUNDAMENTALS OF MOLECULAR EVOLUTION

FUNDAMENTALS OF MOLECULAR EVOLUTION FUNDAMENTALS OF MOLECULAR EVOLUTION Second Edition Dan Graur TELAVIV UNIVERSITY Wen-Hsiung Li UNIVERSITY OF CHICAGO SINAUER ASSOCIATES, INC., Publishers Sunderland, Massachusetts Contents Preface xiii

More information

GENES ENCODING FLOWER- AND ROOT-SPECIFIC FUNCTIONS ARE MORE RESISTANT TO FRACTIONATION THAN GLOBALLY EXPRESSED GENES IN BRASSICA RAPA.

GENES ENCODING FLOWER- AND ROOT-SPECIFIC FUNCTIONS ARE MORE RESISTANT TO FRACTIONATION THAN GLOBALLY EXPRESSED GENES IN BRASSICA RAPA. GENES ENCODING FLOWER- AND ROOT-SPECIFIC FUNCTIONS ARE MORE RESISTANT TO FRACTIONATION THAN GLOBALLY EXPRESSED GENES IN BRASSICA RAPA A Thesis presented to the Faculty of California Polytechnic State University,

More information

Potato Genome Analysis

Potato Genome Analysis Potato Genome Analysis Xin Liu Deputy director BGI research 2016.1.21 WCRTC 2016 @ Nanning Reference genome construction???????????????????????????????????????? Sequencing HELL RIEND WELCOME BGI ZHEN LLOFRI

More information

Stage 1: Karyotype Stage 2: Gene content & order Step 3

Stage 1: Karyotype Stage 2: Gene content & order Step 3 Supplementary Figure Method used for ancestral genome reconstruction. MRCA (Most Recent Common Ancestor), AMK (Ancestral Monocot Karyotype), AEK (Ancestral Eudicot Karyotype), AGK (Ancestral Grass Karyotype)

More information

Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM)

Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Bioinformatics II Probability and Statistics Universität Zürich and ETH Zürich Spring Semester 2009 Lecture 4: Evolutionary Models and Substitution Matrices (PAM and BLOSUM) Dr Fraser Daly adapted from

More information

Processes of Evolution

Processes of Evolution 15 Processes of Evolution Forces of Evolution Concept 15.4 Selection Can Be Stabilizing, Directional, or Disruptive Natural selection can act on quantitative traits in three ways: Stabilizing selection

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Mathangi Thiagarajan Rice Genome Annotation Workshop May 23rd, 2007

Mathangi Thiagarajan Rice Genome Annotation Workshop May 23rd, 2007 -2 Transcript Alignment Assembly and Automated Gene Structure Improvements Using PASA-2 Mathangi Thiagarajan mathangi@jcvi.org Rice Genome Annotation Workshop May 23rd, 2007 About PASA PASA is an open

More information

Computational approaches for functional genomics

Computational approaches for functional genomics Computational approaches for functional genomics Kalin Vetsigian October 31, 2001 The rapidly increasing number of completely sequenced genomes have stimulated the development of new methods for finding

More information

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot

More information

Genome-wide analysis of the MYB transcription factor superfamily in soybean

Genome-wide analysis of the MYB transcription factor superfamily in soybean Du et al. BMC Plant Biology 2012, 12:106 RESEARCH ARTICLE Open Access Genome-wide analysis of the MYB transcription factor superfamily in soybean Hai Du 1,2,3, Si-Si Yang 1,2, Zhe Liang 4, Bo-Run Feng

More information

Heterosis and inbreeding depression of epigenetic Arabidopsis hybrids

Heterosis and inbreeding depression of epigenetic Arabidopsis hybrids Heterosis and inbreeding depression of epigenetic Arabidopsis hybrids Plant growth conditions The soil was a 1:1 v/v mixture of loamy soil and organic compost. Initial soil water content was determined

More information

Bio 119 Bacterial Genomics 6/26/10

Bio 119 Bacterial Genomics 6/26/10 BACTERIAL GENOMICS Reading in BOM-12: Sec. 11.1 Genetic Map of the E. coli Chromosome p. 279 Sec. 13.2 Prokaryotic Genomes: Sizes and ORF Contents p. 344 Sec. 13.3 Prokaryotic Genomes: Bioinformatic Analysis

More information

Introduction to Molecular and Cell Biology

Introduction to Molecular and Cell Biology Introduction to Molecular and Cell Biology Molecular biology seeks to understand the physical and chemical basis of life. and helps us answer the following? What is the molecular basis of disease? What

More information

Computational Biology: Basics & Interesting Problems

Computational Biology: Basics & Interesting Problems Computational Biology: Basics & Interesting Problems Summary Sources of information Biological concepts: structure & terminology Sequencing Gene finding Protein structure prediction Sources of information

More information

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME MATHEMATICAL MODELING AND THE HUMAN GENOME Hilary S. Booth Australian National University, Australia Keywords: Human genome, DNA, bioinformatics, sequence analysis, evolution. Contents 1. Introduction:

More information

2012 Univ Aguilera Lecture. Introduction to Molecular and Cell Biology

2012 Univ Aguilera Lecture. Introduction to Molecular and Cell Biology 2012 Univ. 1301 Aguilera Lecture Introduction to Molecular and Cell Biology Molecular biology seeks to understand the physical and chemical basis of life. and helps us answer the following? What is the

More information

PLNT2530 (2018) Unit 5 Genomes: Organization and Comparisons

PLNT2530 (2018) Unit 5 Genomes: Organization and Comparisons PLNT2530 (2018) Unit 5 Genomes: Organization and Comparisons Unless otherwise cited or referenced, all content of this presenataion is licensed under the Creative Commons License Attribution Share-Alike

More information

Evolution and Epigenetics. Seminar: Social, Cognitive and Affective Neuroscience Speaker: Wolf-R. Brockhaus

Evolution and Epigenetics. Seminar: Social, Cognitive and Affective Neuroscience Speaker: Wolf-R. Brockhaus Evolution and Epigenetics Seminar: Social, Cognitive and Affective Neuroscience Speaker: Wolf-R. Brockhaus 1. History of evolutionary theory The history of evolutionary theory ~ 1800: Lamarck 1859: Darwin's

More information

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

More information

Molecular Evolution & the Origin of Variation

Molecular Evolution & the Origin of Variation Molecular Evolution & the Origin of Variation What Is Molecular Evolution? Molecular evolution differs from phenotypic evolution in that mutations and genetic drift are much more important determinants

More information

Molecular Evolution & the Origin of Variation

Molecular Evolution & the Origin of Variation Molecular Evolution & the Origin of Variation What Is Molecular Evolution? Molecular evolution differs from phenotypic evolution in that mutations and genetic drift are much more important determinants

More information

Genome-wide Identification of Lineage Specific Genes in Arabidopsis, Oryza and Populus
