Gene Conversion Drives Within Genic Sequences: Concerted Evolution of Ribosomal RNA Genes in Bacteria and Archaea


J Mol Evol (2000) 51: DOI: /s Springer-Verlag New York Inc

Daiqing Liao

Department of Microbiology and Infectious Diseases, Faculty of Medicine, Université de Sherbrooke, Sherbrooke, e Avenue Nord, Québec, J1H 5N4 Canada

Received: 31 March 2000 / Accepted: 15 June 2000

Abstract. Multiple copies of a given ribosomal RNA gene family undergo concerted evolution such that sequences of all gene copies are virtually identical within a species although they diverge normally between species. In eukaryotes, gene conversion and unequal crossing over are the proposed mechanisms for concerted evolution of tandemly repeated sequences, whereas dispersed genes are homogenized by gene conversion. However, the homogenization mechanisms for multiple-copy, normally dispersed, prokaryotic rRNA genes are not well understood. Here we compared the sequences of multiple paralogous rRNA genes within a genome in 12 prokaryotic organisms that have multiple copies of the rRNA genes. Within a genome, putative sequence conversion tracts were found throughout the entire length of each individual rRNA gene and its immediate flanks. Individual conversion events convert only a short sequence tract, and the conversion partners can be any paralogous genes within the genome. Interestingly, the genic sequences undergo much slower divergence than their flanking sequences. Moreover, genomic context and operon organization do not affect rRNA gene homogenization. Thus, gene conversion underlies concerted evolution of bacterial rRNA genes, which normally occurs within genic sequences, and homogenization of flanking regions may result from co-conversion with the genic sequence.

Key words: Concerted evolution; Multigene families; rRNA; Gene conversion; Bacteria and Archaea; Comparative genomics

Introduction

Multigene families exist in virtually all living organisms. 
Multigene families usually encode abundant molecules required for housekeeping functions. Genetic analysis reveals that genes of a multigene family exhibit extraordinary sequence homogeneity within a species, despite a normal level of divergence between orthologous genes in different species. These observations indicate that members of a multigene family evolve in a concerted fashion (Elder and Turner 1995; Liao 1999). Concerted evolution has been studied extensively for tandemly repeated multigene families in eukaryotes. In fact, the phenomenon of concerted evolution was first described for the tandemly arrayed ribosomal genes in Xenopus (Brown et al. 1972). Subsequent studies on rDNA arrays in various eukaryotes have provided major insights into the mechanisms of concerted evolution (Arnheim et al. 1980; Coen et al. 1982; Elder and Turner 1995; Hillis et al. 1991; Schlötterer and Tautz 1994; Seperack et al. 1988). These studies, along with theoretical modeling, led to three major models that can account for sequence homogenization of tandemly repeated genes: unequal crossing over (Ohta 1976; Smith 1976; Szostak and Wu 1980), gene conversion (Dover 1982; Edelman and Gally 1970; Gangloff et al. 1996; Hillis et al. 1991; Nagylaki and Petes 1982; Ohta and Dover 1983), and gene amplification (Weiner and Denison 1983). The relative roles of these mechanisms in concerted evolution of large tandem arrays, such as rDNA and alphoid satellites, are difficult to assess because the experimental analysis of these arrays

with essentially identical repeats has proven to be very challenging. Although the eukaryotic rRNA genes are the best-studied examples of concerted evolution, their high copy number (generally over 100; in some cases >1000) and the large copy size preclude exhaustive study (Elder and Turner 1995; Hillis and Dixon 1991). For example, the human rRNA genes have large repeat units (about 43 kb) that are arranged in large tandem arrays (about 100 repeats), and the 500 rRNA genes are distributed among five nonsyntenic arrays (nucleolus organizers) (Gonzalez and Sylvester 1995; Sakai et al. 1995; Seperack et al. 1988). Even in a lower eukaryote, the yeast Saccharomyces cerevisiae, there are rDNA repeats of 9.1 kb organized in a single tandem array (Warner 1989). Therefore, comprehensive analyses of multigene families with smaller gene size and fewer repeats would be desirable to gain a clearer understanding of the mode and extent of concerted evolution. The prokaryotic rRNA gene families may represent such a simpler system. There are multiple copies (generally <10) of the rRNA genes in most species of prokaryotes, probably because of the high metabolic demand for protein synthesis in a growing cell, as multigene families are generally rare in prokaryotes (Bacteria and Archaea).

Fig. 1. Genomic distribution of the prokaryotic rRNA genes. Four representative prokaryotic genomes are illustrated. The size (kb) of the entire genome of each microorganism is indicated in parentheses following the name of each species. The nucleotide coordinates of the rRNA genes were taken from the RNA gene table in Entrez Genomes. The rRNA gene cluster is depicted as an arrow, and the distance (kb) between each cluster is indicated. For comparison, the tandem-arrayed organization of the rRNA genes (rDNA array) in two eukaryotic species is also shown. Note that the figure is not drawn to scale.

The genes for the three rRNA molecules (23S, 16S, and 5S) found in the ribosome are generally linked together and cotranscribed in a single operon in prokaryotes. The typical length of these three rRNA genes in prokaryotic organisms is 2900 bp (23S), 1500 bp (16S), and 120 bp (5S), and their sizes as well as sequences are generally well conserved between different prokaryotic species. Compared to the tandem-arrayed organization of eukaryotic rRNA genes, multiple rRNA operons are generally dispersed throughout a prokaryotic genome (Fig. 1). Like some dispersed multigene families in several eukaryotic species (Amstutz et al. 1985; Morzycka-Wroblewska et al. 1985), multiple-copy rRNA genes in a prokaryotic organism also undergo concerted evolution. Though there is no a priori reason to believe that different DNA recombination mechanisms might operate for concerted evolution in prokaryotes, a systematic analysis of multigene families within a simple genome might not only reveal the mechanism responsible for sequence homogenization but also provide insights into the potential influence of genomic context on concerted evolution. Such systematic analyses have become possible with the advent of complete genome sequences of many species. To date, complete genome sequences are available for 25 prokaryotic species as well as several eukaryotic species, and the number is increasing rapidly. The availability of DNA sequences for all members of a given multigene family provides a unique opportunity to analyze the molecular mechanisms underlying concerted

evolution. In this study, I have analyzed the sequences of 23S, 16S, and 5S rRNA genes and their immediate flanking sequences in 19 completely sequenced genomes. Striking sequence homogeneity of each individual rRNA gene family within a species is found, and homogenization of rRNA genes most probably results from frequent gene conversion. Most interestingly, genic sequences undergo much slower divergence than their flanking nongenic sequences, suggesting that homogenization most frequently takes place within the genic regions. Thus, the genic sequences themselves may be the sole genetic element responsible for their homogenization. Flanking sequence and genomic context of the rRNA genes play little if any role in concerted evolution of the prokaryotic rRNA genes.

Materials and Methods

The DNA sequences for the 16S, 23S, and 5S rRNA genes and their flanking sequences were obtained from Entrez Genomes in GenBank. The rRNA gene sequences in these organisms were analyzed in this study: Aquifex aeolicus, Archaeoglobus fulgidus, Bacillus subtilis, Borrelia burgdorferi, Chlamydia trachomatis, Chlamydia pneumoniae, Escherichia coli, Haemophilus influenzae, Helicobacter pylori strain 26695, Helicobacter pylori strain J99, Methanobacterium thermoautotrophicum, Methanococcus jannaschii, Mycobacterium tuberculosis, Mycoplasma genitalium, Mycoplasma pneumoniae, Pyrococcus horikoshii, Rickettsia prowazekii, Synechocystis PCC6803, and Treponema pallidum. BLAST searches were used to define the gene sequences when the annotation did not indicate specific boundaries of the rRNA genes, or when the annotated coordinates contained obvious errors. Multiple sequence alignment was carried out using CLUSTAL W as implemented in MacVector 6.1 or at the Web site of the Baylor College of Medicine Search Launcher. Phylogenetic analyses were performed using the DNAML and NEIGHBOR programs in the PHYLIP 3.5c package (Felsenstein 1993) or the PAUP program (Swofford 1993). 
Distances were computed using the DNADIST program in the PHYLIP 3.5c package (Felsenstein 1993) according to the Kimura two-parameter model with a transition/transversion ratio of 2.0. Insertions/deletions (indels) in multiple sequence alignments were discounted in parsimony analyses and distance calculations. The neighbor-joining method was used to construct the phylogenetic tree shown in Fig. 2.

Results

Striking Within-Species Sequence Homogeneity of the rRNA Genes in Bacteria and Archaea

Thus far, 25 bacterial and archaeal genomes have been completely sequenced. Among them, 6 are Archaea and 19 are Bacteria. Of the 19 genomes (both bacterial and archaeal) surveyed in this study (see Table 1), most contain multiple-copy (paralogous) rRNA genes, but 7 species carry only one copy of each rRNA gene. The genes for 23S, 16S, and 5S are normally organized in a single operon in the order 16S-23S-5S (Table 1) and are co-transcribed. Notably, however, the 16S-23S-5S operon structure has been broken in several species, where either the 16S or the 5S rRNA genes are no longer linked to the operon and have thus become orphaned in other locales within the genome. Interestingly, no stand-alone 23S genes are found in these genomes. The rRNA operons are generally dispersed throughout the genome (see Fig. 1), although a tandemly arrayed structure was found for several operons in B. subtilis and for the two 23S-5S operons in B. burgdorferi. The sequences of rRNA genes are extremely homogeneous within a genome. No sequence heterogeneity was detected for multiple-copy 23S, 16S, or 5S genes in A. aeolicus, C. trachomatis, H. influenzae, H. pylori (strain J99), M. thermoautotrophicum, and Synechocystis PCC6803. Five of these six species have only two rRNA operons, whereas there are six operons in H. influenzae. Thus, the number of genes in a genome may not affect the degree of homogenization. Indeed, there are 10 and 7 rRNA operons in B. subtilis and E. 
coli, respectively; again, the rRNA genes in these two species display remarkable sequence homogeneity. Figure 2 shows a phylogenetic tree based on 16S rRNA gene sequences from the 19 completely sequenced genomes listed in Table 1. The horizontal branch lengths reflect the degree of sequence divergence. To facilitate visual observation, all genes within a genome are shaded in the tree, and the average sequence divergence (distance) value within a species is also indicated next to the shaded areas. From the branch lengths within a species as well as the divergence values, we can easily see the extraordinary within-species homogeneity. From this tree, we can also easily appreciate that sequence homogeneity within a genome is much greater than that between genomes. For example, the average divergence among the seven E. coli 16S rRNA genes is per site, whereas the average divergence between the 16S rRNA genes in E. coli and its close relative H. influenzae is , which is 24 times greater than the within-species divergence in E. coli. Essentially identical patterns were observed for 23S and 5S rRNA genes (data not shown). These results clearly show that the bacterial and archaeal rRNA genes undergo concerted evolution.

Intriguingly, obvious sequence heterogeneity was observed for the intergenic spacers between 16S and 23S genes in B. subtilis, E. coli, H. influenzae, and T. pallidum. This is mainly due to the presence or absence of tRNA genes or the presence of different tRNA genes in this intergenic region. For example, in E. coli, tRNA-Ala and tRNA-Ile are present in the 16S-23S intergenic spacers in operons rrnA, rrnD, and rrnH, while only tRNA-Glu is found in the corresponding region in operons rrnB, rrnC, rrnE, and rrnG. In B. subtilis, the 16S-23S region contains tRNA-Ile and tRNA-Ala in both the rrnO and rrnA operons, whereas the other eight operons have no tRNA genes in the corresponding region. 
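The within-genome versus between-genome comparison above rests on averaged pairwise distances. As a minimal sketch (not the DNADIST implementation), the following uses the closed-form Kimura two-parameter estimator, which estimates transitions and transversions separately instead of fixing their ratio at 2.0 as in Materials and Methods. The function names and the toy sequences are hypothetical; alignment columns containing gaps are skipped, mirroring the statement that indels were discounted.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Closed-form Kimura two-parameter distance for two aligned sequences.

    Columns with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # indel or ambiguity code: not counted
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites      # proportion of transitional differences
    q = transversions / sites    # proportion of transversional differences
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def mean_pairwise_distance(seqs):
    """Average K2P distance over all pairs of aligned paralogous sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(k2p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignment, not real rRNA data.
    paralogs = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
    print(mean_pairwise_distance(paralogs))
```

Applied to the aligned paralogs of one genome, `mean_pairwise_distance` gives a within-genome value of the kind quoted above; the same distance function applied across two genomes gives the between-species value.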
In addition, other types of changes, including substitutions and insertions/deletions (indels), are also frequent in the intergenic

spacers. The remarkable contrast of homogeneity in the genic sequences to heterogeneity in the intergenic spacers implies that concerted evolution does not reflect gross replacement of one operon with another; rather it is a gradual, region-by-region homogenization process (see below).

Table 1. Ribosomal RNA genes in Bacteria and Archaea

Organism                                 Number of operons   Operon organization
Aquifex aeolicus                         2                   16S-(tRNA-Ala)-23S-5S (both operons)
Archaeoglobus fulgidus                   1                   16S-(tRNA-Ala)-23S
                                                             5S (orphan)
Bacillus subtilis                        10                  16S-(tRNA-Ile-tRNA-Ala)-23S-5S (rrnA and rrnO)
                                                             16S-23S-5S (rrnB, rrnD, rrnE, rrnH, rrnI, rrnJ, rrnG, and rrnW)
                                                             rrnJ-rrnW and rrnI-rrnH-rrnG form two separate tandem clusters
Borrelia burgdorferi (a)                 2                   16S (orphan) (b) ... 23S-5S-23S-5S (tandem duplication of 23S-5S cluster)
Chlamydia trachomatis                    2                   16S-23S-5S (both operons)
Chlamydia pneumoniae                     1                   16S-23S-5S
Escherichia coli                         7                   16S-(tRNA-Ile-tRNA-Ala)-23S-5S (rrnA and rrnH)
                                                             16S-(tRNA-Glu)-23S-5S (rrnB, rrnC, rrnE, and rrnG)
                                                             16S-(tRNA-Ile-tRNA-Ala)-23S-5S-(tRNA-Thr)-5S (rrfF) (rrnD)
Haemophilus influenzae                   6                   16S-(tRNA-Ile-tRNA-Ala)-23S-5S (rrnA, rrnC, and rrnD)
                                                             16S-(tRNA-Glu)-23S-5S (rrnB, rrnE, and rrnF)
Helicobacter pylori (c) strain 26695     2                   23S-5S (both operons)
                                                             16S (orphan, two copies)
Helicobacter pylori strain J99           2                   23S-5S (both operons)
                                                             16S (orphan, two copies)
Methanobacterium thermoautotrophicum     2                   16S-(tRNA-Ala)-23S-5S (both operons)
Methanococcus jannaschii                 2                   16S-(tRNA-Ala)-23S (rrnA)
                                                             16S-(tRNA-Ala)-23S-5S (rrnB)
                                                             5S (orphan)
Mycobacterium tuberculosis               1                   16S-23S-5S
Mycoplasma genitalium                    1                   16S-23S-5S
Mycoplasma pneumoniae                    1                   16S-23S-5S
Pyrococcus horikoshii                    1                   16S-(tRNA-Ala)-23S
                                                             5S (orphan)
Rickettsia prowazekii                    1                   23S-5S
                                                             16S (orphan)
Synechocystis PCC6803                    2                   16S-(tRNA-Ile)-23S-5S (both operons)
Treponema pallidum                       2                   16S-(tRNA-Ile)-23S-5S (rrnA)
                                                             16S-(tRNA-Ala)-23S-5S (rrnB)

(a) Linear chromosome.
(b) The 16S gene is located 3 kb 5' to the first 23S gene.
(c) A truncated orphan 5S is also present in the genome.
GenBank accession numbers for these complete genomes are: AE (A. aeolicus), AE (A. fulgidus), AL (B. subtilis), AE (B. burgdorferi), AE (C. trachomatis), AE (C. pneumoniae), U00096 (E. coli), L42023 (H. influenzae), AE (H. pylori 26695), AE (H. pylori J99), AE (M. thermoautotrophicum), L77117 (M. jannaschii), AL (M. tuberculosis), L43967 (M. genitalium), U00089 (M. pneumoniae), AP to AP (P. horikoshii), AJ (R. prowazekii), AB (Synechocystis PCC6803), AE (T. pallidum).

Patchy Gene Conversion Is Responsible for the Homogenization of the rRNA Genes

Occasional sequence heterogeneity is found among the multi-copy rRNA genes within the genome in several species; this may prove informative for deciphering the molecular mechanisms responsible for rRNA gene homogenization. Indeed, numerous informative sites are present in the rRNA genes in E. coli, B. subtilis, and several other organisms. Figure 3 illustrates the multiple sequence alignment within a few segments of the E. coli 16S genes (rrs), 23S genes (rrl), and 16S-23S intergenic spacers. Several sequence conversion events have evidently occurred between rrsC, rrsD, and rrsG in the illustrated regions (Fig. 3A). Because these sequence tracts share multiple substitutions, they must result from sequence conversion and not from multiple independent mutations (convergent evolution). Likewise, apparent conversion events were observed in several regions of the 23S rRNA genes and the 16S-23S intergenic spacers, as depicted in panels B and C of Fig. 3, as well as in the rRNA genes and intergenic spacers in B. subtilis (data not shown). Individual conversion tracts appear to be short, apparently <500 bp (short conversion tracts were also observed in other organisms, see Discussion). For example, although the two segments shown in Fig. 3A are very

close to each other (only 150 bp apart), the conversion between rrsD and rrsG in the two regions may not be due to a single event, because there are three nucleotide differences between rrsG and rrsD in the sequence between the two illustrated regions (from position 98 to position 245). Moreover, there is no evidence of recent sequence conversion between the sequences 5' to rrsD and rrsG. In fact, rrsD and rrsG differ significantly from each other in the 5' flanking sequences (Fig. 3D).

Fig. 2. A phylogenetic tree based on all 16S rRNA gene sequences from 19 completely sequenced genomes. The DNA sequences of 16S rRNA genes were aligned using CLUSTAL W and the distance values (number of substitutions per site) were computed using the DNADIST program (Felsenstein 1993) according to the Kimura two-parameter model with a transition to transversion ratio of 2.0. The tree was constructed using the neighbor-joining method as implemented in the program NEIGHBOR in the PHYLIP package (Felsenstein 1993). The paralogous genes within a genome are shaded and the average distance among them is indicated next to the shaded area. The species names are abbreviated: Aae: A. aeolicus, Afu: A. fulgidus, Bsu: B. subtilis, Bbu: B. burgdorferi, Ctr: C. trachomatis, Cpn: C. pneumoniae, Eco: E. coli, Hin: H. influenzae, Hpy: H. pylori 26695, Hpy-J99: H. pylori J99, Mth: M. thermoautotrophicum, Mja: M. jannaschii, Mtu: M. tuberculosis, Mge: M. genitalium, Mpn: M. pneumoniae, Pho: P. horikoshii, Rpr: R. prowazekii, Syn: Synechocystis PCC6803, Tpa: T. pallidum. The multiple sequence alignment used for constructing this tree is available on request.

Also, several independent gene conversion events that convert short sequence stretches are evident in other regions between operons rrnD and rrnG. One can see a putative conversion tract in the intergenic spacers between the 16S and 23S genes of rrnD and rrnG around position 430,

depicted in Fig. 3C. This short sequence stretch of only 18 bp contains two deletion tracts of 5 and 4 bp, and two tracts of substitutions of 2 bp, that are shared between rrnD and rrnG. However, the overall sequence similarity in the intergenic spacers between the two operons is minimal, especially between positions 60 and 106, as shown in Fig. 3C. Furthermore, rrnD has two tRNA genes (tRNA-Ile and tRNA-Ala), whereas rrnG has only one tRNA gene (tRNA-Glu) in the 16S-23S intergenic regions. Similarly, gene conversion may also have taken place between positions 2795 and 2815 of rrlD and rrlG (Fig. 3B); again, this tract appears to be short because rrlD suffered several rare single-nucleotide deletions in the region close to the conversion tract depicted in Fig. 3B (the first deletion at position 2714, just 82 bp 5' to position 2795). Thus, we conclude that sequence conversion between paralogous rRNA genes is patchy and occurs in discontinuous tracts throughout the genic regions as well as in the immediate flanking sequences in multiple conversion events.

Rapid Divergence of Sequences Flanking rRNA Genes

Sequences immediately flanking paralogous rRNA genes in a genome are also highly homogeneous, suggesting that they are also subject to homogenization. For example, the sequences of about 150 bp immediately 5' to the 16S rRNA genes among the seven operons in E. coli are virtually identical (with only two transitions). I note, however, that the flanking sequences generally accumulate more substitutions than the genic sequences. Furthermore, the greater the distance between the flanking sequence and the genic region, the more substitutions we see in the flanking regions. In E. coli, although the 150-bp segments immediately flanking the 16S rRNA gene are homogeneous in all operons (the black-filled rectangle in Fig. 3D), the sequences further upstream are quite heterogeneous. 
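The observation that substitutions accumulate with distance from the genic region can be checked with a simple windowed count. The sketch below is an illustration only, not the procedure used in this study: it assumes two pre-aligned flanking sequences ordered so that index 0 is the position adjacent to the gene (a 5' flank would need to be reversed first), and it counts a gap against a base as a difference. The function name, window size, and toy sequences are hypothetical.

```python
def mismatch_profile(aln_a, aln_b, window=50):
    """Count differing columns in consecutive windows of a pairwise alignment.

    Both inputs are assumed to start at the gene boundary and extend outward.
    Returns a list of (window_start, number_of_differences) tuples.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0:
        raise ValueError("window must be positive")
    profile = []
    for start in range(0, len(aln_a), window):
        columns = zip(aln_a[start:start + window], aln_b[start:start + window])
        diffs = sum(1 for a, b in columns if a != b)
        profile.append((start, diffs))
    return profile


if __name__ == "__main__":
    # Hypothetical toy flanks: identical near the gene, divergent further out.
    flank_1 = "A" * 10 + "ACGTACGTAC"
    flank_2 = "A" * 10 + "TTTTTTTTTT"
    for start, diffs in mismatch_profile(flank_1, flank_2, window=10):
        print(start, diffs)
```

A profile whose counts rise with the window start would match the pattern described for the sequences upstream of the E. coli 16S genes.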
Operons rrnA, -B, -C, and -G share a homogeneous sequence of about 210 bp further upstream (gray-filled rectangle in Fig. 3D), whereas operons rrnD, -E, and -H share a segment of about 150 bp (crosshatched rectangle in Fig. 3D), which differs significantly from the gray-filled areas, although a reliable alignment between these regions can still be established. Similar situations were also observed in the 16S-23S intergenic spacer (Fig. 3C) and the 3' flanking sequences of rrf genes (Fig. 3E). To quantitatively compare the sequence divergence between genic sequences and flanking regions, we have computed the distances between paralogous sequences for different regions. To avoid overestimating the distance values, we considered only the portions of the flanking sequences for which a reliable multiple sequence alignment can be obtained, which nonetheless might underestimate the real divergence, as this measure might have excluded additional nucleotide changes in the

Fig. 3. Gene conversion tracts between paralogous rRNA operons in the E. coli genome. Segments from different regions of the seven rrn operons were aligned. Shared nucleotides due to sequence conversion in each segment are depicted in white boxes except between positions 60 and 106 in panel C, where accurate alignment is not possible. The positions of the first and last nucleotides, or of a reference position in between, for the illustrated segments are indicated on top. A Selected segments from the 16S rRNA genes (rrs). B Selected segments from the 23S rRNA genes (rrl). C The intergenic spacers between 16S and 23S rRNA genes. Between positions 60 and 106, sequences sharing a high level of identity are differently marked: rrnB and rrnG, shaded; rrnC and rrnE, indicated with asterisks; rrnA, rrnD, and rrnH, boxed. The sequences between positions 106 and 338 contain different tRNA genes and are indicated with a dashed line. D Sequences 5' to the rrs genes. The homogeneous sequences in all operons are depicted with black rectangles; further upstream sequences are not homogeneous, but high identities are found among different operons and marked either with gray rectangles (operons rrnA, rrnB, rrnC, and rrnG) or hatched rectangles (rrnD, rrnE, and rrnH). The approximate length of each segment is indicated. Unrelated flanking sequences are denoted by thick dashed lines. E The 3' flanking sequences of rrf genes. The 5S genes (rrf) are shown as white rectangles. A 5-bp segment immediately 3' to the rrf genes is shared in all operons and shown as black rectangles. Further downstream sequences are not shared by all operons. A segment of about 210 bp is shared by three operons (rrnA, rrnB, and rrnF) and depicted with a gray rectangle. Note that rrnF contains rrfF only, which lies 3' to the rrnD operon, separated by only 135 bp (also see Table 1). Two operons, rrnC and rrnH, share a segment of about 130 bp as depicted in cross-hatched rectangles. Sizes of each shared fragment are indicated. Unrelated flanking sequences are shown as thick dashed lines.

distance calculations. Figure 4 shows bar graphs of average distance values for different segments of the paralogous operons in E. coli (Fig. 4A), B. subtilis (Fig. 4B), and H. pylori (Fig. 4C). We can see clearly that the average distances between genic sequences are much smaller than those between flanking sequences in almost all instances.

Comparison of rRNA operons in two strains of H. pylori indicated that divergence is dramatically accelerated in the regions flanking rRNA genes. H. pylori strains 26695 and J99 were both sequenced completely (Alm et al. 1999; Tomb et al. 1997). In both strains, all three rRNA genes exist in two copies, and the 16S rRNA genes are not linked to the 23S-5S rRNA operons (see Table 1). The genomes of both strains are highly conserved not only in gene sequences but also in overall genomic structure, consistent with a low level of evolutionary divergence between them (Alm et al. 1999). Indeed, the genic sequences of the 5S rRNA genes (129 bp long) are identical, and those of the 16S (1501 bp) and 23S rRNAs (2975 bp) differ in only a few positions between the two strains: there are 12 differences (10 transitions, 2 indels) between the 16S genes and 17 differences (9 transitions, 7 transversions, and 1 indel) between the 23S rRNA genes. We then analyzed the sequences flanking the orphaned 16S genes and those flanking the 23S and 5S genes as well as the 23S-5S intergenic spacers. Remarkably, although the flanking sequences are virtually identical within each strain, they have undergone striking divergence, even in the short 23S-5S intergenic spacers (230 bp) (Fig. 4C). The extreme divergence value between the 5' flanking sequences of the 23S rRNA genes shown in Fig. 
4C is not an exaggeration but might reflect the fact that the homogeneous sequences are extremely long in this region within a strain (over 7 kb of identical sequence preceding the two 23S rRNA genes in strain 26695), and the alignable sequences are also relatively long between the two strains (about 1 kb). Figure 4E shows a multiple sequence alignment of sequences 5' to the 16S rRNA genes in both H. pylori strains. The aligned region of 515 bp contains 24 transitions, 12 transversions, and 8 indels. The ratio of transitions to transversions appears normal. However, the frequent indel events are unexpected, as indels are generally considered very rare events. This pattern of mutations could result from gene conversion between two significantly divergent flanking sequences, as it was shown that recombination between divergent DNA molecules can cause dramatic deletion, insertion, and extension of the conversion tract into the nonhomologous flanking region (Bailis et al. 1992; Belmaaza et al. 1994). Regardless of the mechanisms that generated the remarkable flanking sequence change, the high divergence values of the flanking sequences compared with the low divergence of the genic sequences between the two H. pylori strains must reflect that the gene conversion responsible for homogenization occurs much more frequently within the genic region, as spontaneous mutations arise at approximately equal frequency in both coding and noncoding sequences.

To further illustrate the differential divergence rates for genic and flanking sequences, I compared the flanking sequences of the rRNA genes between E. coli and H. influenzae. I note that the sequences flanking rRNA genes are rarely conserved between species. Nevertheless, short stretches of sequences immediately flanking the rRNA genic regions can still be reliably aligned between E. coli and H. influenzae. The divergence of various regions in the ribosomal operons between the two bacterial genomes is shown in Fig. 4D. 
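Tallies such as the 24 transitions, 12 transversions, and 8 indels reported for the 515-bp H. pylori alignment (a transition to transversion ratio of 2) can be obtained by walking down a pairwise alignment column by column. The following is a minimal sketch under stated assumptions, not the procedure actually used here: the function name and toy sequences are hypothetical, gaps are written as "-", and each run of contiguous gap columns is counted as a single indel event.

```python
from collections import Counter

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_differences(aln_a, aln_b, gap="-"):
    """Tally transitions, transversions, and indel events in a pairwise alignment.

    A run of adjacent gap columns counts as one indel event (an assumption of
    this sketch). Columns with non-ACGT symbols other than the gap are ignored.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(transitions=0, transversions=0, indels=0)
    in_gap = False
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == gap and b == gap:
            continue  # column is a gap in both sequences: uninformative
        if a == gap or b == gap:
            if not in_gap:
                counts["indels"] += 1  # first column of a new gap run
            in_gap = True
            continue
        in_gap = False
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            counts["transitions"] += 1
        else:
            counts["transversions"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy alignment, not the H. pylori sequences.
    tally = classify_differences("ACGT--ACGT", "GCTTAAAC-T")
    print(dict(tally))
    if tally["transversions"]:
        print("ts/tv =", tally["transitions"] / tally["transversions"])
```

Counting a gap run as one event is a design choice: counting every gap column separately would inflate the indel tally for long deletions.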
Again, the flanking sequences exhibit much higher divergence than the genic regions.

Fig. 4. Rapid divergence of sequences flanking rRNA genes. Distance values (number of substitutions per site) were computed as described in Materials and Methods for different segments of the paralogous operons in a genome. For nongenic flanking sequences, only the portions for which a reliable multiple sequence alignment can be obtained were considered for distance calculation. This measure may nonetheless underestimate the real divergence. The vertical bars represent an average of all pairwise distances among paralogous segments: solid bars, divergence among genic sequences; gray bars, divergence of flanking sequences. The prefixes pre- and post- indicate the 5' or 3' flanking sequences of an rRNA gene. A Sequence divergence among seven rrn operons in the E. coli K12 genome. The divergence of the 3' flanking sequences of rrf (post-5S) was based on the alignment of the homologous flanks of rrfA, rrfB, and rrfF (also see Figure 3E). B Sequence divergence among 10 rrn operons in the B. subtilis genome. The divergence of the 3' flanking sequences of rrf (post-5S) was based on the alignment of the homologous flanks of rrfA, rrfG, rrfO, and rrfW. C Sequence divergence between rrn operons in two H. pylori strains (J99 and 26695). The genic and flanking sequences analyzed here are identical within the genome of each strain (see panel E). No divergence was detected in four 5S genes in both strains. The 16S rRNA genes are not linked to the 23S-5S operons, so both flanks of the 16S gene and the 5' flank of the 23S gene were examined. D Comparison of rrn operons between E. coli and H. influenzae. The 3' flanking sequences of rrf genes cannot be aligned between the two species; thus, they were not included in the graph. E Alignment of the 5' flanking sequences of the 16S rRNA genes of H. pylori strains J99 and 26695. Divergent nucleotides are highlighted with gray shade.

Discussion

Concerted evolution of multigene families is a universal phenomenon. It operates not only in complex genomes where multigene families are abundant but also in relatively simple genomes of prokaryotes where multigene families are rare. Here I have analyzed the rRNA gene families in 19 completely sequenced genomes of Bacteria and Archaea. The availability and analysis of the DNA sequences of all members of a multigene family in multiple species are unprecedented and have provided a picture of the extent and mode of concerted evolution of multigene families. This study suggests that gene conversion initiated within the genic regions of the rRNA genes is likely to play the major role in sequence homogenization and hence concerted evolution of dispersed rRNA gene families.

Gene Conversion in Concerted Evolution

Several molecular mechanisms can lead to the apparent sequence conversion observed in the paralogous rRNA genes, including (1) sequence conversion via reverse transcription (RT) of rRNA and subsequent homologous recombination of the cDNA of one particular rRNA molecule with nonallelic rRNA genes in the genome; (2) recombination between nonallelic rRNA genes (via a reciprocal or copy-choice mechanism); and (3) single-stranded DNA invasion of nonallelic genes and subsequent heteroduplex formation. Sequence conversion can ensue from a heteroduplex that might be resolved via DNA repair or through segregation of the unrepaired heteroduplex in the next round of replication. The first two mechanisms are unlikely to be responsible for homogenization of the rRNA genes for a number of reasons. First, although RT-mediated gene conversion appears to occur in S. cerevisiae (Derr and Strathern 1993), reverse transcriptase activity cannot be detected in many different types of cells (Inouye and Inouye 1991), including E. coli K12, from which the complete E. coli genome sequence was derived. Yet in this strain gene conversion among the paralogous rRNA genes examined in this study is quite evident. Thus, RT-mediated recombination may not be involved in homogenization of multigene families. Second, unequal reciprocal recombination can, in principle, account for homogenization of tandemly repeated genes. However, nonconversion recombination mechanisms (reciprocal or via copy-choice) could not satisfactorily explain the remarkable heterogeneity of sequences flanking rRNA genes (see Fig. 3) because one would expect sharp and homogeneous junctions between homogenized sequences and flanking chromosomal DNA if homogenization is primarily achieved through repeated rounds of such recombination. Furthermore, ectopic recombination between repetitive sequences can result in sequence deletion, inversion, or translocation, and such drastic genomic changes lead to genome instability and are often deleterious to the cells.

Several lines of evidence suggest that gene conversion via heteroduplex formation plays a major role in homogenization of dispersed multigene families. First, homogenization of the two paralogous tuf genes in Salmonella typhimurium was suggested to be the result of gene conversion via a RecBCD-dependent mechanism (Abdulkarim and Hughes 1996). In E. coli and Salmonella, the recombination hotspot Chi (5'-GCTGGTGG-3') is implicated in RecBCD-mediated recombination. The Chi element is one of the most abundant repetitive oligomers in the E. coli genome (Blattner et al. 1997). Interestingly, Chi-like sequences are frequently found within the 16S and 23S rRNA genes and their vicinities. For example, the sequence stretch GCTGGCGG near the 5' end of the 16S rRNA gene differs from Chi by only one nucleotide at the fifth position, and this change does not appear to affect Chi function (Schultz et al. 1981). 
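A search for Chi-like stretches of the kind mentioned above can be sketched as a sliding-window comparison against the Chi octamer. This is an illustrative assumption about how such sites might be located, not the method used in this study: the function names and the test string are hypothetical, only the strand that is passed in is scanned, and "Chi-like" is taken to mean at most one mismatch.

```python
CHI = "GCTGGTGG"  # Chi octamer as given in the text, written 5' to 3'


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def chi_like_sites(seq, max_mismatches=1):
    """Return (0-based start, octamer, mismatches) for every Chi-like window.

    Only the given strand is scanned; the reverse complement would have to be
    passed in separately.
    """
    seq = seq.upper()
    k = len(CHI)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = hamming(window, CHI)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


if __name__ == "__main__":
    # Hypothetical fragment containing the GCTGGCGG stretch discussed above.
    for start, octamer, mismatches in chi_like_sites("AAGCTGGCGGTT"):
        print(start, octamer, mismatches)
```

Setting `max_mismatches=0` restricts the scan to exact Chi sites, which gives a baseline against which one-mismatch variants can be compared.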
Moreover, this sequence stretch is conserved in all bacterial 16S rRNA genes. Thus, it appears likely that RecBCD-mediated gene conversion might also be involved in sequence homogenization at the rRNA loci. It should be noted that the RecBCD/Chi system may not operate in all of these species. Nonetheless, a similar recombination machinery may be responsible for concerted evolution in other bacterial species.

Second, in S. cerevisiae, gene conversion via heteroduplex formation also occurs between dispersed multigenes (Nag and Petes 1990). Similarly, genetic exchange between tandemly arrayed rRNA genes in S. cerevisiae is Rad52-dependent and thus may proceed via a gene conversion-like mechanism (Gangloff et al. 1996).

Third, analysis of the RNU2 locus in various human populations reveals that repeats within an individual U2 tandem array are more homogeneous than repeats in different arrays, which could be explained by frequent intrachromosomal recombination, such as unequal sister chromatid exchange (USCE) and/or intrachromatid gene conversion. Interestingly, concerted evolution does not lead to exchange of markers flanking the U2 tandem arrays in homologous chromosomes (Liao et al. 1997). Thus, gene conversion, not unequal crossing over, is responsible for interchromosomal recombination. In addition, detailed sequence analyses of the junction regions flanking the primate RNU2 locus have documented dramatic rearrangements and divergence of the primate RNU2 junctions (Pavelitz et al. 1999). Simple reciprocal exchanges between homologous sequences could not readily explain faster evolution of the junction regions of a multigene family. Indeed, junction rearrangement and divergence of the primate RNU2 locus can be viewed as a consequence of homogenization of U2 tandem arrays by gene conversion (Pavelitz et al. 1999). Likewise, the junction sequence heterogeneity and rapid divergence observed here in the bacterial rRNA loci can also be explained as a consequence of gene conversion.
Gene conversion near the boundaries of the rRNA operons can be viewed as one-sided homologous recombination (Belmaaza and Chartrand 1994). It was shown that exogenous linear DNA carrying one homologous and one nonhomologous segment invades chromosomal DNA, and that strand invasion and subsequent homologous pairing occur within the homologous segment. Sequence divergence as high as 15% between donor and recipient does not seem to interfere with initiation of recombination, as the initial stage of homologous recombination requires only a minimum length of homology (Shen and Huang 1986). Resolution of the ensuing recombination intermediates at the nonhomologous end, however, leads to illegitimate recombination, which frequently causes significant deletion, duplication, and extension of the conversion tract into the adjacent nonhomologous region of the recipient DNA (Belmaaza et al. 1994; Dellaire et al. 1997; Pavelitz et al. 1999). As flanking sequences of the rRNA genes diverge at a much higher rate than the genic sequences (Fig. 4), significant heterogeneity might already have accumulated in the flanks at the time of gene conversion. Thus, one could envision that recombination in the flanking regions further accelerates flanking sequence divergence. The gradual loss of homogeneous flanking sequences as well as accelerated mutation in the junction regions at the rRNA loci (see Fig. 3D and E) are consistent with the one-sided homologous recombination model.

Fourth, the observed short and noncontiguous conversion tracts at the prokaryotic rRNA loci are also consistent with a role for gene conversion in concerted evolution of paralogous rRNA genes, as such patterns of conversion were detected in sequence transfer between the tuf genes in S. typhimurium (Abdulkarim and Hughes 1996). Discontinuous and short tracts of sequence conversion might reflect intrinsic cellular mechanisms that limit the length of conversion tracts. Mismatch

repair enzymes, such as MutL and MutS, are the major barrier to recombination between divergent sequences (Matic et al. 1995). They may block heteroduplex extension, as gene conversion tracts in a mismatch repair-defective strain of S. cerevisiae are significantly longer than those in an isogenic wild-type strain (Chen and Jinks-Robertson 1998, 1999). Additionally, the conversion tracts induced by DSBs in mouse embryonic stem cells are very short; most of them are shorter than 58 bp (Elliott et al. 1998). Further support for gene conversion in concerted evolution comes from studies on rDNA sequences in triploid parthenogenetic lines of lizards. It was found that homogenization of the rDNA arrays always proceeded in the same direction. Thus, biased gene conversion is likely to be responsible for concerted evolution of the rDNA arrays in hybrid lines of lizards (Hillis et al. 1991). Collectively, these observations strongly suggest that gene conversion underlies concerted evolution of the paralogous rRNA genes in prokaryotes.

One might be concerned that the observed sequence conversion tracts in these microbial genomes are due to errors in computer assignment of repetitive sequences within a sequenced genome, as it is possible that one sequenced segment might be assigned to more than one locus. Although the possibility of misassignment of highly repeated sequences cannot be excluded in any whole-genome sequence, this is unlikely to compromise the conclusion, because the most informative conversion tracts are found in the E. coli and B. subtilis genomes, whose rRNA operons had already been sequenced using conventional protocols before the two genomes were completely sequenced. Furthermore, the observed conversion tracts are much shorter than a single sequence read from a typical automated sequencing run; thus, they could not be the result of sequence assembly mistakes.
Differential Conversion Domains

This study found that the flanking sequences of the rRNA operons undergo much faster divergence than the genic regions. This is unexpected if the mechanisms leading to homogenization of all members of a multigene family within a genome operate indiscriminately on both genic and flanking sequences. One can invoke functional constraints or purifying selection on rRNA molecules to explain why genic regions evolve more slowly than the flanking sequences, because mutations in flanking sequences may be inconsequential. The flanking sequences of bacterial ribosomal operons might indeed provide no biological function, because there is apparently no sequence conservation between flanking sequences of the rRNA genes in different species. However, functional constraints alone cannot explain why the genic sequences of the paralogous genes are virtually identical within a genome, because there are obviously regions within the rRNA molecules that can tolerate mutations; otherwise, there would be little divergence between the rRNA genes of different species. In this regard, it is noteworthy that foreign rRNA genes with substantial sequence divergence suffice to support growth of an E. coli strain in which all resident rRNA operons were inactivated (Asai et al. 1999). Thus, small variations or heterogeneity in the rRNA genes do not affect the fitness of an organism, further indicating that purifying selection alone cannot account for the degree of sequence homogeneity of the rRNA genes within a species. One possible explanation is that the gene conversion responsible for homogenization normally occurs within the genic sequence. In this way, concerted evolution can quickly purge or spread mutations arising within the genic regions. The homogenized flanking sequences may merely be the consequence of gene conversion initiated in the genic sequence.
Likewise, this could explain why sequences immediately flanking the genic sequence are more homogeneous than distal flanking sequences of the rRNA genes among the paralogous loci within a genome. It is unknown why gene conversion primarily occurs in the genic sequences. Active transcription at the rRNA operons could potentially increase gene conversion at these loci, as it is well known that transcription can stimulate recombination (Gangloff et al. 1994). However, although the 16S-23S intergenic spacers are cotranscribed with the rRNA genes, remarkable sequence heterogeneity exists in the spacer regions (Fig. 3C), suggesting that factors other than transcription may favor the genic regions as donor or recipient in gene conversion. Whatever mechanism is responsible for the differential conversion between genic sequences and their flanks, these results indicate that the genic regions of the rRNA genes may be the sole cis elements that determine their concerted evolution, and that flanking sequences and genomic context play little, if any, role in this process.

Gene conversion between the paralogous rRNA genes is the most likely mechanism for concerted evolution of multiple-copy prokaryotic rRNA genes in a genome. A less likely alternative that could explain the sequence homogeneity of rRNA genes within a genome is gene duplication and ectopic reintegration into the genome, as suggested in the gene amplification model (Weiner and Denison 1983). Indeed, the six rRNA operons in H. influenzae reside in nonequivalent loci when compared to the locations of the seven E. coli operons (see Fig. 1). Similarly, the relative positions of rRNA operons in the genomes of archaeal species, such as M. jannaschii and M. thermoautotrophicum, are not conserved. However, genomic organization and long-range gene order are rarely conserved in prokaryotic organisms. For example, although a majority of the genes in H. influenzae (85% of the total genome) share significant sequence identity with their homologues in E. coli, there is no evidence for conservation of gene positions within

the genome between the two species (de Rosa and Labedan 1998; Tatusov et al. 1996). In fact, extensive chromosomal rearrangement events have shuffled the order of many genes. Interestingly, although the overall genomic organization is almost completely conserved between H. pylori strains 26695 and J99, and the position of one 23S-5S gene cluster is the same in both strains, the DNA fragment following the 5S gene in the other rRNA operon appears to have suffered rearrangement, as the surrounding genes are different between the two strains. As there is apparent evidence of concerted evolution between the rRNA genes within each strain (see Results), the DNA fragment rearrangement may be a consequence of recent gene conversion during concerted evolution, as discussed above.

In summary, analysis of all paralogous rRNA genes in 19 completely sequenced genomes of Bacteria and Archaea revealed striking patterns of concerted evolution within a genome. Gene conversion appears to play the major role in sequence homogenization. Moreover, gene conversion primarily occurs within the genic regions, and flanking sequences or genomic context of rRNA genes are not required for concerted evolution. Together with studies of concerted evolution in eukaryotic organisms, these results suggest that concerted evolution probably operates through similar mechanisms in all forms of life.

Acknowledgments. This work was supported by Medical Research Council of Canada grant MOP. The author is a Chercheur-Boursier Junior I of the Fonds de la Recherche en Santé du Québec (FRSQ).

References

Abdulkarim F, Hughes D (1996) Homologous recombination between the tuf genes of Salmonella typhimurium.
J Mol Biol 260:
Alm RA, Ling LS, Moir DT, King BL, Brown ED, Doig PC, Smith DR, Noonan B, Guild BC, deJonge BL, Carmel G, Tummino PJ, Caruso A, Uria-Nickelsen M, Mills DM, Ives C, Gibson R, Merberg D, Mills SD, Jiang Q, Taylor DE, Vovis GF, Trust TJ (1999) Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature 397:
Amstutz H, Munz P, Heyer WD, Leupold U, Kohli J (1985) Concerted evolution of tRNA genes: intergenic conversion among three unlinked serine tRNA genes in S. pombe. Cell 40:
Arnheim N, Krystal M, Schmickel R, Wilson G, Ryder O, Zimmer E (1980) Molecular evidence for genetic exchanges among ribosomal genes on nonhomologous chromosomes in man and apes. Proc Natl Acad Sci USA 77:
Asai T, Zaporojets D, Squires C, Squires CL (1999) An Escherichia coli strain with all chromosomal rRNA operons inactivated: complete exchange of rRNA genes between bacteria. Proc Natl Acad Sci USA 96:
Bailis AM, Arthur L, Rothstein R (1992) Genome rearrangement in top3 mutants of Saccharomyces cerevisiae requires a functional RAD1 excision repair gene. Mol Cell Biol 12:
Belmaaza A, Chartrand P (1994) One-sided invasion events in homologous recombination at double-strand breaks. Mutat Res 314:
Belmaaza A, Milot E, Villemure JF, Chartrand P (1994) Interference of DNA sequence divergence with precise recombinational DNA repair in mammalian cells. EMBO J 13:
Blattner FR, Plunkett G III, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, Gregor J, Davis NW, Kirkpatrick HA, Goeden MA, Rose DJ, Mau B, Shao Y (1997) The complete genome sequence of Escherichia coli K-12. Science 277:
Brown DD, Wensink PC, Jordan E (1972) A comparison of the ribosomal DNAs of Xenopus laevis and Xenopus mulleri: the evolution of tandem genes. J Mol Biol 63:57-73
Chen W, Jinks-Robertson S (1998) Mismatch repair proteins regulate heteroduplex formation during mitotic recombination in yeast.
Mol Cell Biol 18:
Chen W, Jinks-Robertson S (1999) The role of the mismatch repair machinery in regulating mitotic and meiotic recombination between diverged sequences in yeast. Genetics 151:
Coen E, Strachan T, Dover G (1982) Dynamics of concerted evolution of ribosomal DNA and histone gene families in the melanogaster species subgroup of Drosophila. J Mol Biol 158:17-35
de Rosa R, Labedan B (1998) The evolutionary relationships between the two bacteria Escherichia coli and Haemophilus influenzae and their putative last common ancestor. Mol Biol Evol 15:17-27
Dellaire G, Lemieux N, Belmaaza A, Chartrand P (1997) Ectopic gene targeting exhibits a bimodal distribution of integration in murine cells, indicating that both intra- and interchromosomal sites are accessible to the targeting vector. Mol Cell Biol 17:
Derr LK, Strathern JN (1993) A role for reverse transcripts in gene conversion. Nature 361:
Dover G (1982) Molecular drive: a cohesive mode of species evolution. Nature 299:
Edelman GM, Gally JA (1970) Arrangement and evolution of eukaryotic genes. In: Schmitt FO (ed) The neurosciences: second study program. New York: Rockefeller University Press, pp
Elder JF Jr, Turner BJ (1995) Concerted evolution of repetitive DNA sequences in eukaryotes. Q Rev Biol 70:
Elliott B, Richardson C, Winderbaum J, Nickoloff JA, Jasin M (1998) Gene conversion tracts from double-strand break repair in mammalian cells. Mol Cell Biol 18:
Felsenstein J (1993) PHYLIP (phylogeny inference package). Department of Genetics, University of Washington, Seattle
Gangloff S, Lieber MR, Rothstein R (1994) Transcription, topoisomerases and recombination. Experientia 50:
Gangloff S, Zou H, Rothstein R (1996) Gene conversion plays the major role in controlling the stability of large tandem repeats in yeast. EMBO J 15:
Gonzalez IL, Sylvester JE (1995) Complete sequence of the 43-kb human ribosomal DNA repeat: analysis of the intergenic spacer.
Genomics 27:
Hillis DM, Dixon MT (1991) Ribosomal DNA: molecular evolution and phylogenetic inference. Q Rev Biol 66:
Hillis DM, Moritz C, Porter CA, Baker RJ (1991) Evidence for biased gene conversion in concerted evolution of ribosomal DNA. Science 251:
Inouye M, Inouye S (1991) msDNA and bacterial reverse transcriptase. Ann Rev Microbiol 45:
Liao D (1999) Concerted evolution: molecular mechanisms and biological implications. Am J Hum Genet 64:24-30
Liao D, Pavelitz T, Kidd JR, Kidd KK, Weiner AM (1997) Concerted evolution of the tandemly repeated genes encoding human U2 snRNA (the RNU2 locus) involves rapid intrachromosomal homogenization and rare interchromosomal gene conversion. EMBO J 16:
Matic I, Rayssiguier C, Radman M (1995) Interspecies gene exchange in bacteria: the role of SOS and mismatch repair systems in evolution of species. Cell 80:
Morzycka-Wroblewska E, Selker EU, Stevens JN, Metzenberg RL (1985) Concerted evolution of dispersed Neurospora crassa 5S

RNA genes: pattern of sequence conservation between allelic and nonallelic genes. Mol Cell Biol 5:46-51
Nag DK, Petes TD (1990) Meiotic recombination between dispersed repeated genes is associated with heteroduplex formation. Mol Cell Biol 10:
Nagylaki T, Petes TD (1982) Intrachromosomal gene conversion and the maintenance of sequence homogeneity among repeated genes. Genetics 100:
Ohta T (1976) Simple model for treating evolution of multigene families. Nature 263:74-76
Ohta T, Dover GA (1983) Population genetics of multigene families that are dispersed into two or more chromosomes. Proc Natl Acad Sci USA 80:
Pavelitz T, Liao D, Weiner AM (1999) Concerted evolution of the tandem array encoding primate U2 snRNA (the RNU2 locus) is accompanied by dramatic remodeling of the junctions with flanking chromosomal sequences. EMBO J 18:
Sakai K, Ohta T, Minoshima S, Kudoh J, Wang Y, de Jong PJ, Shimizu N (1995) Human ribosomal RNA gene cluster: identification of the proximal end containing a novel tandem repeat sequence. Genomics 26:
Schlötterer C, Tautz D (1994) Chromosomal homogeneity of Drosophila ribosomal DNA arrays suggests intrachromosomal exchanges drive concerted evolution. Curr Biol 4:
Schultz DW, Swindle J, Smith GR (1981) Clustering of mutations inactivating a Chi recombinational hotspot. J Mol Biol 146:
Seperack P, Slatkin M, Arnheim N (1988) Linkage disequilibrium in human ribosomal genes: implications for multigene family evolution. Genetics 119:
Shen P, Huang HV (1986) Homologous recombination in Escherichia coli: dependence on substrate length and homology. Genetics 112:
Smith GP (1976) Evolution of repeated DNA sequences by unequal crossover. Science 191:
Swofford DL (1993) PAUP: phylogenetic analysis using parsimony. Smithsonian Institution, Washington, DC
Szostak JW, Wu R (1980) Unequal crossing over in the ribosomal DNA of Saccharomyces cerevisiae.
Nature 284:
Tatusov RL, Mushegian AR, Bork P, Brown NP, Hayes WS, Borodovsky M, Rudd KE, Koonin EV (1996) Metabolism and evolution of Haemophilus influenzae deduced from a whole-genome comparison with Escherichia coli. Curr Biol 6:
Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA, Nelson K, Quackenbush J, Zhou L, Kirkness EF, Peterson S, Loftus B, Richardson D, Dodson R, Khalak HG, Glodek A, McKenney K, Fitzegerald LM, Lee N, Adams MD, Venter JC, et al. (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388:
Warner JR (1989) Synthesis of ribosomes in Saccharomyces cerevisiae. Microbiol Rev 53:
Weiner AM, Denison RA (1983) Either gene amplification or gene conversion may maintain the homogeneity of the multigene family encoding human U1 small nuclear RNA. Cold Spring Harb Symp Quant Biol 47:


More information

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

CHAPTERS 24-25: Evidence for Evolution and Phylogeny CHAPTERS 24-25: Evidence for Evolution and Phylogeny 1. For each of the following, indicate how it is used as evidence of evolution by natural selection or shown as an evolutionary trend: a. Paleontology

More information

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly

Drosophila melanogaster and D. simulans, two fruit fly species that are nearly Comparative Genomics: Human versus chimpanzee 1. Introduction The chimpanzee is the closest living relative to humans. The two species are nearly identical in DNA sequence (>98% identity), yet vastly different

More information

REVIEW SESSION. Wednesday, September 15 5:30 PM SHANTZ 242 E

REVIEW SESSION. Wednesday, September 15 5:30 PM SHANTZ 242 E REVIEW SESSION Wednesday, September 15 5:30 PM SHANTZ 242 E Gene Regulation Gene Regulation Gene expression can be turned on, turned off, turned up or turned down! For example, as test time approaches,

More information

C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution. Exercises #3: Phylogenetics C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

More information

doi: / _25

doi: / _25 Boc, A., P. Legendre and V. Makarenkov. 2013. An efficient algorithm for the detection and classification of horizontal gene transfer events and identification of mosaic genes. Pp. 253-260 in: B. Lausen,

More information

3.B.1 Gene Regulation. Gene regulation results in differential gene expression, leading to cell specialization.

3.B.1 Gene Regulation. Gene regulation results in differential gene expression, leading to cell specialization. 3.B.1 Gene Regulation Gene regulation results in differential gene expression, leading to cell specialization. We will focus on gene regulation in prokaryotes first. Gene regulation accounts for some of

More information

SEQUENCE DIVERGENCE,FUNCTIONAL CONSTRAINT, AND SELECTION IN PROTEIN EVOLUTION

SEQUENCE DIVERGENCE,FUNCTIONAL CONSTRAINT, AND SELECTION IN PROTEIN EVOLUTION Annu. Rev. Genomics Hum. Genet. 2003. 4:213 35 doi: 10.1146/annurev.genom.4.020303.162528 Copyright c 2003 by Annual Reviews. All rights reserved First published online as a Review in Advance on June 4,

More information

Gene expression in prokaryotic and eukaryotic cells, Plasmids: types, maintenance and functions. Mitesh Shrestha

Gene expression in prokaryotic and eukaryotic cells, Plasmids: types, maintenance and functions. Mitesh Shrestha Gene expression in prokaryotic and eukaryotic cells, Plasmids: types, maintenance and functions. Mitesh Shrestha Plasmids 1. Extrachromosomal DNA, usually circular-parasite 2. Usually encode ancillary

More information

Genomics and bioinformatics summary. Finding genes -- computer searches

Genomics and bioinformatics summary. Finding genes -- computer searches Genomics and bioinformatics summary 1. Gene finding: computer searches, cdnas, ESTs, 2. Microarrays 3. Use BLAST to find homologous sequences 4. Multiple sequence alignments (MSAs) 5. Trees quantify sequence

More information

Molecular Evolution & the Origin of Variation

Molecular Evolution & the Origin of Variation Molecular Evolution & the Origin of Variation What Is Molecular Evolution? Molecular evolution differs from phenotypic evolution in that mutations and genetic drift are much more important determinants

More information

Molecular Evolution & the Origin of Variation

Molecular Evolution & the Origin of Variation Molecular Evolution & the Origin of Variation What Is Molecular Evolution? Molecular evolution differs from phenotypic evolution in that mutations and genetic drift are much more important determinants

More information

Eukaryotic vs. Prokaryotic genes

Eukaryotic vs. Prokaryotic genes BIO 5099: Molecular Biology for Computer Scientists (et al) Lecture 18: Eukaryotic genes http://compbio.uchsc.edu/hunter/bio5099 Larry.Hunter@uchsc.edu Eukaryotic vs. Prokaryotic genes Like in prokaryotes,

More information

Chapter 19. Microbial Taxonomy

Chapter 19. Microbial Taxonomy Chapter 19 Microbial Taxonomy 12-17-2008 Taxonomy science of biological classification consists of three separate but interrelated parts classification arrangement of organisms into groups (taxa; s.,taxon)

More information

Vital Statistics Derived from Complete Genome Sequencing (for E. coli MG1655)

Vital Statistics Derived from Complete Genome Sequencing (for E. coli MG1655) We still consider the E. coli genome as a fairly typical bacterial genome, and given the extensive information available about this organism and it's lifestyle, the E. coli genome is a useful point of

More information

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11

The Eukaryotic Genome and Its Expression. The Eukaryotic Genome and Its Expression. A. The Eukaryotic Genome. Lecture Series 11 The Eukaryotic Genome and Its Expression Lecture Series 11 The Eukaryotic Genome and Its Expression A. The Eukaryotic Genome B. Repetitive Sequences (rem: teleomeres) C. The Structures of Protein-Coding

More information

Molecular Drive (Dover)

Molecular Drive (Dover) Molecular Drive (Dover) The nuclear genomes of eukaryotes are subject to a continual turnover through unequal exchange, gene conversion, and DNA transposition. Both stochastic and directional processes

More information

Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes. - Supplementary Information -

Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes. - Supplementary Information - Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes - Supplementary Information - Martin Bartl a, Martin Kötzing a,b, Stefan Schuster c, Pu Li a, Christoph Kaleta b a

More information

Lecture 10: Cyclins, cyclin kinases and cell division

Lecture 10: Cyclins, cyclin kinases and cell division Chem*3560 Lecture 10: Cyclins, cyclin kinases and cell division The eukaryotic cell cycle Actively growing mammalian cells divide roughly every 24 hours, and follow a precise sequence of events know as

More information

Molecular evolution - Part 1. Pawan Dhar BII

Molecular evolution - Part 1. Pawan Dhar BII Molecular evolution - Part 1 Pawan Dhar BII Theodosius Dobzhansky Nothing in biology makes sense except in the light of evolution Age of life on earth: 3.85 billion years Formation of planet: 4.5 billion

More information

Assessing evolutionary relationships among microbes from whole-genome analysis Jonathan A Eisen

Assessing evolutionary relationships among microbes from whole-genome analysis Jonathan A Eisen 475 Assessing evolutionary relationships among microbes from whole-genome analysis Jonathan A Eisen The determination and analysis of complete genome sequences have recently enabled many major advances

More information

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: m Eukaryotic mrna processing Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus: Cap structure a modified guanine base is added to the 5 end. Poly-A tail

More information

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME MATHEMATICAL MODELING AND THE HUMAN GENOME Hilary S. Booth Australian National University, Australia Keywords: Human genome, DNA, bioinformatics, sequence analysis, evolution. Contents 1. Introduction:

More information

Lecture Notes: BIOL2007 Molecular Evolution

Lecture Notes: BIOL2007 Molecular Evolution Lecture Notes: BIOL2007 Molecular Evolution Kanchon Dasmahapatra (k.dasmahapatra@ucl.ac.uk) Introduction By now we all are familiar and understand, or think we understand, how evolution works on traits

More information

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega

08/21/2017 BLAST. Multiple Sequence Alignments: Clustal Omega BLAST Multiple Sequence Alignments: Clustal Omega What does basic BLAST do (e.g. what is input sequence and how does BLAST look for matches?) Susan Parrish McDaniel College Multiple Sequence Alignments

More information

Computational methods for predicting protein-protein interactions

Computational methods for predicting protein-protein interactions Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational

More information

Regulation of Gene Expression

Regulation of Gene Expression Chapter 18 Regulation of Gene Expression Edited by Shawn Lester PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley

More information

A. Incorrect! In the binomial naming convention the Kingdom is not part of the name.

A. Incorrect! In the binomial naming convention the Kingdom is not part of the name. Microbiology Problem Drill 08: Classification of Microorganisms No. 1 of 10 1. In the binomial system of naming which term is always written in lowercase? (A) Kingdom (B) Domain (C) Genus (D) Specific

More information

Cell Division. OpenStax College. 1 Genomic DNA

Cell Division. OpenStax College. 1 Genomic DNA OpenStax-CNX module: m44459 1 Cell Division OpenStax College This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 By the end of this section, you will be

More information

The Gene The gene; Genes Genes Allele;

The Gene The gene; Genes Genes Allele; Gene, genetic code and regulation of the gene expression, Regulating the Metabolism, The Lac- Operon system,catabolic repression, The Trp Operon system: regulating the biosynthesis of the tryptophan. Mitesh

More information

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships

Chapter 26: Phylogeny and the Tree of Life Phylogenies Show Evolutionary Relationships Chapter 26: Phylogeny and the Tree of Life You Must Know The taxonomic categories and how they indicate relatedness. How systematics is used to develop phylogenetic trees. How to construct a phylogenetic

More information

UNIT 5. Protein Synthesis 11/22/16

UNIT 5. Protein Synthesis 11/22/16 UNIT 5 Protein Synthesis IV. Transcription (8.4) A. RNA carries DNA s instruction 1. Francis Crick defined the central dogma of molecular biology a. Replication copies DNA b. Transcription converts DNA

More information

Genetics 275 Notes Week 7

Genetics 275 Notes Week 7 Cytoplasmic Inheritance Genetics 275 Notes Week 7 Criteriafor recognition of cytoplasmic inheritance: 1. Reciprocal crosses give different results -mainly due to the fact that the female parent contributes

More information

A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS

A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS A PARSIMONY APPROACH TO ANALYSIS OF HUMAN SEGMENTAL DUPLICATIONS CRYSTAL L. KAHN and BENJAMIN J. RAPHAEL Box 1910, Brown University Department of Computer Science & Center for Computational Molecular Biology

More information

Understanding Science Through the Lens of Computation. Richard M. Karp Nov. 3, 2007

Understanding Science Through the Lens of Computation. Richard M. Karp Nov. 3, 2007 Understanding Science Through the Lens of Computation Richard M. Karp Nov. 3, 2007 The Computational Lens Exposes the computational nature of natural processes and provides a language for their description.

More information

GCD3033:Cell Biology. Transcription

GCD3033:Cell Biology. Transcription Transcription Transcription: DNA to RNA A) production of complementary strand of DNA B) RNA types C) transcription start/stop signals D) Initiation of eukaryotic gene expression E) transcription factors

More information

15.2 Prokaryotic Transcription *

15.2 Prokaryotic Transcription * OpenStax-CNX module: m52697 1 15.2 Prokaryotic Transcription * Shannon McDermott Based on Prokaryotic Transcription by OpenStax This work is produced by OpenStax-CNX and licensed under the Creative Commons

More information

CHAPTER : Prokaryotic Genetics

CHAPTER : Prokaryotic Genetics CHAPTER 13.3 13.5: Prokaryotic Genetics 1. Most bacteria are not pathogenic. Identify several important roles they play in the ecosystem and human culture. 2. How do variations arise in bacteria considering

More information

Genômica comparativa. João Carlos Setubal IQ-USP outubro /5/2012 J. C. Setubal

Genômica comparativa. João Carlos Setubal IQ-USP outubro /5/2012 J. C. Setubal Genômica comparativa João Carlos Setubal IQ-USP outubro 2012 11/5/2012 J. C. Setubal 1 Comparative genomics There are currently (out/2012) 2,230 completed sequenced microbial genomes publicly available

More information

Genome reduction in prokaryotic obligatory intracellular parasites of humans: a comparative analysis

Genome reduction in prokaryotic obligatory intracellular parasites of humans: a comparative analysis International Journal of Systematic and Evolutionary Microbiology (2004), 54, 1937 1941 DOI 10.1099/ijs.0.63090-0 Genome reduction in prokaryotic obligatory intracellular parasites of humans: a comparative

More information

Campbell Biology 10. A Global Approach. Chapter 20 The Evolution of Genomes

Campbell Biology 10. A Global Approach. Chapter 20 The Evolution of Genomes Lecture on General Biology 2 Campbell Biology 10 A Global Approach th edition Chapter 20 The Evolution of Genomes Chul-Su Yang, Ph.D., chulsuyang@hanyang.ac.kr Infection Biology Lab., Dept. of Molecular

More information

PROTEIN SYNTHESIS INTRO

PROTEIN SYNTHESIS INTRO MR. POMERANTZ Page 1 of 6 Protein synthesis Intro. Use the text book to help properly answer the following questions 1. RNA differs from DNA in that RNA a. is single-stranded. c. contains the nitrogen

More information

Bergey s Manual Classification Scheme. Vertical inheritance and evolutionary mechanisms

Bergey s Manual Classification Scheme. Vertical inheritance and evolutionary mechanisms Bergey s Manual Classification Scheme Gram + Gram - No wall Funny wall Vertical inheritance and evolutionary mechanisms a b c d e * * a b c d e * a b c d e a b c d e * a b c d e Accumulation of neutral

More information

chapter 5 the mammalian cell entry 1 (mce1) operon of Mycobacterium Ieprae and Mycobacterium tuberculosis

chapter 5 the mammalian cell entry 1 (mce1) operon of Mycobacterium Ieprae and Mycobacterium tuberculosis chapter 5 the mammalian cell entry 1 (mce1) operon of Mycobacterium Ieprae and Mycobacterium tuberculosis chapter 5 Harald G. Wiker, Eric Spierings, Marc A. B. Kolkman, Tom H. M. Ottenhoff, and Morten

More information

Bacterial Genetics & Operons

Bacterial Genetics & Operons Bacterial Genetics & Operons The Bacterial Genome Because bacteria have simple genomes, they are used most often in molecular genetics studies Most of what we know about bacterial genetics comes from the

More information

BME 5742 Biosystems Modeling and Control

BME 5742 Biosystems Modeling and Control BME 5742 Biosystems Modeling and Control Lecture 24 Unregulated Gene Expression Model Dr. Zvi Roth (FAU) 1 The genetic material inside a cell, encoded in its DNA, governs the response of a cell to various

More information

Chromosomal rearrangements in mammalian genomes : characterising the breakpoints. Claire Lemaitre

Chromosomal rearrangements in mammalian genomes : characterising the breakpoints. Claire Lemaitre PhD defense Chromosomal rearrangements in mammalian genomes : characterising the breakpoints Claire Lemaitre Laboratoire de Biométrie et Biologie Évolutive Université Claude Bernard Lyon 1 6 novembre 2008

More information

This is a repository copy of Microbiology: Mind the gaps in cellular evolution.

This is a repository copy of Microbiology: Mind the gaps in cellular evolution. This is a repository copy of Microbiology: Mind the gaps in cellular evolution. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/114978/ Version: Accepted Version Article:

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 2009 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

Molecular phylogeny How to infer phylogenetic trees using molecular sequences

Molecular phylogeny How to infer phylogenetic trees using molecular sequences Molecular phylogeny How to infer phylogenetic trees using molecular sequences ore Samuelsson Nov 200 Applications of phylogenetic methods Reconstruction of evolutionary history / Resolving taxonomy issues

More information

The use of gene clusters to infer functional coupling

The use of gene clusters to infer functional coupling Proc. Natl. Acad. Sci. USA Vol. 96, pp. 2896 2901, March 1999 Genetics The use of gene clusters to infer functional coupling ROSS OVERBEEK*, MICHAEL FONSTEIN, MARK D SOUZA*, GORDON D. PUSCH*, AND NATALIA

More information

Introduction to Molecular and Cell Biology

Introduction to Molecular and Cell Biology Introduction to Molecular and Cell Biology Molecular biology seeks to understand the physical and chemical basis of life. and helps us answer the following? What is the molecular basis of disease? What

More information

Comparing whole genomes

Comparing whole genomes BioNumerics Tutorial: Comparing whole genomes 1 Aim The Chromosome Comparison window in BioNumerics has been designed for large-scale comparison of sequences of unlimited length. In this tutorial you will

More information

Scientists have been measuring organisms metabolic rate per gram as a way of

Scientists have been measuring organisms metabolic rate per gram as a way of 1 Mechanism of Power Laws in Allometric Scaling in Biology Thursday 3/22/12: Scientists have been measuring organisms metabolic rate per gram as a way of comparing various species metabolic efficiency.

More information

8/23/2014. Phylogeny and the Tree of Life

8/23/2014. Phylogeny and the Tree of Life Phylogeny and the Tree of Life Chapter 26 Objectives Explain the following characteristics of the Linnaean system of classification: a. binomial nomenclature b. hierarchical classification List the major

More information

RNA Synthesis and Processing

RNA Synthesis and Processing RNA Synthesis and Processing Introduction Regulation of gene expression allows cells to adapt to environmental changes and is responsible for the distinct activities of the differentiated cell types that

More information

Bioinformatics Chapter 1. Introduction

Bioinformatics Chapter 1. Introduction Bioinformatics Chapter 1. Introduction Outline! Biological Data in Digital Symbol Sequences! Genomes Diversity, Size, and Structure! Proteins and Proteomes! On the Information Content of Biological Sequences!

More information

AS A SERVICE TO THE RESEARCH COMMUNITY, GENOME BIOLOGY PROVIDES A 'PREPRINT' DEPOSITORY

AS A SERVICE TO THE RESEARCH COMMUNITY, GENOME BIOLOGY PROVIDES A 'PREPRINT' DEPOSITORY http://genomebiology.com/2002/3/12/preprint/0011.1 This information has not been peer-reviewed. Responsibility for the findings rests solely with the author(s). Deposited research article MRD: a microsatellite

More information

2012 Univ Aguilera Lecture. Introduction to Molecular and Cell Biology

2012 Univ Aguilera Lecture. Introduction to Molecular and Cell Biology 2012 Univ. 1301 Aguilera Lecture Introduction to Molecular and Cell Biology Molecular biology seeks to understand the physical and chemical basis of life. and helps us answer the following? What is the

More information

Genomes and Their Evolution

Genomes and Their Evolution Chapter 21 Genomes and Their Evolution PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

This document describes the process by which operons are predicted for genes within the BioHealthBase database.

This document describes the process by which operons are predicted for genes within the BioHealthBase database. 1. Purpose This document describes the process by which operons are predicted for genes within the BioHealthBase database. 2. Methods Description An operon is a coexpressed set of genes, transcribed onto

More information

Microbial Genetics, Mutation and Repair. 2. State the function of Rec A proteins in homologous genetic recombination.

Microbial Genetics, Mutation and Repair. 2. State the function of Rec A proteins in homologous genetic recombination. Answer the following questions 1. Define genetic recombination. Microbial Genetics, Mutation and Repair 2. State the function of Rec A proteins in homologous genetic recombination. 3. List 3 types of bacterial

More information