Mole_Oce Lecture # 24: Introduction to genomics



2 DEFINITION: Genomics: the study of genomes, or the study of genes and their function. Genomics (1980s): the systematic generation of information about genes and genomes. Functional genomics: the systematic generation and analysis of information about what genes do. The "omics" are à la mode: transcriptomics, proteomics, peptidomics, metabolomics, ecogenomics, toxicogenomics, pharmacogenomics, etc.

3 Bio-informatics

Structural genomics
- DNA sequences (nucleotide order)
- gene structure (predicted or from genome-cDNA comparison) (gene discovery)
- gene organization (gene order, chromosomal localization; mapping)
- chromosomal organization
- genome structure
- macrovariation (gene complement, diversity; genome organization)
- microvariation, interspecific
- microvariation, intraspecific (polymorphisms)
- regulatory sequences
- inter-genic regions
- repetitive elements, mobile elements

Functional genomics
- gene regulation
- expression of RNA (measured via cDNA)
- known versus unknown ("gene discovery")
- microarray (cDNA vs. oligo) versus other methods
- proteomics
- protein function (high-throughput analysis of e.g. protein-protein interactions)?
- structural biology (structure-function relationships)
- gene pathways and networks and their regulation
- effects of gene deletion (knock-out, knockdown, saturation mutagenesis)

4 Organisms are networks of genes, which make networks of proteins, which regulate genes, and so on ad infinitum. In genomics, one tries to escape the reductionist approaches that focus on the component parts. The new challenge is to analyze all the components at once. Genomics also allows a constant cross-fertilization between genome-wide studies and more focused studies. Biologists are entering a new landscape of data. For the first time in biology, data acquisition predates the analysis of data. Data are a new motor of innovation.

5 [Figure: gene counts of three genomes: 6,000 genes; 19,000 genes; 13,600 genes]

6 The C-value paradox. C-values range over [...]-fold! Some amoebae have 200 times more DNA than we do. The PARADOX: why would many genomes have vast amounts of extra DNA considering the actual informational needs of the organism?

7 The adaptive versus junk DNA theories: The adaptive theories postulate an adaptive function for this extra DNA, given that DNA abundance, rather than its information content, can have a direct and significant effect on phenotype. For instance, a larger genome size could be adaptive because it directly or indirectly increases nuclear and cellular volumes, helps to buffer fluctuations in the concentration of regulatory proteins, or protects coding DNA from mutation. According to these hypotheses, the observed variation in genome size reflects different adaptive needs or the efficacy of natural selection in different organisms. According to the junk DNA (or selfish DNA) theories, purifying selection against the accumulation of useless DNA is often not strong enough to counteract completely the steady stream of DNA addition through transposition and pseudogene formation. The final genome size is then set at the highest tolerable maximum, which depends on the particular ecological and developmental needs of the organism.

8 The forces affecting genome-size evolution. DNA-length mutants are created through a variety of mechanisms shown at the top, producing mutational pressure either to expand or contract the genome size. Some of these mutations affect the phenotype and undergo natural selection. Some might have negligible selective effects and are governed primarily by genetic drift. The combined interplay of all these forces affects genome size.

9 Transposition rates are generally higher than excision rates, so TEs increase genome size (e.g. [...]% of the mammalian genome is made of TEs). In the 240 kb of contiguous sequence around the maize adh1 gene, SanMiguel et al. (1998) found 23 copies of TEs belonging to 11 families of retrotransposons. These 23 retrotransposon copies accounted for over 160 kb. Importantly, the LTR analysis demonstrated that all elements have transposed in the past 6 Myr, with most jumping in the past 3 Myr. Assuming that the adh1 region is representative of the maize genome in general (and there is no reason to believe otherwise), this result implies that the maize genome has doubled in size, from 1200 Mbp to 2400 Mbp, in the past 3 Myr, so that about 50% of the present genome is recently inserted DNA. How can we interpret these results? One straightforward explanation is that the transposition frequency in maize has increased substantially in the past 3 Myr. Alternatively, it is also possible that the fixation probability of retrotransposons has changed. For example, natural selection against genome-size growth might keep TEs from fixation in maize relatives, but not in maize itself. SanMiguel et al. 1998
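
The arithmetic behind the maize figures can be checked with a short sketch. It uses only the numbers quoted on the slide (240 kb region, at least 160 kb of retrotransposons, 1200 Mbp to 2400 Mbp); the function and variable names are illustrative and do not come from SanMiguel et al. Note that going from 1200 Mbp to 2400 Mbp is a doubling, with the added DNA making up 50% of the present genome.

```python
# Figures quoted in the text for the maize adh1 region (SanMiguel et al. 1998).
REGION_KB = 240         # contiguous sequence analysed around adh1
RETRO_KB = 160          # lower bound: the text says "over 160 kb"
OLD_GENOME_MBP = 1200   # genome size before the recent insertions
NEW_GENOME_MBP = 2400   # present genome size


def summarize_growth(region_kb, retro_kb, old_mbp, new_mbp):
    """Return the basic ratios implied by the quoted figures."""
    added_mbp = new_mbp - old_mbp
    return {
        # share of the sequenced region made of retrotransposons
        "retro_fraction_of_region": retro_kb / region_kb,
        # present genome size divided by the earlier size
        "fold_change": new_mbp / old_mbp,
        # growth expressed relative to the earlier genome
        "increase_relative_to_old": added_mbp / old_mbp,
        # growth expressed as a share of the present genome
        "added_share_of_present": added_mbp / new_mbp,
    }


summary = summarize_growth(REGION_KB, RETRO_KB, OLD_GENOME_MBP, NEW_GENOME_MBP)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The retrotransposon share of the adh1 region (about two thirds) is a lower bound, since the slide gives "over 160 kb". Extrapolating it to the whole genome rests on the stated assumption that the region is representative.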

10 Genome diversity in microbial eukaryotes: rDNA arrangements. Sub-telomeric rDNAs (Giardia): the G. lamblia genome contains ~60 copies of a 5.6-Kb rDNA unit organized in tandem arrays near the telomeres of at least six chromosomes. In the microsporidian E. cuniculi, each chromosome contains two subtelomeric rDNA units located ~15 Kb upstream of the chromosome ends. In another arrangement, the rDNA genes are located on as many as 200 copies of a circular plasmid-like molecule (episomal elements), in contrast to the tandem arrays found in many eukaryotic genomes.
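
As a rough illustration of what these copy numbers mean in sequence terms, the sketch below multiplies copy number by unit length for the G. lamblia rDNA. It assumes the copy number reads as roughly 60 and the unit length as 5.6 Kb; the helper name `tandem_array_kb` is hypothetical.

```python
def tandem_array_kb(copies, unit_kb):
    """Total length in Kb of `copies` repeats of a unit `unit_kb` long."""
    return copies * unit_kb


# Approximate figures from the slide: about 60 copies of a 5.6-Kb rDNA unit.
giardia_rdna_kb = tandem_array_kb(copies=60, unit_kb=5.6)
print(f"Approximate total rDNA in G. lamblia: {giardia_rdna_kb:.0f} Kb")
```

Because the copy number is approximate and the arrays are spread over at least six chromosomes, the total is only an order-of-magnitude figure.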

11 Genome diversity in microbial eukaryotes: Dual Genomes. The micronuclear germline genome is involved in CONJUGATION, whereas the somatic macronuclear genome is the site of the majority of transcription. In species of Tetrahymena, fragmentation of as few as five micronuclear chromosomes produces up to 200 unique molecules in the macronucleus, each of which is amplified 60 times. In some spirotrichs, 95% or more of the micronuclear sequence is eliminated during the development of the macronucleus, and the 120 micronuclear chromosomes fragment into as many as [...] different gene-sized chromosomes in the macronucleus. Furthermore, each of these highly processed macronuclear chromosomes is then amplified [...] times.
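
The Tetrahymena figures can be turned into molecule counts with a small sketch. It uses only the numbers on the slide (five micronuclear chromosomes, up to 200 unique macronuclear molecules, 60-fold amplification) and assumes, purely for illustration, that fragmentation is spread evenly across chromosomes; the function names are hypothetical. The spirotrich case is left out because its counts are missing from the slide.

```python
def macronuclear_molecules(unique_molecules, amplification):
    """Total macronuclear molecules after each unique one is amplified."""
    return unique_molecules * amplification


def fragments_per_chromosome(unique_molecules, micronuclear_chromosomes):
    """Average unique fragments per micronuclear chromosome (even split assumed)."""
    return unique_molecules / micronuclear_chromosomes


# Tetrahymena figures from the slide: 5 chromosomes, up to 200 molecules, 60 copies each.
total = macronuclear_molecules(unique_molecules=200, amplification=60)
average = fragments_per_chromosome(unique_molecules=200, micronuclear_chromosomes=5)
print(f"Macronuclear molecules (upper figure): {total}")
print(f"Average unique fragments per micronuclear chromosome: {average:.0f}")
```

Both results are upper figures, since the slide gives "up to 200" unique molecules and "as few as five" chromosomes.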

12 Genome diversity in microbial eukaryotes: Organellar Genomes. Kinetoplastid mitochondrial genomes exist as concatenated mini- and maxicircles. Some of the maxicircles contain incomplete genes that require RNA editing to produce open reading frames, and at least part of the RNA editing is templated by sequences on minicircles. The Amoebidium mitochondrial genome contains several hundred distinct [...]-Kb linear chromosomes. These chromosomes fall into three categories: (i) small molecules with no identified coding regions, (ii) medium-sized molecules that encode a single gene, and (iii) larger molecules containing multiple genes. UNIGENIC minicircles are a unique genome structure that has been reported in the chloroplasts of peridinean dinoflagellates. The chloroplast genes of these dinoflagellates occur on 2-3-Kb minicircles, which generally contain only one gene plus an origin of replication and a promoter.