Outline. Genome Evolution. Genome. Genome Architecture. Constraints on Genome Evolution. New Evolutionary Synthesis 11/1/18


Genome Evolution
Carol Eunmi Lee, Evolution 410, University of Wisconsin

Outline
1. What: Patterns of Genome Evolution
2. Why? Evolution of Genome Complexity and the interaction between Natural Selection and Genetic Drift

Genome
Definition: The entirety of an organism's hereditary information, encoded in DNA or, for many types of viruses, in RNA. The genome includes both the genes and the noncoding sequences of the DNA/RNA.

Genome Architecture
The totality of non-random arrangements of functional elements (genes, regulatory regions, etc.) in the genome.
Genome architecture is highly variable across taxa.

New Evolutionary Synthesis
Comparative genomics has the potential to measure the strength of constraints on different classes of sites in genomes and to elucidate the biological nature of these constraints.
Genome comparisons also help to address higher-level questions, including the degree to which constraints act on gene repertoires, genome architecture and the evolution rate itself.
The avalanche of systems biology data allows researchers to ask new, qualitative questions, such as how constraints affect metabolic fluxes and the molecular phenome.

Constraints on Genome Evolution
Most protein-coding genes evolve under purifying selection of widely varying strength, which preserves protein function; few evolve under positive selection.
The strength of constraint also differs between functional sequences (coding, plus noncoding but regulatory) and noncoding, non-regulatory sequences.
Note that noncoding sequence includes both nonfunctional and functional (e.g., regulatory) regions.

Purifying Selection = Negative Selection
Acts to select out deleterious mutations in coding or regulatory sequences.
Tends to maintain constancy of DNA sequences: they stay more constant than would be expected if new mutations and genetic drift were acting alone (Ka/Ks < 1).

[Figure: Evolutionary constraints (purifying selection to maintain function) acting on different regions of the genome. Region labels: noncoding, nonfunctional; regulatory sequence; transcribed sequence. Panel label: smaller population size.]

Evolutionary constraints acting on different regions of the genome
1. Least constrained: junk DNA (noncoding and nonfunctional sequence)
2. Less constrained: introns (noncoding, but they affect splice sites in genes and exon shuffling)
3. Weaker constraint than item 4: noncoding but regulatory sequence
4. Most constrained: sequences encoding structural RNAs, and nonsynonymous sites in protein-coding sequences

General Principles: Genome Architecture
The most conserved feature of prokaryotic genomes is the operon.
Gene order: Prokaryotic gene order is not conserved (aside from order within the operon), whereas in eukaryotes gene order tends to be conserved across taxa.
Intron-exon genomic organization: The distinctive feature of eukaryotic genomes that sharply separates them from prokaryotic genomes is the presence of spliceosomal introns that interrupt protein-coding genes.
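
The Ka/Ks < 1 signature of purifying selection mentioned above can be made concrete with a small sketch. The function name and the tolerance band around 1 are illustrative assumptions, not part of the lecture; real analyses would use a statistical test rather than a fixed cutoff.

```python
def classify_selection(ka: float, ks: float, tol: float = 0.05) -> str:
    """Classify the selection regime from a Ka/Ks ratio.

    ka:  nonsynonymous substitutions per nonsynonymous site
    ks:  synonymous substitutions per synonymous site
    tol: hypothetical tolerance band around 1 that is treated as "neutral"
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying"   # fewer amino-acid changes than expected under drift alone
    if ratio > 1 + tol:
        return "positive"    # more amino-acid changes than expected under drift alone
    return "neutral"


if __name__ == "__main__":
    # Made-up example values: Ka = 0.02, Ks = 0.40 gives a ratio of 0.05.
    print(classify_selection(0.02, 0.40))
```

A ratio well below 1 means that nonsynonymous changes are being removed faster than synonymous ones, which is the pattern the slide describes as sequence constancy beyond what drift alone would produce.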

Small vs. Large Genomes
1. Compact, relatively small genomes of viruses, archaea, bacteria (typically <10 Mb), and many unicellular eukaryotes (typically <20 Mb). In these genomes, protein-coding and RNA-coding sequences occupy most of the genomic sequence.
2. Expansive, large genomes of multicellular and some unicellular eukaryotes (typically >100 Mb). In these genomes, the majority of the nucleotide sequence is noncoding and contains introns, transposons, etc.

Genome Compactness and Constraint
Evolutionary constraints on compact genomes, particularly those of prokaryotes, are much stronger than the constraints on the genomes of multicellular eukaryotes (median Ka/Ks values for prokaryotes and multicellular organisms are typically 0.01-0.1 and 0.1-0.5, respectively).
In viruses and prokaryotes, nearly all genomic sites are evolutionarily constrained, as most of the genome is functional.
Noncoding regions constitute only 10-15% of the genomes of most free-living prokaryotes, and a considerable fraction of these sequences encompasses regulatory elements that are substantially constrained in their evolution.
The genomes of most viruses are even more compact, with almost all of the genome sequence taken up by protein-coding genes.

Genome Compactness and Constraint: Multicellular Eukaryotes
The estimated fractions (%) of constrained nucleotides in a genome differ substantially even between animals.
In Drosophila melanogaster, ~70% of sites in the genome, including ~65% of the noncoding sites, seem to be subject to selection, whereas in mammals this fraction is estimated at only 5-6% or even ~3%.
The absolute numbers of sites subject to selection in these animal genomes of widely different size are quite close.

Gene Order Conservation
More synteny in eukaryotes than in prokaryotes.
Comparison of gene orders between eukaryotic genomes reveals considerable conservation of synteny over long evolutionary spans (hundreds of millions of years), e.g., among vertebrates or insects.

Genome Size vs. Number of Protein-Coding Genes
Generally, gene number increases slowly with genome size (most of the increase is noncoding and nonfunctional DNA).
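
The statement that the absolute numbers of constrained sites are close, despite very different constrained fractions, can be checked with rough arithmetic. The fractions below come from the slide; the genome sizes (about 140 Mb for Drosophila melanogaster and about 3,100 Mb for a human-sized mammalian genome) are approximate values assumed for illustration and are not given in the lecture.

```python
# Approximate genome sizes in megabases (assumed for illustration, not from the slides).
FLY_GENOME_MB = 140.0
MAMMAL_GENOME_MB = 3100.0


def constrained_mb(genome_mb: float, fraction: float) -> float:
    """Megabases under selection = genome size x constrained fraction."""
    return genome_mb * fraction


fly = constrained_mb(FLY_GENOME_MB, 0.70)             # ~70% of sites (slide value)
mammal_high = constrained_mb(MAMMAL_GENOME_MB, 0.05)  # 5% estimate (slide value)
mammal_low = constrained_mb(MAMMAL_GENOME_MB, 0.03)   # ~3% estimate (slide value)

if __name__ == "__main__":
    print(f"fly: {fly:.0f} Mb constrained")
    print(f"mammal: {mammal_low:.0f} to {mammal_high:.0f} Mb constrained")
    print(f"genome size ratio: {MAMMAL_GENOME_MB / FLY_GENOME_MB:.0f}x")
```

Under these assumed sizes the genomes differ by more than 20-fold, while the amounts of constrained sequence differ by less than 2-fold, which is the sense in which the absolute numbers are "quite close".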

Introns
The average number of introns per gene in most multicellular species is 4-7, whereas the average number for most unicellular eukaryotes is less than two.
Below a threshold genome size of 10 Mb, introns are very rare; above 10 Mb, they approach an asymptote of about seven per gene.
Fig. 3. The relationship between genome size (in base pairs, bp) and mean number of introns, and mean intron size.

Transposon abundance increases with genome size

Genome Size Variation
Vertebrate genomes are veritable junkyards of selfish genetic elements where only a small fraction of the genetic material is dedicated to encoding biologically relevant information. (Koonin 2009, your reading)
Microbial genomes are more compact, with most of the genetic material assigned to distinct biological functions.

There is a positive correlation between body size and genome size and complexity. WHY?
Genome size scales roughly with body size.
So what might be the cause of genome size variation?
Why do larger organisms, on average, tend to have larger and more complex genomes?

Genome Size Variation
Genome size generally increases as body size increases.
In general, larger organisms tend to have larger genomes and greater genome complexity (though there are exceptions).
As we go to higher trophic levels (up the food web), organisms become larger and fewer (smaller effective population size).
[Figure axis: genome size (base pairs)]

Population size declines at higher trophic levels
Body size scales inversely with population size: large animals are less abundant (Damuth 1981).
In general, total biomass declines ~10-fold with each step up in trophic level, and because average body size increases at higher levels in the food chain, total population size must decline even more sharply.
So what are the evolutionary consequences of this inverse relationship between body size and population size?
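
The reasoning above (biomass falls about 10-fold per trophic level while body size rises, so the number of individuals falls faster than biomass) can be sketched numerically. The 10-fold biomass drop is from the slide; the 100-fold gain in body mass per level is a purely hypothetical number chosen to illustrate the argument.

```python
def relative_abundance(levels: int,
                       biomass_drop: float = 10.0,
                       body_mass_gain: float = 100.0) -> list:
    """Number of individuals at each trophic level, relative to level 0.

    individuals = total biomass / body mass per individual
    biomass_drop:   fold decline in total biomass per level (slide value, ~10)
    body_mass_gain: fold increase in body mass per level (hypothetical)
    """
    result = []
    for k in range(levels):
        biomass = biomass_drop ** -k      # relative total biomass at level k
        body_mass = body_mass_gain ** k   # relative body mass at level k
        result.append(biomass / body_mass)
    return result


if __name__ == "__main__":
    for level, n in enumerate(relative_abundance(4)):
        print(f"trophic level {level}: relative number of individuals = {n:.0e}")
```

With any body mass gain greater than 1, the per-level decline in individuals exceeds the 10-fold decline in biomass, which is the point the slide makes before turning to the evolutionary consequences.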

Genetic Drift and Natural Selection
Recall our discussion of natural selection in the presence of genetic drift.
Because of the randomness introduced by genetic drift, natural selection is less efficient when drift is strong.
Thus, natural selection is more efficient in larger populations and less effective in smaller populations.

Lynch & Conery's argument on causes of the evolution of genome size and complexity
Lynch & Conery. 2003. Science 302: 1401-1404
Transitions from prokaryotes to unicellular eukaryotes to multicellular eukaryotes are associated with orders-of-magnitude reductions in population size.
Reduced population size increases the power of genetic drift, weakening the ability of natural selection (purifying selection) to remove genomic features that would otherwise tend to proliferate, such as transposons, introns, and gene families.
Thus, purifying selection would tend to be more effective in organisms with large population size, which are also organisms that tend to be small in body size.
This action of purifying selection would result in smaller and more efficient genomes in organisms that have large population size.

Negative selection, or purifying selection, is the selective removal of alleles that are deleterious. New alleles that arise and alter the phenotype in a deleterious way would be purged.

Effective Population Size (very rough estimates)
Prokaryotes: Ne is generally >10^8
Unicellular eukaryotes: ~10^7-10^8
Invertebrates: 10^5-10^6
Vertebrates: 10^4-10^5

Larger organisms have greater genome complexity
Larger organisms with small population sizes have much more complex genomes, with introns, vast amounts of noncoding DNA, transposons, etc., because in smaller populations natural selection would be less efficient and less likely to take out new mutations that arise, even if they might be mildly deleterious.
The implication is that organisms that rise in trophic level in a food web would tend to acquire a more complex genome (due to lower efficiency of selection).

Duplicated genes last longer in smaller populations
There is a clear tendency for the half-life of duplicate genes to increase with genome size, again with a continuous transition between prokaryotes and eukaryotes (Fig. 2).
Thus, by correlation, the ability of a newly arisen gene to survive the accumulation of mutations increases with decreasing effective population size.
Much of the increase in gene number in multicellular species may not have been driven by adaptive processes, but may rather be a passive response to reduced population size (and reduced purifying selection), which is more conducive to duplicate-gene preservation by subfunctionalization (the subfunctionalized copies would all have to be retained).
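
To illustrate why mildly deleterious mutations persist in small populations but are purged in large ones, the sketch below uses Kimura's diffusion approximation for the fixation probability of a new mutation, P = (1 - e^(-2s)) / (1 - e^(-4Ns)). This formula is not stated in the lecture; it is a standard population-genetics result applied here under the assumptions of a diploid population with Ne = N and additive fitness effects. The selection coefficient s = -1e-6 is a hypothetical value, and the population sizes are taken from the rough estimates listed above.

```python
import math


def fixation_probability(s: float, n: float) -> float:
    """Kimura's diffusion approximation for a new mutation (initial frequency 1/(2N)).

    s: selection coefficient (negative = deleterious)
    n: population size, assumed equal to the effective size Ne
    """
    if s == 0:
        return 1.0 / (2.0 * n)            # neutral expectation
    exponent = -4.0 * n * s
    if exponent > 700.0:                  # exp() would overflow; P is effectively zero
        return 0.0
    # (1 - e^(-2s)) / (1 - e^(-4Ns)), written with expm1 for numerical accuracy
    return math.expm1(-2.0 * s) / math.expm1(exponent)


def relative_to_neutral(s: float, n: float) -> float:
    """Fixation probability divided by the neutral value 1/(2N)."""
    return fixation_probability(s, n) * 2.0 * n


if __name__ == "__main__":
    s = -1e-6  # hypothetical mildly deleterious mutation
    for label, n in [("vertebrate-like, Ne = 1e4", 1e4),
                     ("prokaryote-like, Ne = 1e8", 1e8)]:
        print(label, relative_to_neutral(s, n))
```

In the small population the mutation fixes almost as often as a neutral one (drift dominates), whereas in the large population its fixation probability is vanishingly small (purifying selection dominates), which matches the qualitative argument in the slides.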

Example: Genome Composition in Humans
An example of the genome architecture of a larger organism that had a small population size for most of its evolutionary history (~100,000 individuals). Our current huge population size is very recent, starting ~7,000 years ago, and is an anomaly, atypical of organisms our size.

Components of the Human Genome
Less than 1.5% of the human genome consists of the suspected ~30,000 protein-coding sequences.
By contrast, a large majority is made up of noncoding sequences such as introns (almost 26%) and (mostly defunct) transposable elements (nearly 45%).
Source: T. Ryan Gregory. 2005. Synergy between sequence and size in large-scale genomics. Nature Reviews Genetics 6, 699-708.
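
A quick tally of the fractions quoted above shows how much of the human genome falls outside the three named categories. The percentages are the approximate values from the slide ("less than", "almost", "nearly"), treated here as round numbers for illustration; the variable names are arbitrary.

```python
# Approximate percentages of the human genome, as quoted on the slide.
components = {
    "protein-coding sequence": 1.5,   # "less than 1.5%"
    "introns": 26.0,                  # "almost 26%"
    "transposable elements": 45.0,    # "nearly 45%"
}

accounted = sum(components.values())
remainder = 100.0 - accounted        # other sequence not covered by these categories

if __name__ == "__main__":
    for name, pct in components.items():
        print(f"{name}: {pct}%")
    print(f"accounted for: {accounted}%  remainder: {remainder}%")
```

Because the quoted figures are approximate upper values, the remainder is best read as a rough lower estimate of the sequence that is neither coding, intronic, nor transposon-derived.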
Features that arose by accident could then be subjected to selection → opportunities for evolutionary novelty
Although the mechanisms responsible for the initial restructuring of eukaryotic genomes may have been nonadaptive in nature, this would not preclude the new features from undergoing selection and then contributing to phenotypic evolution.
Introns sustained a reliable mechanism for alternative splicing, and in at least some lineages, they provide an orientation mechanism for the surveillance of defective mRNAs.
Subfunctionalization of duplicated genes provides a mechanism for eliminating pleiotropic constraints on ancestral genes, thereby opening up previously inaccessible evolutionary pathways.

Some Critiques of Lynch's Argument
Did not take phylogenetic history into account; the negative correlation between effective population size and genome size might simply be a result of closely related taxa having more similar genome sizes (although archaea and bacteria having similar architecture supports his argument).
There are probably other factors also operating. For example, parasitic species often have small effective population sizes but also have small genomes.

Despite other potential factors that could contribute to the evolution of genome size, Lynch's argument remains a very useful null model against which to test for evidence of other factors affecting genome size evolution.

Questions:
What are the relationships between body size, population size and genome size and architecture?
What are the potential causes of the evolution of genome size and architecture?
What are some key differences in architecture between viral, prokaryotic, and eukaryotic genomes?
Why should bacterial and archaeal genomes share genomic features when they are not evolutionarily more closely related to one another than to eukaryotes?
What are some distinctive features of prokaryotic genomes?
Why is gene order not conserved in prokaryotes?
Why do eukaryotes have a lot of introns, transposons, etc.?
Why do organisms higher in trophic food webs tend to have larger and more complex genomes?
What is purifying selection, and what does it have to do with genome size evolution?

2. Genome size differences from prokaryotes to multicellular eukaryotes are mostly attributable to:
(a) The amount of "junk" (non-coding and nonfunctional) DNA in the genome
(b) The amount of coding sequences
(c) The number of tRNA coding genes
(d) "Junk" DNA being selectively advantageous in prokaryotes
(e) "Junk" DNA being selectively advantageous in eukaryotes

3. In very large populations, novel genomic elements that are introduced randomly (such as introns, transposons, gene duplicates) tend to be:
(a) Selectively advantageous
(b) Selectively neutral
(c) Selectively removed (via purifying selection)
(d) Pleiotropic
(e) Under epigenetic control

5. Which of the following is most TRUE regarding the evolution of genome architecture?
(a) Over evolutionary time, natural selection would tend to favor the evolution of larger and more complex genomes.
(b) Bacterial operons are analogous to the intron-exon organization in eukaryotes
(c) Eukaryotic genomes tend to have introns, transposons, and other non-coding genomic elements due to the larger body sizes of eukaryotes
(d) Much of viral and bacterial genomes are under greater evolutionary constraint (than eukaryotes), consisting mostly of coding sequences experiencing purifying selection
(e) Gene order tends to be highly conserved in bacteria (compared to eukaryotes) due to operons

6. Which of the following is LEAST likely to lead to the evolution of increased genome size?
(a) Gene duplications
(b) Small population size
(c) Transposons
(d) Genetic Drift
(e) Purifying selection

Answers: 2A 3C 5D 6E

1. Genome size differences from prokaryotes to eukaryotes are mostly attributable to:
(a) The amount of "junk" (non-coding and nonfunctional) DNA in the genome
(b) The amount of coding sequences
(c) The number of tRNA coding genes
(d) "Junk" DNA being selectively advantageous in prokaryotes
(e) "Junk" DNA being selectively advantageous in eukaryotes

2. Which of the following is most TRUE regarding the evolution of genome architecture?
(a) Over evolutionary time, natural selection would tend to favor the evolution of larger and more complex genomes
(b) Bacterial operons are analogous to the intron-exon organization in eukaryotes
(c) Eukaryotic genomes tend to have introns, transposons, and other non-coding genomic elements due to the larger body sizes of eukaryotes
(d) Much of viral and bacterial genomes are under greater evolutionary constraint (than eukaryotes) because a greater proportion of their genomes consist of coding sequences, which experience purifying selection
(e) Gene order tends to be highly conserved in bacteria (compared to eukaryotes) due to operons and horizontal gene transfer

Answers: 1A 2D

Components of the Human Genome
Protein-coding genes: Although most prokaryotic chromosomes consist almost entirely of protein-coding genes [86], such elements make up a small fraction of most eukaryotic genomes (see figure). As a prime example, the human genome might contain as few as 20,000 genes, comprising less than 1.5% of the total genome sequence [16, 82].
Introns: Shortly after their discovery, the non-coding intervening sequences within coding genes (introns) were suggested to account for the pronounced discrepancy between gene number and genome size [7]. It has also recently been suggested that most non-coding DNA in animals (but not plants) is intronic, which would imply that most of the genome is transcribed even though protein-coding regions represent a tiny minority [107, 108]. At the very least, introns were found to account for more than a quarter of the draft human sequence [16]. Over a broad taxonomic scale, intron size and genome size are positively correlated [109], although within genera a correlation might (for example, Drosophila [110]) or might not (for example, Gossypium [111]) be observed.

Components of the Human Genome (continued)
Pseudogenes: Non-functional copies of coding genes, the original meaning of the term 'junk DNA', were once thought to explain variation in genome size [4]. However, it is now apparent that even in combination, 'classical pseudogenes' (direct DNA to DNA duplicates), 'processed pseudogenes' (copies that are reverse transcribed back into the genome from RNA and therefore lack introns) and 'Numts' (nuclear pseudogenes of mitochondrial origin) comprise a relatively small portion of mammalian genomes. The human genome is estimated to contain about 19,000 pseudogenes [46].
Transposable elements: In eukaryotes, transposable elements are divided into two general classes according to their mode of transposition. Class I elements transpose through an RNA intermediate. This class comprises long interspersed nuclear elements (LINEs), endogenous retroviruses, short interspersed nuclear elements (SINEs) and long terminal repeat (LTR) retrotransposons. Class II elements transpose directly from DNA to DNA, and include DNA transposons and miniature inverted repeat transposable elements (MITEs).