Bioinformatics tools to analyze complex genomes
Yves Van de Peer, Ghent University/VIB


Detecting colinearity and large-scale gene duplications
[Figure: an ancestral segment A with genes 1 to 11 undergoes speciation or duplication, giving segments S1 and S2, each with genes 1 to 11. Over time, gene loss, insertions, rearrangements, translocation, etc. leave S1 with genes 1, 3, 4, 6, 7, 10, 11 and S2 with genes 1, 2, 4, 6, 7, 8, 9, 11. The retained orthologs serve as anchor points.]

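Detecting colinearity and large-scale gene duplications
Gene Homology Matrix: conserved gene content and order. Collinear regions appear as diagonals.
[Figure: gene homology matrix of segment S1 (genes 1, 3, 4, 6, 7, 10, 11) against segment S2 (genes 1, 2, 4, 6, 7, 8, 9, 11). Aligned rows as shown on the slide, with X marking a gap: segment S1: 1 3 4 6 7 X X 10 11; segment S2: 4 X 6 7 8 9 X 11.]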
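The diagonal idea can be made concrete with a small sketch. It is not the algorithm of any published tool: it assumes that two genes are homologous when they carry the same number (as on the slide), and it uses a hypothetical max_gap parameter to tolerate lost or inserted genes. It builds the filled cells of the gene homology matrix for S1 and S2 and then chains them into the longest diagonal.

```python
def homology_cells(seg1, seg2):
    """Return matrix cells (i, j) where seg1[i] and seg2[j] are homologous.

    Simplifying assumption: two genes are homologous when they carry the
    same label, as with the numbered genes on the slide.
    """
    return sorted((i, j) for i, g1 in enumerate(seg1)
                  for j, g2 in enumerate(seg2) if g1 == g2)


def longest_diagonal(cells, max_gap=2):
    """Longest chain of cells that moves down and to the right.

    Consecutive cells may skip at most max_gap rows and max_gap columns
    (hypothetical parameter), which tolerates gene loss and insertions.
    Expects cells in sorted order, as returned by homology_cells.
    """
    best = {}  # cell -> longest chain ending at that cell
    for i, j in cells:  # sorted input, so possible predecessors come first
        chain = [(i, j)]
        for (pi, pj), prev in best.items():
            close = 0 < i - pi <= max_gap + 1 and 0 < j - pj <= max_gap + 1
            if close and len(prev) + 1 > len(chain):
                chain = prev + [(i, j)]
        best[(i, j)] = chain
    return max(best.values(), key=len, default=[])


# Gene lists of the two segments from the slide.
s1 = [1, 3, 4, 6, 7, 10, 11]
s2 = [1, 2, 4, 6, 7, 8, 9, 11]

diagonal = longest_diagonal(homology_cells(s1, s2))
anchors = [s1[i] for i, _ in diagonal]
print(anchors)  # [1, 4, 6, 7, 11]
```

The five retained genes form one diagonal even though neither segment kept the full ancestral gene set. Deciding whether such a diagonal is more than noise needs a statistical model, which this sketch leaves out.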

Detecting colinearity and large-scale gene duplications
In an actual genome this becomes complex: a good statistical model is needed to find biologically relevant regions.
[Figure: gene homology matrix of a real genome, with real clusters and a noise cluster marked.]

The Arabidopsis genome duplication
The Arabidopsis Genome Initiative (2000)

Transitive homology
[Figure: segments B, A and C.]
Segments B and C are homologous because both are homologous to segment A. However, segments B and C share only 1 gene because of differential gene loss.
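A toy example makes the transitive argument easier to follow. The gene lists below are hypothetical (the slide does not give them), and the min_shared threshold stands in for a real significance test.

```python
def shared_genes(seg_x, seg_y):
    """Genes present in both segments (same label means homologous here)."""
    return sorted(set(seg_x) & set(seg_y))


def transitively_homologous(seg_x, seg_y, via, min_shared=3):
    """True when seg_x and seg_y each share enough genes with segment `via`.

    min_shared is a hypothetical cut-off, not a statistical test.
    """
    return (len(shared_genes(seg_x, via)) >= min_shared
            and len(shared_genes(seg_y, via)) >= min_shared)


# Hypothetical segments: A kept every gene, B and C lost different ones.
segment_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
segment_b = [1, 2, 4, 6, 8, 10]
segment_c = [3, 5, 6, 7, 9]

print(shared_genes(segment_b, segment_c))  # [6]
print(transitively_homologous(segment_b, segment_c, via=segment_a))  # True
```

B and C share a single gene, so a direct comparison would miss their relationship, while each shares five or six genes with A.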

Detecting homology in the twilight zone: genomic profiles
[Figure: gene homology matrices comparing segment C with segment A and with segment B separately. Neither comparison gives a significant diagonal ("Not significant").]
Building genomic profiles
[Figure: segments A and B are combined into a genomic profile.]

Searching for diagonals revisited
[Figure: segment C compared against the combined profile of segments A and B. The anchors now form a diagonal: significant homology!]
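The gain from a profile can be sketched in a few lines. This is a simplified illustration, not the published method: the gene lists are hypothetical, gene labels are assumed to reflect ancestral order, a profile is reduced to the union of the genes in its segments, and min_anchors is an arbitrary cut-off standing in for a significance test.

```python
def build_profile(*segments):
    """Combine homologous segments into one profile.

    Simplification: the profile is the union of the genes of all segments,
    sorted by label (labels are assumed to reflect ancestral order).
    """
    return sorted(set().union(*segments))


def count_anchors(query, target):
    """Number of genes the query shares with a segment or profile."""
    return len(set(query) & set(target))


# Hypothetical segments after differential gene loss.
segment_a = [1, 2, 4, 5, 7, 9, 11]
segment_b = [1, 3, 4, 6, 7, 8, 10, 11]
segment_c = [2, 3, 5, 6, 9, 10, 12]

min_anchors = 4  # hypothetical cut-off for calling homology

profile_ab = build_profile(segment_a, segment_b)
for name, target in [("A", segment_a), ("B", segment_b), ("A+B", profile_ab)]:
    n = count_anchors(segment_c, target)
    print(name, n, "significant" if n >= min_anchors else "not significant")
```

With these lists, C shares three genes with A and three with B, below the cut-off in both cases, but six with the combined profile.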

Genomic profiles: increasing the multiplication level
Simillion et al. (2004) Genome Res. 14, 1095-1106

i-ADHoRe 3.0
MCScan: Tang et al. (2008); Cyntenator: Rödelsperger et al. (2010); Fostier et al. (2011)

i-ADHoRe 3.0 is extremely efficient in analyzing large datasets

Gene loss and colinearity/synteny
WGDs add to the complexity of plant genomes (both in numbers of genes and genome structure).
Van de Peer et al. (2009) Trends Plant Science

Linking WGDs with great or decisive moments in evolution

Darwin's abominable mystery: WGD?

Another way of unveiling WGDs: Ks age distributions
[Figure: number of duplicates (y-axis) against Ks or time (x-axis; a higher Ks corresponds with an older age). The continuous mode of gene duplication forms the background, and genome duplication events stand out from it.]
Maere et al. (2005) PNAS; Vanneste et al. (2013) Mol. Biol. Evol.
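A minimal sketch of how such a distribution can be read, under stated assumptions: the Ks values are synthetic (a decaying background plus one bump centred at Ks = 0.75, both invented for illustration), and a duplication signal is taken to be any interior bin that rises above its left neighbour. Published analyses use more careful statistical modelling than this.

```python
import math


def ks_histogram(ks_values, bin_width=0.1, max_ks=2.0):
    """Count duplicate pairs per Ks bin; values outside [0, max_ks) are dropped."""
    n_bins = round(max_ks / bin_width)
    counts = [0] * n_bins
    for ks in ks_values:
        if 0 <= ks < max_ks:
            counts[min(int(ks / bin_width), n_bins - 1)] += 1
    return counts


def interior_peaks(counts):
    """Indices of bins higher than the bin before and not lower than the bin after.

    The first bin is skipped: the continuous mode of duplication always
    makes the youngest bin the tallest part of the background.
    """
    return [i for i in range(1, len(counts) - 1)
            if counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]]


def toy_ks_values(bin_width=0.1, max_ks=2.0):
    """Synthetic Ks values: exponential background plus one bump at Ks = 0.75.

    All numbers here are invented for illustration only.
    """
    values = []
    for i in range(round(max_ks / bin_width)):
        center = (i + 0.5) * bin_width
        background = 400 * math.exp(-2 * center)
        bump = 500 * math.exp(-((center - 0.75) ** 2) / (2 * 0.1 ** 2))
        values.extend([center] * round(background + bump))
    return values


counts = ks_histogram(toy_ks_values())
peaks = interior_peaks(counts)
peak_ks = [(i + 0.5) * 0.1 for i in peaks]
print(peaks, [round(k, 2) for k in peak_ks])  # [7] [0.75]
```

On real data the tail of the histogram is noisy, so a plain local-maximum test like this one would report spurious peaks; smoothing or model fitting would be needed.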

Many plants have undergone a recent genome duplication
Shi et al. (2010) Annals Bot.; Cui et al. (2006) Genome Research; Blanc and Wolfe (2004) Plant Cell
[Figure: Ks frequency distribution (%), y-axis from 0% to 16%, x-axis Ks, for the comparisons Ginger-Ginger, Musa-Ginger, Musa-Musa and Musa-Rice.]
Lescot et al. (2008) BMC Genomics

The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

A whole-genome duplication (WGD) approximately 58 Myr ago had a major role in shaping the M. truncatula genome and thereby contributed to the evolution of endosymbiotic nitrogen fixation.

An intriguing aspect of the apple's biology concerns its characteristic fruit, the pome, which is found only in the Pyreae tribe. This indicates that the pome probably evolved after a relatively recent Pyreae-specific GWD, a polyploidization step that we hypothesize has contributed to the apple's developmental and metabolic specificity.

Calibrating the molecular clock.

Many plants have undergone a recent genome duplication
Cui et al. (2006) Genome Research

Organism                    Molecular clock rates (Ks)
Arabidopsis thaliana        25-30 mya (8); 65 mya (11)
Populus trichocarpa         13 mya (3)
Vitis vinifera              No recent duplication
Medicago truncatula         > 50 mya (29); 58 mya (10)
Lotus japonicus b           > 50 mya (29)
Glycine max                 44 mya (10)
Gossypium hirsutum          13-15 mya (8)
Carica papaya               No recent duplication
Solanum tuberosum b         50-52 mya (10)
Solanum lycopersicum        50-52 mya (10)
Eschscholzia californica    Unknown
Musa spp.                   61 mya (13)
Oryza sativa                50-60 mya (10); 70 mya (6)
Sorghum bicolor b           50-60 mya (10); 70 mya (6)
Acorus americanus           Unknown
Physcomitrella patens       Unknown
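The usual way to turn a Ks value into an age is T = Ks / (2r), where r is the synonymous substitution rate per site per year and the factor 2 accounts for both duplicate copies accumulating substitutions. The sketch below applies that formula. The Ks value and both rates are illustrative assumptions, not values taken from the table.

```python
def duplication_age_mya(ks, rate_per_site_per_year):
    """Age of a duplication in million years, from T = Ks / (2 * r).

    ks: synonymous substitutions per synonymous site between the two copies.
    rate_per_site_per_year: assumed synonymous substitution rate r.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return ks / (2.0 * rate_per_site_per_year) / 1e6


# Illustrative inputs only: one Ks peak dated with two assumed rates.
ks_peak = 0.8
for rate in (1.5e-8, 7.0e-9):
    print(rate, round(duplication_age_mya(ks_peak, rate), 1))
# 1.5e-08 26.7
# 7e-09 57.1
```

The same Ks peak gives ages that differ by a factor of two depending on the assumed rate, which is why calibrating the clock matters.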

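Absolute dating through the construction of phylogenetic trees
[Figure: phylogenetic trees containing the Arabidopsis paralogs Ath 1 and Ath 2 together with poplar, rice and moss sequences, drawn on a time axis (mya), with duplication and speciation nodes labelled.]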


Duplications are clustered in time
The Cretaceous-Tertiary (KT) extinction event is known as the most recent large-scale mass extinction of animal and plant species in a geologically short period of time, approximately 65 mya.
Fawcett et al. (2009) PNAS; Vanneste et al. (2014) Genome Research (accepted, pending revisions)

[Figure: dated phylogenetic tree of sequenced plant genomes on a time axis from 175 to 25 mya, spanning the Jurassic, Cretaceous and Tertiary, with clades labelled Eurosids I, Eurosids II, Asterids, Eudicots and Monocots. Species shown, in order: Cucumis melo, Cucumis sativus, Citrullus lanatus, Malus domestica, Pyrus bretschneideri, Prunus persica, Prunus mume, Fragaria vesca, Glycine max, Cajanus cajan, Medicago truncatula, Cicer arietinum, Lotus japonicus, Ricinus communis, Manihot esculenta, Jatropha curcas, Linum usitatissimum, Populus trichocarpa, Brassica rapa, Thellungiella parvula, Arabidopsis thaliana, Arabidopsis lyrata, Carica papaya, Theobroma cacao, Gossypium raimondii, Eucalyptus grandis, Vitis vinifera, Solanum lycopersicum, Solanum tuberosum, Lactuca sativa, Nelumbo nucifera, Aquilegia formosa x pubescens, Brachypodium distachyon, Hordeum vulgare, Oryza sativa, Zea mays, Sorghum bicolor, Setaria italica, Musa acuminata, Phoenix dactylifera, Phalaenopsis equestris, Nuphar advena, Physcomitrella patens.]

65 mya: the Cretaceous-Tertiary (KT) extinction event
Hiroshima: 15 kilotons
Tsar Bomba: 57 megatons
K-T impactor: 2 million times the Tsar Bomba

65 mya: the Cretaceous-Tertiary (KT) extinction event
These catastrophic events caused huge wildfires, reduced sunlight and prolonged darkness, a drop in temperatures, and acid rain.
As a result, photosynthesis was hindered, seed germination was reduced, and terrestrial vegetation went abruptly extinct or disappeared locally.
Paleobotanical studies of fossil pollen, spores and leaves from North American localities have shown that 18-30% of plant genera and families and up to 60% of plant species disappeared at the KT boundary.

Polyploids: evolutionary dead ends or hopeful monsters?
Polyploids occur frequently in the wild under natural conditions. Under normal circumstances, however, not many survive, not many get established, and not many proliferate.
Many polyploids have reduced fertility, reduced fitness, ...

Polyploids: evolutionary dead ends or hopeful monsters?
[Diagram labels, in order: Stress; Environmental fluctuations; Extinction; Hybridization; Unreduced gametes; NEUTRAL; Polyploid formation; ADAPTIVE; Transgressive segregation; Phenotypic variability/plasticity; Environment; Rapid adaptive changes facilitated by the plastic genomic background; Polyploid establishment.]
Fawcett et al. (2009) PNAS; Van de Peer et al. (2009) Nat Rev Genet; Vanneste et al. (2014) Genome Research (accepted, pending revisions)

Acknowledgements