Genetic diversity and population structure in rice. S. Kresovich 1,2 and T. Tai 3,5. Plant Breeding Dept, Cornell University, Ithaca, NY


Genetic diversity and population structure in rice

S. McCouch 1, A. Garris 1,2, J. Edwards 1, H. Lu 1,3, M. Redus 4, J. Coburn 1, N. Rutger 4, S. Kresovich 1,2 and T. Tai 3,5

1 Plant Breeding Dept, Cornell University, Ithaca, NY
2 Institute for Genomic Diversity, Cornell University, Ithaca, NY
3 Pioneer Hybrid, Ames, Iowa
4 Dale Bumpers National Rice Research Center, Stuttgart, AR
5 USDA-ARS, University of California, Davis, CA

Objectives
- Analyze population structure and the extent of linkage disequilibrium (LD) in domesticated Asian rice, O. sativa
- Interpret the demographic history of rice
- Use information on diversity and LD to predict the resolution of association mapping using available rice germplasm collections

What is population structure and how is it detected?
- Population structure refers to the presence of genetically identifiable sub-groups within a population
- It is evaluated by testing for non-random associations among alleles (linkage disequilibrium, or LD) at random loci across the genome (Pritchard et al., 2000, Genetics 155:945-959)
- A modest number (7-30) of unlinked DNA markers (SSR, RFLP, SNP, etc.) is generally sufficient to provide evidence of population structure

Why is population structure important?
- Knowledge of population structure is essential for designing efficient association or LD mapping strategies
- Spurious associations between a marker and a phenotype (Type I error) may occur if population structure is not controlled
- Understanding population structure helps interpret the evolutionary history of a species
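As a rough illustration of "non-random association among alleles", the sketch below computes the disequilibrium coefficient D and the r2 statistic for two biallelic loci. It is not the model-based STRUCTURE method cited on the slide; the function name `ld_r2` and the 0/1 allele coding are assumptions made for this example, and multi-allelic SSRs would need a generalized statistic.

```python
def ld_r2(haplotypes):
    """Return (D, r2) for two biallelic loci.

    haplotypes: list of (a, b) tuples, one per inbred accession or phased
    gamete, where a and b are 0/1 allele codes at locus A and locus B
    (hypothetical coding chosen for this sketch).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # D: departure from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d, d * d / denom


# Alleles always travel together: complete association between the loci.
d, r2 = ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)])
print(d, r2)
```

When alleles at the two loci are independent, D and r2 are both zero; unlinked loci that still show r2 well above zero across a collection are one hint of population structure.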

Oryza accessions used in this study
- O. sativa - 380 accessions: 145 US elites & 235 landraces
- O. glaberrima - 198 landrace accessions
- Wild species - 63 accessions: 2 target species, 4 outgroups

[Figure: rice chromosomes 1-12]

Genotypes:
- O. sativa - 169 nuclear SSRs
- O. glaberrima - 98 nuclear SSRs
- Wild species - 60 nuclear SSRs
- 2 chloroplast sequences

Chloroplast sequence analysis
- A deletion in the linker sequence between ribosomal protein genes rpl16 and rpl14 is associated with indica plastid subtype-identity (PS-ID) (Nakamura et al., 1997)
- A variable SSR motif is found in the ORF 100 fragment
- Chloroplasts are maternally inherited, do not generally recombine & have a lower mutation rate than nuclear loci
- Chloroplast haplotypes provide a long-term record of maternal genealogy, a measure of maternal diversity & evidence of sub-group hybridization

Statistical analysis
- Genetic distance (CS Chord) (Cavalli-Sforza & Edwards, 1967)
- Neighbor-joining (PowerMarker) (K Liu & S Muse)
- Model-based (STRUCTURE) (Pritchard et al., 2000)
- An accession was assigned to a population if the inferred proportion of its ancestry in that population was > 0.8 (for O. sativa) or > 0.75 (for O. glaberrima)
- Fst
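The assignment rule above can be sketched as follows. This is a minimal illustration, assuming the ancestry proportions for one accession are already available as a dictionary; the function name `assign_subpopulation` and the "admixed" label for accessions below the cutoff are hypothetical and do not come from the slides.

```python
def assign_subpopulation(ancestry, threshold=0.8):
    """Assign an accession to the sub-population with the largest ancestry.

    ancestry: dict mapping sub-population name -> inferred ancestry proportion.
    threshold: 0.8 was used for O. sativa and 0.75 for O. glaberrima.
    Returns the sub-population name, or "admixed" (a label assumed here)
    when no proportion exceeds the threshold.
    """
    name, proportion = max(ancestry.items(), key=lambda item: item[1])
    return name if proportion > threshold else "admixed"


print(assign_subpopulation({"indica": 0.92, "aus": 0.08}))
print(assign_subpopulation({"indica": 0.60, "aus": 0.40}))
```

The comparison is strict (`>`), matching the "> 0.8" wording, so an accession with exactly 0.8 ancestry is left unassigned.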

Evidence of extensive and ancient population structure in O. sativa

Five major sub-populations were detected in the collection of 235 landrace varieties using both distance and model-based methods:
- Indica (79 accessions)
- Aus (20 accessions)
- Tropical japonica (44 accessions)
- Temperate japonica (46 accessions)
- Aromatic (19 accessions)

[Figure: Five sub-populations in O. sativa: Indica, Tropical japonica, Aus, Temperate japonica, Aromatic]

Relative estimates of average total polymorphism among sub-populations

[Figure: number of alleles, PIC values and He values for Indica, Aus, Tropical japonica, Temperate japonica and Aromatic; y-axis 0 to 0.7]

Partitioning of nuclear allelic diversity
- Aus and indica embody more diversity than the japonicas.
- The temperate japonicas had 15 monomorphic loci and, in each case, the allele represented was the most frequent allele among the tropical japonicas. This suggests they may be derived from tropical japonica.
- The aromatics had 21 monomorphic loci, and 15 of them represented the most frequent allele in the tropical japonicas; many were also the same monomorphic alleles found in the temperate japonicas. However, they harbored many population-specific alleles as well.
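The slides do not say how He and PIC were estimated. The sketch below uses the standard textbook definitions, He = 1 - sum(p_i^2) and PIC = He - sum over i<j of 2 p_i^2 p_j^2 (Botstein et al., 1980), applied to the allele frequencies at one locus; the function names are chosen for this example only.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He (gene diversity) at one locus from its allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content at one locus (Botstein et al., 1980)."""
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


# A monomorphic locus carries no information; more even frequencies raise both values.
for freqs in ([1.0], [0.5, 0.5], [0.25, 0.25, 0.25, 0.25]):
    print(freqs, expected_heterozygosity(freqs), pic(freqs))
```

Averaging these per-locus values over all SSRs within a sub-population would give summary figures of the kind plotted above, under the stated assumption about the estimators.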

Population bottlenecks

Differences in standardized allele lengths, Stepwise Mutation Model (Vigouroux et al., 2003)

                 Indica   Aus         Aromatic    Trop japonica   Temp japonica
Indica           ~        0.48 (NS)   1.01 (NS)   4.07*           6.22*
Aus                       ~           1.18 (NS)   6.65*           11.47*
Aromatic                              ~           6.40*           13.13*
Trop japonica                                     ~               9.31*
Temp japonica                                                     ~

* = p < 0.0001; NS = not significant
Used 60 SSRs showing stepwise mutation

Derived states
- Longer average allele sizes suggest more recent evolution due to an upwards bias in repeat number of hypervariable SSRs with time.
- Allele lengths in tropical and temperate japonica were significantly longer than in indica, aus and aromatic.
- Average allele size in temperate japonica is significantly greater than in tropical japonica.
- Supports the hypothesis that the temperate group is derived from the tropical japonica population.

Progenitors
- What are the immediate ancestors of the indica and japonica groups?
- Is there evidence of pre-domestication differentiation of indica and japonica within O. rufipogon?
- What can the evolutionary history of O. sativa and its closest ancestors tell us about where to find useful alleles and allele combinations?

Wild Germplasm
- 35 O. rufipogon
- 20 O. nivara
- 2 O. meridionalis
- 60 nuclear SSRs

[Figure: PCA analysis of O. rufipogon, O. nivara and O. sativa]

[Figure: PCA analysis of O. sativa, O. rufipogon and O. nivara; groups labeled Indica, Aus, Aromatic, Tropical japonica, Temperate japonica, O. rufipogon and O. nivara; scale 0.1]

[Figure: panels for K=2 and K=7; groups labeled Indica-like, Aus-like, O. rufipogon I, O. rufipogon II, O. nivara I, O. nivara II and Basal group]

View from the chloroplast
- Does chloroplast sequence help clarify the origins of O. sativa sub-populations with respect to O. rufipogon and O. nivara?
- What can we learn about the evolutionary history of cultivated rice and its closest ancestors from chloroplast polymorphisms?

Chloroplast haplotypes overlaid on nuclear SSR clusters in O. sativa
- 8 chloroplast haplotypes clustered into 2 major groups, corresponding to the indica and japonica sub-species

[Figure: nuclear SSR clusters (Indica, Aus, Aromatic, Tropical japonica, Temperate japonica) with chloroplast haplotype legend: ND NI C6 A7 C; ND I C6 A7 C; ND NI C7 A6 C; D NI C8 A8 T; D NI C8 A8 C; D NI C6 A8 T; D NI C9 A7 T; D NI C7 A7 T; missing data]

Partitioning of chloroplast diversity
- The indica group also harbored the most chloroplast diversity (7/8 haplotypes) and contained all haplotypes found in the japonicas.
- The auses contained 4 haplotypes; most corresponded to the predominant indica haplotype, but the predominant japonica haplotype was also represented.
- The japonicas had only 2 haplotypes, and both were shared between the temperate and tropical groups.
- The aromatics contained both japonica haplotypes, and about 15% contained a unique chloroplast haplotype that distinguished this group.

Wild & cultivated chloroplast haplotype frequencies

[Figure: chloroplast haplotype frequencies (0-100%) in O. nivara, O. rufipogon, Indica, Aus, Aromatic, Tropical japonica and Temperate japonica; legend: D C9A7, D C8A8, D C7A7, D C6A8, ND C7A6, ND C6A7, ND C6A6, ND C5A7]

Chloroplast results
- O. rufipogon has intermediate haplotype frequencies.
- O. nivara has a high frequency of haplotypes not shared with O. sativa.
- The chloroplast haplotypes in O. rufipogon do not correlate with nuclear similarity to O. sativa sub-populations.
- Does this suggest gene flow through pollen? Evidence of recent admixture?

Domestication and dispersal of O. sativa

[Figure: labeled O. rufipogon]

Is population sub-structure observed among elite US varieties?
- 115 US-bred varieties and 30 ancestral accessions (originally introduced from Asia)
- Collection spans varieties released over the last 90 years and represents 90% of all US rice varieties registered in Crop Science between 1965 and 2000
- 169 nuclear SSRs (1 every 10 cM)

Evaluation of elite US varieties

Four time periods were compared:
- 1900-1929 (18 cultivars, imported from Asia)
- 1930-1959 (31 cultivars developed or introduced)
- 1960-1979 (44 cultivars, first introduction of semi-dwarf character)
- 1980-2000 (52 cultivars, many contained sd1)

Grain type extracted from GRIN
Plant stature extracted from Registration in Crop Science

[Figure: clusters I, II and III]

Three major sub-populations in a set of 18 rice cultivars brought to the US from Asia, 1900-1929
- I = Temperate japonica; short-medium grains
- II = Tropical japonica; medium grains
- III = Tropical japonica; long grains

Within- & between-population crosses

[Figure: cross types for each group; legend categories for Temperate japonica: 1x1, 1x2, 1x4, 1x?; for Tropical japonica, medium grain: 2x2, 2x1, 2x3, 2x3x1, 2x?; for Tropical japonica, long grain: 3x3, 3x2, 3x1, 3x1x2, 3x4, 3x?]

- Temperate japonica; 24 varieties since 1930: 63% Group 1 x 1, 13% Group 1 x 2, 8% Group 1 x indica
- Tropical japonica, medium grain; 22 varieties since 1930: 36% Group 2 x 2, 27% Group 2 x 3, 18% Group 2 x 1
- Tropical japonica, long grain; 69 varieties since 1930: 61% Group 3 x 3, 12% Group 3 x 2, 9% Group 3 x indica

How large are the linkage blocks in rice?
- How much recombination has there been in 10,000 years of domestication and breeding? In 90-100 years of rice improvement in the US?
- Can we use this SSR dataset (169 markers) to determine how extensive the linkage blocks are in landraces and in elite US cultivars?
- Dilday suggested that all US Tropical japonica (Southern rice belt) cultivars trace back to 22 rice introductions, and Temperate japonica cultivars trace back to 23 introductions (~ 7 in common). Does this bottleneck affect LD decay?

Estimating LD decay
- Instead of using unlinked markers, LD decay is evaluated by looking at markers known to be linked in specific regions of the genome.
- At what physical or genetic distance along the chromosome do markers cease to be inherited as a linked unit (rate of LD decay)?
- How close do markers need to be to reliably predict the presence of a gene controlling a phenotype?
- What resolution can be expected from association mapping using different groups of germplasm and targeting different regions of the genome?

At what distance does LD decay in a collection of Aus landraces?
- The location of the xa5 resistance gene had been narrowed to a 90 kb BAC by high resolution mapping (using 3,000 F2 segregants from the IR24 x IRBB5 population) (Blair et al., 2003)
- Wanted to determine whether we could narrow it down further with association mapping based on only 114 Aus accessions
- SNPs were evaluated every 5 kb across the 90 kb region
- All Aus accessions were inoculated with Xoo and evaluated for lesion length
- Which genes/markers were significantly associated with resistance?
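One simple way to turn pairwise LD values into a "distance at which LD decays" is to bin marker pairs by distance and report the first bin whose mean r2 drops below a cutoff. The sketch below assumes this binning approach; it is not necessarily the procedure used in the study, and the function name, the bin width and the 0.1 cutoff are choices made for the example.

```python
from collections import defaultdict


def ld_decay_distance(pairs, bin_width, threshold=0.1):
    """Estimate the distance at which LD has decayed.

    pairs: iterable of (distance, r2) for pairs of linked markers, with
           distance in any consistent unit (kb or cM).
    Returns the lower edge of the first distance bin whose mean r2 is below
    the threshold, or None if no bin falls below it.
    """
    bins = defaultdict(list)
    for distance, r2 in pairs:
        bins[int(distance // bin_width)].append(r2)
    for index in sorted(bins):
        values = bins[index]
        if sum(values) / len(values) < threshold:
            return index * bin_width
    return None


# Made-up (distance in kb, r2) values, for illustration only.
example = [(5, 0.9), (12, 0.6), (25, 0.3), (48, 0.2), (80, 0.05), (95, 0.08)]
print(ld_decay_distance(example, bin_width=25))
```

The answer is only as fine as the bin width and marker spacing allow: with SNPs every 5 kb, decay distances shorter than a few kb cannot be resolved.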

What is the resolution of association mapping around the xa5 gene for resistance?
- 26 haplotypes among 114 accessions in a 90 kb region
- LD ~ 75-100 kb within xa5-resistant landrace accessions

[Figure: significance of association for 34 SNP markers on Chr. 5, labeled 45216 through 114461; legend: no test, p<0.01, p<0.005, p<0.001, p<0.0001, p<0.00001, p<0.000001, p<0.0000001; x-axis: distance in kb (40-120); candidate gene labels: Put. ABC transporter, TFIIa, Q94HL4, tRNA, Q94HL2, put. kinase, Q94HK8, synthase]

Recombinational vs association mapping
- Association mapping with 114 landraces gave the same resolution as 3,000 F2s evaluated for recombination at xa5, but association mapping required prior knowledge of the chromosomal target region (obtained by linkage mapping)
- Cloning the xa5 gene required fine-mapping using an additional 2,350 recombinants. Association analysis was then used to identify the functional nucleotide polymorphism (FNP) within the TFIIA gene
- Association mapping targets a greater number of alleles, but recombinational fine-mapping offers greater resolution (in rice) and is more amenable to primary QTL analysis, particularly for traits involving epistasis

LD decay expected to be faster in landraces than in elite germplasm
- Many generations of recombination since the last common ancestor increase the rate of LD decay (decrease the size of linkage blocks)
- Rapid LD decay means greater resolution for association mapping, but requires a large number of markers to find markers associated with the trait
- Slow LD decay increases the likelihood that distantly linked markers are co-inherited (i.e., can predict the presence of genes controlling a phenotype of interest)
- Slow LD decay means a smaller number of markers is needed to detect significant genotype-phenotype associations

At what distance does LD decay in a collection of US long-grain Tropical japonica cultivars?
- 1 cM ~ 250 kb

[Figure: Extent of LD on different chromosomes; y-axis: cM (0-60); x-axis: chromosome; series for r2 = 0.06 and r2 = 0.10]

LD decay in elite US cultivars
- How closely linked do markers have to be to show common inheritance in a random set of germplasm?
- Is LD detected in US germplasm using our set of 169 markers? If so, at what distance from each other is linkage detected?
- No decay by distance detected in landraces with 169 SSR markers, but significant decay by distance in elite cultivars with the same SSR markers (less time since last common ancestor = genetic bottleneck)
- Gene discovery unlikely using association mapping in cultivated rice because LD persists beyond the resolution of a single gene

Extent of LD in elite vs. landrace varieties of rice
- In elite US germplasm, LD decays at ~ 5-20 cM (1,250-5,520 kb)
- In landrace varieties, LD decays at ~ 0.25 cM (100 kb), 10-50X faster

[Figure: r2 plotted against distance in cM (elite US germplasm, genome-wide) and against distance in kb (landrace varieties)]

Diversity within and between populations

Pairwise Fst values
- 38% of the variation in rice was due to differences among groups, compared to 8-12% in a comparable sample of diverse maize inbreds (Liu et al., 2005)
- Rice vs. maize: effect of (1) inbreeding vs. out-crossing mating habit and (2) multiple vs. single domestication events
- In rice, 62% of the variation was due to differences within groups, suggesting inherently high levels of diversity

What density of marker coverage is required for efficient LD mapping?
- In the Aus landrace collection, LD was around 100 kb, so to hit each LD block, need 1 marker every 100 kb => 4,300 markers
- In elite US germplasm, LD is expected to be much larger (~20X), so to sample each LD block, need ~ 1 marker every 1,000 kb => 430 markers?
- Blocks of LD are not uniform in size throughout the genome; need more information about recombinational hot & cold spots in different rice populations to determine the best number and distribution of markers
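The marker counts above follow from dividing genome size by LD block size. The sketch below reproduces that arithmetic, assuming a rice genome of about 430,000 kb; that figure is not stated on the slide but is implied by "1 marker every 100 kb => 4,300 markers". The function name is hypothetical.

```python
import math

# Assumption: ~430 Mb genome, implied by 4,300 markers at one per 100 kb.
GENOME_KB = 430_000


def markers_needed(genome_kb, ld_block_kb):
    """Minimum number of evenly spaced markers to place one in every LD block."""
    return math.ceil(genome_kb / ld_block_kb)


print(markers_needed(GENOME_KB, 100))    # Aus landraces, ~100 kb blocks -> 4300
print(markers_needed(GENOME_KB, 1_000))  # elite US germplasm, ~1,000 kb blocks -> 430
```

This treats LD blocks as uniform, which the last bullet above notes is not the case; regions with recombination hot spots would need denser coverage than this estimate suggests.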

For consideration
- Determine population structure in a large, diverse set of landrace and elite materials using ~ 50 SSRs (coordinate with Challenge Program)
- Develop a community information resource with genotypic scores on a selected set of genetically diverse elite materials (using ~500 SSRs)
- Distribute seeds of inbred lines and DNA, and make all polymorphism information public so researchers can add specific markers and phenotypic scores for LD and association studies
- Develop complementary [sib-mated] recombinant inbred (RI) populations for use in high resolution mapping, candidate gene analysis, FNP detection, comparative mapping, etc.

Acknowledgements
Tom Tai, Mark Redus, Amanda Garris, Hong Lu, Jason Coburn, Jeremy Edwards, Steve Kresovich, Neil Rutger

Funding: USDA CSREES, ARS, NRI, NSF Plant Genome, Rockefeller Foundation, PCMB grant and a National Needs Fellowship to AG; IRGC-IRRI for seeds; The Institute for Genomic Diversity, Cornell University for chloroplast and SNP sequence

Pairwise Fst values

                                                Source of variation
Species         Group       # indiv.   # pops   Between   Within
O. sativa       Landraces   235        5        0.38      0.62
O. sativa       Elite US    114        3        0.49      0.51
O. glaberrima   Landraces   108        3        0.14      0.86
Z. mays         Landraces   102        3        0.12      0.88