Multi-environment GWAS and genomic prediction: penalized regression and mixed model approaches


1 Multi-environment GWAS and genomic prediction: penalized regression and mixed model approaches
Montpellier, 9 June 2015
Fred van Eeuwijk & Willem Kruijer
Daniela Bustos Korts, Marcos Malosetti & Martin Boer
WUR-Biometris, Wageningen, The Netherlands

2 WU contribution to EU-DROPS
Development of a multi-environment GWAS and genomic prediction pipeline with facilities for modelling GxE/QTLxE
Incorporation of genetic (phenotyping platforms) and environmental (envirotyping) characterizations in multi-environment models
Predictions for new genotypes in new environments
Generalizations to multi-trait multi-environment models
Additional constraints on genetic correlations between traits and environments
Network models / crop growth models

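3 Two-way fixed genotype x environment interaction
ANOVA: $y_{ij} = \mu + g_i + e_j + ge_{ij} + \varepsilon_{ij}$, with $\mu_{ij} = \mu + g_i + e_j + ge_{ij}$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$
FW (Finlay-Wilkinson): $ge_{ij} = \beta_i E_j + \delta_{ij}$
AMMI: $ge_{ij} = \sum_{a=1}^{A} r_{ai} c_{aj} + \delta_{ij}$
PCA/GGE/SREG: $g_i + ge_{ij} = \sum_{p=1}^{P} r_{pi} c_{pj} + \delta_{ij}$
Factorial regression: $ge_{ij} = \sum_{f=1}^{F} \beta_{fi} z_{fj} + \delta_{ij}$; $ge_{ij} = \sum_{f=1}^{F} x_{fi} \alpha_{fj} + \delta_{ij}$
Grouping: $ge_{ij} = (rc)_{kl} + \delta_{i(k)j(l)}$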
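The AMMI decomposition listed above can be sketched numerically. The following is a minimal illustration, assuming a complete genotype x environment table of means with no missing cells; the function name `ammi` and the simulated data are hypothetical and not taken from the slides. Main effects are removed by row and column means, and the interaction residuals are then decomposed by a singular value decomposition.

```python
import numpy as np

def ammi(y, n_axes=2):
    """AMMI sketch for a complete genotype x environment table y (rows = genotypes)."""
    mu = y.mean()
    g = y.mean(axis=1) - mu                      # genotype main effects g_i
    e = y.mean(axis=0) - mu                      # environment main effects e_j
    ge = y - mu - g[:, None] - e[None, :]        # interaction residuals ge_ij
    u, s, vt = np.linalg.svd(ge, full_matrices=False)
    r = u[:, :n_axes] * s[:n_axes]               # genotypic scores r_ai
    c = vt[:n_axes].T                            # environmental scores c_aj
    delta = ge - r @ c.T                         # remainder delta_ij
    return mu, g, e, r, c, delta

# Simulated 6 genotypes x 4 environments (illustrative values only).
rng = np.random.default_rng(1)
y = rng.normal(size=(6, 4))
mu, g, e, r, c, delta = ammi(y, n_axes=1)
recon = mu + g[:, None] + e[None, :] + r @ c.T + delta
```

With the number of axes set to min(I, J) - 1 the remainder vanishes, because the doubly centred interaction table has at most that rank.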

4 Multi-environment QTL mapping (linkage & LD) & genomic prediction
$y_{ij} = \mu_j + \sum_{q=1}^{Q} x_{iq} \alpha_{jq} + G_{ij} + \varepsilon_{ij}$
Phenotype = environmental mean + environment-specific QTLs + environment-specific polygenic genetic effect + environment-specific error
$\{G_{ij}\} \sim MVN(0, \Sigma)$; $\{\varepsilon_{ij}\} \sim MVN(0, R)$
$\Sigma = \Sigma_G \otimes \Sigma_E$; $\Sigma_G = A$ for association mapping and $\Sigma_G = I$ for QTL mapping; $\Sigma_E = \Gamma\Gamma' + \Psi$
Structures named on the slide: kinship, pedigree, markers ($A = WW'$); group unstructured, smoothed, diagonalised; factor analytic; group unstructured; environmental kinship (Jarquin et al., 2014)
Environment-specific QTLs as a linear function of environmental covariates:
$y_{ij} = \mu_j + \sum_{q=1}^{Q} x_{iq} (\gamma_q + \delta_q z_j) + G_{ij} + \varepsilon_{ij}$
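As a small numerical illustration of the covariance structure on this slide, the sketch below builds a marker-based kinship, a factor-analytic environmental covariance and their Kronecker product. The dimensions, the division of WW' by the number of markers, and all simulated values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_geno, n_env, n_snp, n_factors = 5, 3, 50, 1

# Marker-based kinship from column-centred marker scores (scaling by n_snp is assumed).
W = rng.integers(0, 3, size=(n_geno, n_snp)).astype(float)
W -= W.mean(axis=0)
A = W @ W.T / n_snp

# Factor-analytic environmental covariance: Sigma_E = Gamma Gamma' + Psi.
Gamma = rng.normal(size=(n_env, n_factors))
Psi = np.diag(rng.uniform(0.1, 0.5, size=n_env))
Sigma_E = Gamma @ Gamma.T + Psi

# Covariance of the stacked G_ij, environments varying fastest within genotype:
# Cov(G_ij, G_i'j') = A[i, i'] * Sigma_E[j, j'].
Sigma = np.kron(A, Sigma_E)
```

The factor-analytic form needs n_env * n_factors + n_env parameters instead of the n_env * (n_env + 1) / 2 of an unstructured matrix, which is why it becomes attractive as the number of environments grows.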

5 Some relevant papers
4th Annual meeting, 25th to 29th March 2014, Aachen, Germany

6 Citations to GxE methodology papers: Daniela Bustos

7 Ignacio Romagosa
Steptoe x Morex: QTLxE for grain yield
[Figure: QTL effects along chromosomes 1 to 7, one row per environment (ID91, ID92, MAN92, MTd91, MTd92, MTi91, MTi92, SC03, SC05, SKs92, WA91, WA92); labelled QTLs: Ppd-H1, C3P51, C6P65]
Cross-over interaction: depending on the site, the contribution of the Morex allele can be either positive or negative.
There is a clear genetic control of heading, but depending on the meteorological conditions through grain filling it could be either positive or negative to have early heading.

8 Steptoe x Morex: Factorial Regression
[Figure: QTL effects along the chromosomes, one row per environment (ID91, ID92, MAN92, MTd91, MTd92, MTi91, MTi92, SC03, SC05, SKs92, WA91, WA92)]
[Figure: yield QTL effect (t/ha) of Ppd-H1 plotted against Tdif_H, points labelled by environment; R^2 = 0.784; p = 0.0002]
Ppd-H1 Morex allele (yellow-red): non-responsive at long photoperiod, better adapted to intermediate Tdif_H

9 QTLxE
[Figure: QTL effect plotted against average minimum temperature in period 4; points labelled by environment (NP, WN94, AD94, SV94, GC95, MR, CI95, YK, YA, PR, SV)]

10 Smart tools for Prediction and Improvement of Crop Yield
RIL population (n=149); parents: Yolo Wonder, CM 334
Greenhouse experiments, two locations x two seasons: Netherlands (Wageningen) & Spain (Almeria)
Crop measurements: plant weight (stem, leaves, fruit; initial and final), number of internodes, leaf area (initial and final), fruit harvest (number, weight)
Environment: temperature, radiation

11 Multi-trait multi-environment QTL analysis
[Figure: test profile (-log10(p)) and additive QTL effects along the chromosomes for the MTME analysis; rows are trait x environment combinations for the traits Axl, DWF, DWL, DWS, DWV, INL, LAI, LUE, NF, NI, NLE, pt_frt, pt_leaf, SL and SLA in the environments NL1, NL2, SP1 and SP2]
How useful is multi-trait GWAS?

12 Number of QTLs and explained variance when increasing model complexity, without going to genomic prediction
[Figure 7: Histograms of explained variance by individual QTLs as detected by ME, MT and MTME analyses; panels: ME, MT, MTME; x-axis: percentage of explained variance; y-axis: frequency]
MTME produced far more QTLs than ME and MT, but many of the extra QTLs from MTME have small effects.
Multivariate (association) analysis is more powerful, especially when not all traits are associated with the genetic variant being tested (Stephens, PLOS One, 2013).

13 EU-DROPS example data
253 genotypes
23 environments = management x location x year
15 environmental characterizations/indices
333k SNPs (S. Négro, S. Nicolas, A. Charcosset)
Example trait: grain yield

14 Multi-environment GWAS in DROPS
At the start of DROPS, state-of-the-art multi-trait GWAS (GEMMA): fast GWAS for up to 10 traits
Improvements for DROPS: GWAS for up to 100 traits
Advantages of multi-trait GWAS: more power, higher resolution; allows testing of contrasts for QTLxE
Strategies in GWAS modelling and computation:
Structured models for correlations between genotypes and between environments
Increased speed via diagonalization (Zhou & Stephens, 2014): 12 h for 253 genotypes, 23 environments and 333k SNPs
Improved convergence by EM algorithms (Dahl et al., 2014)
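The speed-up from diagonalization can be illustrated for the simplest case. The sketch below is a single-trait analogue only, with hypothetical variance components and simulated data and with the intercept ignored; the multi-trait algorithm of Zhou & Stephens is more involved. Rotating the data by the eigenvectors of the kinship makes the covariance diagonal, so a generalized least squares estimate per SNP reduces to a weighted sum.

```python
import numpy as np

rng = np.random.default_rng(5)
n, M = 30, 200
W = rng.integers(0, 3, size=(n, M)).astype(float)
W -= W.mean(axis=0)
K = W @ W.T / M                                  # marker-based kinship (assumed scaling)

d, U = np.linalg.eigh(K)                         # K = U diag(d) U'
sigma_g2, sigma_e2 = 1.5, 0.7                    # hypothetical variance components
V = sigma_g2 * K + sigma_e2 * np.eye(n)          # covariance of the phenotype vector
V_rot = U.T @ V @ U                              # diagonal after rotation
w = 1.0 / (sigma_g2 * d + sigma_e2)              # weights on the rotated scale

x = W[:, 0]                                      # one SNP
y = rng.normal(size=n)                           # simulated phenotype

# GLS estimate of the SNP effect, directly and on the rotated scale.
Vinv_x = np.linalg.solve(V, x)
beta_direct = (Vinv_x @ y) / (Vinv_x @ x)
xr, yr = U.T @ x, U.T @ y
beta_rot = np.sum(w * xr * yr) / np.sum(w * xr * xr)
```

The eigendecomposition is done once; each further SNP then costs one rotation and a weighted sum instead of a solve with the full n x n covariance.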

15 Grain yield
$y_{ij} = \mu_j + x_i \alpha_j + G_{ij} + \varepsilon_{ij}$ for each SNP
Test $H_0: \alpha_1 = \alpha_2 = \dots = \alpha_{23} = 0$
When $H_0$ is rejected, regress $\alpha_1, \dots, \alpha_{23}$ on environmental characterizations/contrasts/indices
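The two-step logic on this slide can be sketched as follows. This is a deliberately simplified version that drops the polygenic term $G_{ij}$ and assumes independent errors, so it is not the mixed model used in DROPS; the environmental index `z`, the effect sizes and the dimensions are all simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_geno, n_env = 200, 6
x = rng.integers(0, 3, size=n_geno).astype(float)    # SNP scores 0/1/2 (simulated)
z = np.linspace(-1.0, 1.0, n_env)                    # hypothetical environmental index
alpha_true = 0.5 * z                                 # QTL effect changes sign across environments
mu = rng.normal(5.0, 1.0, size=n_env)
y = mu + np.outer(x, alpha_true) + rng.normal(0.0, 0.5, size=(n_geno, n_env))

# Step 1: per-environment least-squares SNP effects and their standard errors.
xc = x - x.mean()
sxx = xc @ xc
yc = y - y.mean(axis=0)
alpha_hat = xc @ yc / sxx
resid = yc - np.outer(xc, alpha_hat)
sigma2 = (resid ** 2).sum(axis=0) / (n_geno - 2)
se = np.sqrt(sigma2 / sxx)

# Wald-type statistic for H0: alpha_1 = ... = alpha_J = 0
# (approximately chi-square with J degrees of freedom under the independence assumption).
wald = float(((alpha_hat / se) ** 2).sum())

# Step 2: regress the estimated effects on the environmental index.
slope, intercept = np.polyfit(z, alpha_hat, 1)
```

A slope clearly different from zero, as in this simulation, is what a QTLxE effect driven by the environmental index would look like.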

16 Regressions of QTL effects on contrasts/indices/characterizations

17 Multi-environment genomic prediction for DROPS
Implemented the mixed model approach of Dahl et al. (2014)
Fast: $O(np^2 + p^3)$ instead of $O(n^3 p^2)$
QTLs can be included in the fixed part of the model
Predictions for new environments via an environmental relationship matrix created from environmental characterizations
Problems
It is difficult to find an appropriate way of including environmental characterizations in the relationship matrix. Jarquin et al. (2013) use equal weights and include genotypic and environmental main effects, so they model only the $GE_{ij}$ term and not $G_{ij}$.
Genotypic main effect $G_i$: include an explicit genotypic main effect, or model it with main-effect QTLs? In the latter case, how many SNPs should go in the fixed part?
For the random part, should all SNPs have equal variance? See BayesRC and MultiBLUP approaches for alternatives.
For the random part, should all SNPs have equal weight?
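One simple way to turn environmental characterizations into a relationship matrix is sketched below, using equal weights for all covariates, the choice the slide flags as questionable. The function name `environmental_kinship`, the standardization and the scaling by the number of covariates are assumptions of this example, and the covariate values are simulated.

```python
import numpy as np

def environmental_kinship(Z):
    """Relationship matrix between environments from standardized covariates, equal weights."""
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0, ddof=1)
    return Zs @ Zs.T / Z.shape[1]

rng = np.random.default_rng(7)
Z = rng.normal(size=(23, 15))        # 23 environments x 15 indices (simulated values)
K_E = environmental_kinship(Z)
```

Unequal weights would amount to rescaling the columns of the standardized matrix before taking the cross-product, which is where the open question on the slide enters.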

18 Genomic prediction by penalized regression
Elastic net: combination of Lasso (L1 penalty) and ridge regression (L2 penalty)
Problem: 333k SNPs x 15 environmental indices produce more than 5 million interaction terms
SNPs are chosen on the basis of single-environment GWAS scans, plus 10 marker-based principal components to correct for kinship/population structure = about 500 genetic predictors
$y_{ij} = \sum_{l=1}^{m} X^{G}_{i,l} \beta^{G}_{l} + \sum_{k=1}^{q} X^{E}_{j,k} \beta^{E}_{k} + \sum_{l=1,k=1}^{m,q} (X^{G}_{i,l} X^{E}_{j,k}) \beta^{GE}_{lk} + \varepsilon_{ij}$
Different penalties on $\beta^{G}_{l}$, $\beta^{E}_{k}$ and $\beta^{GE}_{lk}$
[Figure: accuracies for leave-one-environment-out prediction]
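The structure of this regression can be sketched on a toy scale. The code below builds the genetic, environmental and interaction columns for every genotype x environment combination and fits an elastic net by plain coordinate descent with a separate penalty per block. The solver `elastic_net`, the penalty values and the simulated data are illustrative assumptions, not the implementation used for DROPS.

```python
import numpy as np

def elastic_net(X, y, lam, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + sum_j lam_j*(a*|b_j| + (1-a)/2*b_j^2)."""
    n, p = X.shape
    lam = np.broadcast_to(np.asarray(lam, dtype=float), (p,))
    b = np.zeros(p)
    r = y.astype(float).copy()                   # current residual y - Xb
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            rho = X[:, j] @ r / n + col_ss[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam[j] * l1_ratio, 0.0)
            new /= col_ss[j] + lam[j] * (1.0 - l1_ratio)
            r -= X[:, j] * (new - b[j])
            b[j] = new
    return b

rng = np.random.default_rng(3)
n_geno, n_env, m, q = 40, 6, 4, 2
XG = rng.normal(size=(n_geno, m))                # genetic predictors (simulated)
zz = np.linspace(-1.0, 1.0, n_env)
XE = np.column_stack([zz, zz ** 2])              # two hypothetical environmental indices
XE -= XE.mean(axis=0)

# Long format: one row per genotype x environment combination.
i_idx, j_idx = np.divmod(np.arange(n_geno * n_env), n_env)
G, E = XG[i_idx], XE[j_idx]
GE = (G[:, :, None] * E[:, None, :]).reshape(len(G), m * q)   # column l*q + k is G_l * E_k
X = np.hstack([G, E, GE])

beta = np.zeros(m + q + m * q)
beta[0], beta[m + q] = 1.0, 0.8                  # one main-effect and one interaction signal
y = X @ beta + rng.normal(0.0, 0.1, size=len(X))

# Block-specific penalties: lighter on main effects, heavier on interactions.
lam = np.r_[np.full(m, 0.005), np.full(q, 0.005), np.full(m * q, 0.02)]
Xc, yc = X - X.mean(axis=0), y - y.mean()
b_hat = elastic_net(Xc, yc, lam)
```

With about 500 genetic predictors and 15 indices, as on the slide, the interaction block alone would have about 7,500 columns, which is why the prior selection of SNPs matters.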

19 Some conclusions for DROPS multi-trait GWAS and genomic prediction
Algorithms for fitting multi-trait mixed models developed and implemented
Successfully tested for GWAS (power & resolution)
Faster than ASReml-R/Genstat, more traits than GEMMA, no convergence problems
Environment-specific QTL effects regressed on environmental characterizations
Further work necessary for defining environmental relations in multi-environment genomic prediction
Incorporation of phenotyping platform data

20 Extra slides 1 Fred van Eeuwijk

21 Learning from Nature
Genetic architecture underlying resistance to biotic and abiotic stresses in Arabidopsis: a multi-trait genome-wide association mapping approach
Fred van Eeuwijk, Marcel Dicke
Silvia Coolen, Nelson Davila Olivas, Pingping Huang, Karen Kloth, Manus Thoen
Willem Kruijer, Joost van Heerwaarden

22 Learning From Nature
Introduction
Material
Plants exposed to biotic and abiotic stresses, including combined stresses
Abiotic: drought, salt, osmotic, heat
Biotic: parasitic plant, phloem-feeding aphid, phloem-feeding whitefly, cell-content-feeding thrips, leaf-chewing caterpillar, necrotrophic fungus
Combined: fungus & caterpillar, drought & fungus, drought & caterpillar, caterpillar & osmotic
350 Arabidopsis accessions phenotyped in multiple experiments
250k SNPs

23 Methods
Multi-trait mixed model
Genetic correlations: bivariate mixed models for pairs of traits, with structuring of random genotype effects by kinship calculated from the full set of SNPs = covariance for genomic breeding values
QTLs/candidate genes: multi-trait mixed model GWAS
Hypothesis testing within the GWAS model context: biotic vs abiotic stresses

24 Genomic (upper right) and phenotypic (lower left) correlations
Phenotypic correlations are low; genetic correlations show an enriched signal

25 Genomic correlations between traits (bivariate G-BLUP)
Structuring of trait/environmental correlations
$y_{it} = \mu_t + G_{it} + \varepsilon_{it}$; $G_{it} = \sum_{m=1}^{M} x_{im} a_{tm}$
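Writing the polygenic term as a sum over SNP effects is what links marker-effect models to kinship-based G-BLUP. For a single trait this equivalence can be checked numerically, as in the sketch below; the ridge parameter `lam`, the kinship scaling by the number of markers and the simulated data are assumptions of the example, and the bivariate case on the slide adds a trait covariance on top of this.

```python
import numpy as np

rng = np.random.default_rng(11)
n, M, lam = 20, 100, 50.0
X = rng.integers(0, 3, size=(n, M)).astype(float)
X -= X.mean(axis=0)                              # centred marker scores x_im
y = rng.normal(size=n)
y -= y.mean()

# SNP-BLUP: ridge estimates of the marker effects a_m, then G_i = sum_m x_im a_m.
a_hat = np.linalg.solve(X.T @ X + lam * np.eye(M), X.T @ y)
g_snp = X @ a_hat

# G-BLUP: the same genomic values from the n x n kinship K = XX'/M.
K = X @ X.T / M
g_kin = K @ np.linalg.solve(K + (lam / M) * np.eye(n), y)
```

The kinship form solves an n x n system instead of an M x M one, which is the cheaper route when markers far outnumber genotypes.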

26 Multi-trait GWAS
$y_{it} = \mu_t + \sum_{m=1}^{Q} x_{im} a_{tm} + G_{it} + \varepsilon_{it}$

27 Consistent QTLs/genes
$y_{it} = \mu_t + \sum_{m=1}^{Q} x_{im} a_{tm} + G_{it} + \varepsilon_{it}$ with $a_{tm} = a_m$

28 Biotic vs abiotic stress
$y_{it} = \mu_t + \sum_{m=1}^{Q} x_{im} a_{tm} + G_{it} + \varepsilon_{it}$ with $a_{tm} = a_{t,\mathrm{abiotic}}$ or $a_{tm} = a_{t,\mathrm{biotic}}$

29 Extra slides 2 Fred van Eeuwijk

30 Crop growth model pepper (integration over time)
yield = f(hi, LAI-rate, LUE; Temperature, Light) + error
$y_{ij}^{target} = \int f(y_i^{components}; z_j)\,dt + e_{ij}$
$y_{ij} = \sum_{t=t_0}^{t_f} \left(1 - \exp(-K_i\,LAI_{i,j,t})\right) I_{j,t}\; LUE_{i,j} \cdots + \varepsilon_{i,j}$, with further factors involving $FTF_i$, $W_i$, $FDMC_i$, $T_j$ and $T_{FTF}$
Genomic prediction for yield obtained by inserting genomic predictions for components into the crop growth model was not better than genomic prediction for yield from yield itself

31 Accuracy of yield predicted from QTLs/GP for components, within environment (CV)
[Table: rows = environment (SP1, SP2) by approach (Direct, CGM); columns = strategy (QTL_ST, BVS, BRR, BLA, BLV, QTL_MT); cell values not preserved]

32 Accuracy of yield from QTLs/GP for components, across environments (CV)
[Table: rows = SP1::SP2 (CGM) and SP2::SP1 (CGM); columns = strategy (QTL_ST, BVS, BRR, BLA, BLV, QTL_MT); cell values not preserved]

33 Causal structure CGM and reconstructions in Spain 1 and 2
Sparse inverse environmental kinship
Crop growth model and causal reconstructions of networks for yield and its components define the inverse environmental kinship/correlation matrix (compare pedigree for genotypes). The result is a sparse representation of the inverse of the environmental kinship, useful for speeding up multi-environment GWAS/genomic prediction (Willem Kruijer).


More information

=, v T =(e f ) e f B =

=, v T =(e f ) e f B = A Quick Refresher of Basic Matrix Algebra Matrices and vectors and given in boldface type Usually, uppercase is a matrix, lower case a vector (a matrix with only one row or column) a b e A, v c d f The

More information

HERITABILITY ESTIMATION USING A REGULARIZED REGRESSION APPROACH (HERRA)

HERITABILITY ESTIMATION USING A REGULARIZED REGRESSION APPROACH (HERRA) BIRS 016 1 HERITABILITY ESTIMATION USING A REGULARIZED REGRESSION APPROACH (HERRA) Malka Gorfine, Tel Aviv University, Israel Joint work with Li Hsu, FHCRC, Seattle, USA BIRS 016 The concept of heritability

More information

A General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations

A General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations A General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations Joint work with Karim Oualkacha (UQÀM), Yi Yang (McGill), Celia Greenwood

More information

1 Springer. Nan M. Laird Christoph Lange. The Fundamentals of Modern Statistical Genetics

1 Springer. Nan M. Laird Christoph Lange. The Fundamentals of Modern Statistical Genetics 1 Springer Nan M. Laird Christoph Lange The Fundamentals of Modern Statistical Genetics 1 Introduction to Statistical Genetics and Background in Molecular Genetics 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

More information

Genotype-Environment Effects Analysis Using Bayesian Networks

Genotype-Environment Effects Analysis Using Bayesian Networks Genotype-Environment Effects Analysis Using Bayesian Networks 1, Alison Bentley 2 and Ian Mackay 2 1 scutari@stats.ox.ac.uk Department of Statistics 2 National Institute for Agricultural Botany (NIAB)

More information

BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014

BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014 BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014 Homework 4 (version 3) - posted October 3 Assigned October 2; Due 11:59PM October 9 Problem 1 (Easy) a. For the genetic regression model: Y

More information

Calculation of IBD probabilities

Calculation of IBD probabilities Calculation of IBD probabilities David Evans and Stacey Cherny University of Oxford Wellcome Trust Centre for Human Genetics This Session IBD vs IBS Why is IBD important? Calculating IBD probabilities

More information

Expression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia

Expression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia Expression QTLs and Mapping of Complex Trait Loci Paul Schliekelman Statistics Department University of Georgia Definitions: Genes, Loci and Alleles A gene codes for a protein. Proteins due everything.

More information

SoyBase, the USDA-ARS Soybean Genetics and Genomics Database

SoyBase, the USDA-ARS Soybean Genetics and Genomics Database SoyBase, the USDA-ARS Soybean Genetics and Genomics Database David Grant Victoria Carollo Blake Steven B. Cannon Kevin Feeley Rex T. Nelson Nathan Weeks SoyBase Site Map and Navigation Video Tutorials:

More information

Semi-Penalized Inference with Direct FDR Control

Semi-Penalized Inference with Direct FDR Control Jian Huang University of Iowa April 4, 2016 The problem Consider the linear regression model y = p x jβ j + ε, (1) j=1 where y IR n, x j IR n, ε IR n, and β j is the jth regression coefficient, Here p

More information

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 18: Introduction to covariates, the QQ plot, and population structure II + minimal GWAS steps Jason Mezey jgm45@cornell.edu April

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics and Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman [ Teaching Miscellaneous lectures]

More information

Lecture 4: Allelic Effects and Genetic Variances. Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013

Lecture 4: Allelic Effects and Genetic Variances. Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013 Lecture 4: Allelic Effects and Genetic Variances Bruce Walsh lecture notes Tucson Winter Institute 7-9 Jan 2013 1 Basic model of Quantitative Genetics Phenotypic value -- we will occasionally also use

More information

RFLP facilitated analysis of tiller and leaf angles in rice (Oryza sativa L.)

RFLP facilitated analysis of tiller and leaf angles in rice (Oryza sativa L.) Euphytica 109: 79 84, 1999. 1999 Kluwer Academic Publishers. Printed in the Netherlands. 79 RFLP facilitated analysis of tiller and leaf angles in rice (Oryza sativa L.) Zhikang Li 1,2,3, Andrew H. Paterson

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl Broman Biostatistics and Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman Backcross P 1 P 2 P 1 F 1 BC 4

More information

Nature Genetics: doi: /ng Supplementary Figure 1. The phenotypes of PI , BR121, and Harosoy under short-day conditions.

Nature Genetics: doi: /ng Supplementary Figure 1. The phenotypes of PI , BR121, and Harosoy under short-day conditions. Supplementary Figure 1 The phenotypes of PI 159925, BR121, and Harosoy under short-day conditions. (a) Plant height. (b) Number of branches. (c) Average internode length. (d) Number of nodes. (e) Pods

More information

Chapter 2 Section 1 discussed the effect of the environment on the phenotype of individuals light, population ratio, type of soil, temperature )

Chapter 2 Section 1 discussed the effect of the environment on the phenotype of individuals light, population ratio, type of soil, temperature ) Chapter 2 Section 1 discussed the effect of the environment on the phenotype of individuals light, population ratio, type of soil, temperature ) Chapter 2 Section 2: how traits are passed from the parents

More information

Introduction to population genetics & evolution

Introduction to population genetics & evolution Introduction to population genetics & evolution Course Organization Exam dates: Feb 19 March 1st Has everybody registered? Did you get the email with the exam schedule Summer seminar: Hot topics in Bioinformatics

More information

Efficient Bayesian mixed model analysis increases association power in large cohorts

Efficient Bayesian mixed model analysis increases association power in large cohorts Linear regression Existing mixed model methods New method: BOLT-LMM Time O(MM) O(MN 2 ) O MN 1.5 Corrects for confounding? Power Efficient Bayesian mixed model analysis increases association power in large

More information

Lecture 6: Introduction to Quantitative genetics. Bruce Walsh lecture notes Liege May 2011 course version 25 May 2011

Lecture 6: Introduction to Quantitative genetics. Bruce Walsh lecture notes Liege May 2011 course version 25 May 2011 Lecture 6: Introduction to Quantitative genetics Bruce Walsh lecture notes Liege May 2011 course version 25 May 2011 Quantitative Genetics The analysis of traits whose variation is determined by both a

More information

UNIT 8 BIOLOGY: Meiosis and Heredity Page 148

UNIT 8 BIOLOGY: Meiosis and Heredity Page 148 UNIT 8 BIOLOGY: Meiosis and Heredity Page 148 CP: CHAPTER 6, Sections 1-6; CHAPTER 7, Sections 1-4; HN: CHAPTER 11, Section 1-5 Standard B-4: The student will demonstrate an understanding of the molecular

More information

Multivariate Analysis

Multivariate Analysis Prof. Dr. J. Franke All of Statistics 3.1 Multivariate Analysis High dimensional data X 1,..., X N, i.i.d. random vectors in R p. As a data matrix X: objects values of p features 1 X 11 X 12... X 1p 2.

More information

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.501.01 Lecture11: Quantitative Genomics II Jason Mezey jgm45@cornell.edu March 7, 019 (Th) 10:10-11:5 Announcements Homework #5 will be posted by

More information

Latent Variable Methods for the Analysis of Genomic Data

Latent Variable Methods for the Analysis of Genomic Data John D. Storey Center for Statistics and Machine Learning & Lewis-Sigler Institute for Integrative Genomics Latent Variable Methods for the Analysis of Genomic Data http://genomine.org/talks/ Data m variables

More information

Quantitative Genetics & Evolutionary Genetics

Quantitative Genetics & Evolutionary Genetics Quantitative Genetics & Evolutionary Genetics (CHAPTER 24 & 26- Brooker Text) May 14, 2007 BIO 184 Dr. Tom Peavy Quantitative genetics (the study of traits that can be described numerically) is important

More information

Mixed-Models. version 30 October 2011

Mixed-Models. version 30 October 2011 Mixed-Models version 30 October 2011 Mixed models Mixed models estimate a vector! of fixed effects and one (or more) vectors u of random effects Both fixed and random effects models always include a vector

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

6. Regularized linear regression

6. Regularized linear regression Foundations of Machine Learning École Centrale Paris Fall 2015 6. Regularized linear regression Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe agathe.azencott@mines paristech.fr

More information

Lecture 3. Introduction on Quantitative Genetics: I. Fisher s Variance Decomposition

Lecture 3. Introduction on Quantitative Genetics: I. Fisher s Variance Decomposition Lecture 3 Introduction on Quantitative Genetics: I Fisher s Variance Decomposition Bruce Walsh. Aug 004. Royal Veterinary and Agricultural University, Denmark Contribution of a Locus to the Phenotypic

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

MIXED MODELS THE GENERAL MIXED MODEL

MIXED MODELS THE GENERAL MIXED MODEL MIXED MODELS This chapter introduces best linear unbiased prediction (BLUP), a general method for predicting random effects, while Chapter 27 is concerned with the estimation of variances by restricted

More information

7.2: Natural Selection and Artificial Selection pg

7.2: Natural Selection and Artificial Selection pg 7.2: Natural Selection and Artificial Selection pg. 305-311 Key Terms: natural selection, selective pressure, fitness, artificial selection, biotechnology, and monoculture. Natural Selection is the process

More information

Genotyping strategy and reference population

Genotyping strategy and reference population GS cattle workshop Genotyping strategy and reference population Effect of size of reference group (Esa Mäntysaari, MTT) Effect of adding females to the reference population (Minna Koivula, MTT) Value of

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA

More information

Eiji Yamamoto 1,2, Hiroyoshi Iwata 3, Takanari Tanabata 4, Ritsuko Mizobuchi 1, Jun-ichi Yonemaru 1,ToshioYamamoto 1* and Masahiro Yano 5,6

Eiji Yamamoto 1,2, Hiroyoshi Iwata 3, Takanari Tanabata 4, Ritsuko Mizobuchi 1, Jun-ichi Yonemaru 1,ToshioYamamoto 1* and Masahiro Yano 5,6 Yamamoto et al. BMC Genetics 2014, 15:50 METHODOLOGY ARTICLE Open Access Effect of advanced intercrossing on genome structure and on the power to detect linked quantitative trait loci in a multi-parent

More information

Test for interactions between a genetic marker set and environment in generalized linear models Supplementary Materials

Test for interactions between a genetic marker set and environment in generalized linear models Supplementary Materials Biostatistics (2013), pp. 1 31 doi:10.1093/biostatistics/kxt006 Test for interactions between a genetic marker set and environment in generalized linear models Supplementary Materials XINYI LIN, SEUNGGUEN

More information

Lecture 2: Linear and Mixed Models

Lecture 2: Linear and Mixed Models Lecture 2: Linear and Mixed Models Bruce Walsh lecture notes Introduction to Mixed Models SISG, Seattle 18 20 July 2018 1 Quick Review of the Major Points The general linear model can be written as y =

More information

A recipe for the perfect salsa tomato

A recipe for the perfect salsa tomato The National Association of Plant Breeders in partnership with the Plant Breeding and Genomics Community of Practice presents A recipe for the perfect salsa tomato David Francis, The Ohio State University

More information