Multi-environment GWAS and genomic prediction: penalized regression and mixed model approaches
- Kathleen Lloyd
- 5 years ago
1 Multi-environment GWAS and genomic prediction: penalized regression and mixed model approaches Montpellier, 9 June 2015 Fred van Eeuwijk & Willem Kruijer Daniela Bustos Korts, Marcos Malosetti, & Martin Boer WUR-Biometris, Wageningen, The Netherlands
2 WU contribution to EU-DROPS
- Development of a multi-environment GWAS and genomic prediction pipeline with facilities for modelling GxE/QTLxE
- Incorporation of genetic (phenotyping platforms) and environmental characterizations (envirotyping) in multi-environment models
- Predictions for new genotypes in new environments
- Generalizations to multi-trait multi-environment models
- Additional constraints on genetic correlations between traits and environments
- Network models / crop growth models
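3 Two-way fixed genotype x environment interaction
ANOVA: $y_{ij} = \mu + g_i + e_j + ge_{ij} + \varepsilon_{ij}$, with $\mu_{ij} = \mu + g_i + e_j + ge_{ij}$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$
FW: $ge_{ij} = \beta_i E_j + \delta_{ij}$
AMMI: $ge_{ij} = \sum_{a=1}^{A} r_{ai} c_{aj} + \delta_{ij}$
PCA/GGE/SREG: $g_{ij} = \sum_{p=1}^{P} r_{pi} c_{pj} + \delta_{ij}$
Factorial regression: $ge_{ij} = \sum_{f=1}^{F} \beta_{fi} z_{fj} + \delta_{ij}$; $ge_{ij} = \sum_{f=1}^{F} x_{fi} \alpha_{fj} + \delta_{ij}$
Grouping: $ge_{ij} = rc_{kl} + \delta_{i(k)j(l)}$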
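As an illustration of the AMMI line on this slide, the following is a minimal sketch, not the authors' implementation. It assumes a complete, balanced table of genotype-by-environment means and uses a singular value decomposition of the interaction residuals. The function name `ammi_decomposition` and the simulated data are hypothetical.

```python
import numpy as np

def ammi_decomposition(y, n_axes):
    """Additive main effects plus a rank-n_axes multiplicative interaction."""
    mu = y.mean()
    g = y.mean(axis=1) - mu                   # genotype main effects g_i
    e = y.mean(axis=0) - mu                   # environment main effects e_j
    ge = y - mu - g[:, None] - e[None, :]     # interaction residuals ge_ij
    u, s, vt = np.linalg.svd(ge, full_matrices=False)
    r = u[:, :n_axes] * s[:n_axes]            # genotypic scores (singular values absorbed here)
    c = vt[:n_axes, :].T                      # environmental scores
    delta = ge - r @ c.T                      # remainder delta_ij
    return mu, g, e, r, c, delta

# Hypothetical 6 genotypes x 4 environments table of means
rng = np.random.default_rng(0)
y = rng.normal(size=(6, 4))
mu, g, e, r, c, delta = ammi_decomposition(y, n_axes=2)
fitted = mu + g[:, None] + e[None, :] + r @ c.T + delta
```

Putting the singular values into the genotypic scores is one of several possible scalings; biplots often split them between genotypes and environments instead.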
4 Multi-environment QTL mapping (linkage & LD) & genomic prediction
$y_{ij} = \mu_j + \sum_{q=1}^{Q} x_{iq} \alpha_{jq} + G_{ij} + \varepsilon_{ij}$
Phenotype = environmental mean + environment-specific QTLs + environment-specific polygenic genetic effect + environment-specific error
$\{G_{ij}\} \sim MVN(0, \Sigma)$; $\{\varepsilon_{ij}\} \sim MVN(0, R)$
$\Sigma = \Sigma_G \otimes \Sigma_E$; $\Sigma_G = A$ for association mapping and $\Sigma_G = I$ for QTL mapping; $\Sigma_E = \Gamma\Gamma' + \Psi$
Structuring options noted on the slide: kinship, pedigree, markers ($A = WW'$); group unstructured, smoothed, diagonalised factor analytic; group unstructured, environmental kinship (Jarquin et al., 2014)
Environment-specific QTLs as a linear function of environmental covariates: $y_{ij} = \mu_j + \sum_{q=1}^{Q} x_{iq} (\gamma_q + \delta_q z_j) + G_{ij} + \varepsilon_{ij}$
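The covariance structure on this slide can be made concrete with a small numerical sketch. It assumes a marker-based kinship $A = WW'$ (scaled here by the number of markers, which is an added convention, not something stated on the slide) and a one-factor factor-analytic environmental covariance. All dimensions and simulated values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_geno, n_env, n_markers, n_factors = 5, 3, 50, 1

# Marker-based kinship; the division by n_markers is an assumption of this sketch
W = rng.choice([-1.0, 1.0], size=(n_geno, n_markers))
A = W @ W.T / n_markers

# Factor-analytic environmental covariance: Sigma_E = Gamma Gamma' + Psi
Gamma = rng.normal(size=(n_env, n_factors))
Psi = np.diag(rng.uniform(0.1, 0.5, size=n_env))
Sigma_E = Gamma @ Gamma.T + Psi

# Covariance of the stacked effects G_ij, ordered genotype by genotype,
# with environments running fastest within each genotype
Sigma = np.kron(A, Sigma_E)
```

With this ordering, the covariance between genotype `i` in environment `k` and genotype `j` in environment `l` is `A[i, j] * Sigma_E[k, l]`.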
5 Some relevant papers 4th Annual meeting, 25th-29th March 2014, Aachen, Germany
6 Citations to GxE methodology papers: Daniela Bustos
7 Ignacio Romagosa Steptoe x Morex: QTLxE for grain yield
[Figure: QTL effects along the chromosomes (labelled loci Ppd-H1, C3P51, C6P65) for environments ID91, ID92, MAN92, MTd91, MTd92, MTi91, MTi92, SC03, SC05, SKs92, WA91, WA92]
Cross-over interaction: depending on the site, the contribution of the Morex allele can be either positive or negative.
There is a clear genetic control of heading, but depending on the meteorological conditions through grain filling it could be either positive or negative to have early heading.
8 Steptoe x Morex: Factorial Regression
[Figure: yield QTL effect (t/ha) at Ppd-H1 plotted against Tdif_H for the trial environments; $R^2 = 0.784$; p = 0.0002]
Ppd-H1 Morex allele (yellow-red), non-responsive at long photoperiod, better adapted to intermediate Tdif_H
9 QTL-effect QTLxE
[Figure: QTL effect per environment plotted against average min. temperature in period 4]
10 Smart tools for Prediction and Improvement of Crop Yield
RIL population (n=149); parents: Yolo Wonder, CM 334
Greenhouse experiments, two locations x two seasons: Netherlands (Wageningen) & Spain (Almeria)
Crop measurements: plant weight (stem, leaves, fruit; initial and final), number of internodes, leaf area (initial and final), fruit harvest (number, weight)
Environment: temperature, radiation
11 Multi-trait multi-environment QTL analysis
[Figure: MTME test profile (-log10(p)) and additive QTL effects along the chromosomes, for each trait x environment combination; traits Axl, DWF, DWL, DWS, DWV, INL, LAI, LUE, NF, NI, NLE, pt_frt, pt_leaf, SL, SLA in environments NL1, NL2, SP1, SP2]
How useful is multi-trait GWAS?
12 Number of QTLs and explained variance when increasing model complexity, without going to genomic prediction
Figure 7: Histogram of Explained Variance by individual QTLs as detected by ME, MT and MTME analyses. MTME produced far more QTLs than ME and MT, but many of the extra QTLs from MTME are of small effect.
[Figure panels: ME, MT, MTME; x-axis: percentage of explained variance by significant QTLs; y-axis: frequency]
Multi-variate (association) analysis is more powerful, especially in the case when not all traits are associated with the genetic variant being tested. Stephens, PLOS One, 2013
13 EU-DROPS example data
253 genotypes
23 environments = management x location x year
15 environmental characterizations/indices
333k SNPs (S. Négro, S. Nicolas, A. Charcosset)
Example trait: grain yield
14 Multi-environment GWAS in DROPS
At the start of DROPS, state-of-the-art multi-trait GWAS (GEMMA): fast GWAS for up to 10 traits
Improvements for DROPS: GWAS for up to 100 traits
Advantages of multi-trait GWAS: more power, higher resolution; allows testing of contrasts for QTLxE
Strategies in GWAS modelling and computation:
- Structured models for correlations between genotypes and between environments
- Increased speed via diagonalization (Zhou & Stephens, 2014): 12h for 253 genotypes, 23 environments and 333k SNPs
- Improved convergence by EM algorithms (Dahl et al., 2014)
15 Grain yield
$y_{ij} = \mu_j + x_i \alpha_j + G_{ij} + \varepsilon_{ij}$ for each SNP
Test $H_0: \alpha_1 = \alpha_2 = \dots = \alpha_{23} = 0$
When $H_0$ is rejected, regress $\alpha_1, \dots, \alpha_{23}$ on environmental characterizations/contrasts/indices
16 Regressions of QTL effects on contrasts/indices/characterizations
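The second step, regressing environment-specific QTL effects on an environmental index, could look roughly like the sketch below. It uses ordinary least squares on simulated effects for 23 environments and ignores the uncertainty of the estimated effects, which a real analysis would need to handle. The index `z`, the values `gamma_true` and `delta_true`, and the noise level are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_env = 23
z = rng.uniform(8.0, 18.0, size=n_env)        # hypothetical environmental index z_j
gamma_true, delta_true = 0.30, -0.05          # made-up intercept and slope
alpha_hat = gamma_true + delta_true * z + rng.normal(scale=0.01, size=n_env)

# Least-squares fit of alpha_j = gamma + delta * z_j
X = np.column_stack([np.ones(n_env), z])
coef, *_ = np.linalg.lstsq(X, alpha_hat, rcond=None)
gamma_hat, delta_hat = coef

# Proportion of variation in the QTL effects explained by the index
fitted = X @ coef
ss_res = np.sum((alpha_hat - fitted) ** 2)
ss_tot = np.sum((alpha_hat - alpha_hat.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
```

A slope `delta_hat` clearly different from zero would indicate that the QTL effect changes along the environmental gradient, which is the QTLxE pattern this regression is meant to describe.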
17 Multi-environment genomic prediction for DROPS
Implemented the mixed model approach of Dahl et al. (2014)
- Fast: $O(np^2 + p^3)$ instead of $O(n^3 p^2)$
- QTLs can be included in the fixed part of the model
- Predictions for new environments via an environmental relationship matrix built from environmental characterizations
Problems
- Difficult to find an appropriate way of including environmental characterizations in the relationship matrix. Jarquin et al. (2013) use equal weights and include genotypic and environmental main effects, so only the $GE_{ij}$ term is modelled and not $G_{ij}$.
- Genotypic main effect, $G_i$: include an explicit genotypic main effect, or model it with main-effect QTLs? In the latter case, how many SNPs to include in the fixed part?
- For the random part, should all SNPs have equal variance? See Bayes RC and multi-BLUP approaches for alternatives.
- For the random part, should all SNPs have equal weight?
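One simple way to build an environmental relationship matrix from environmental characterizations, with equal weights for all covariates, is sketched below. This is an assumption-laden illustration of the general idea, not the construction used in DROPS or by Jarquin et al. The function name `environmental_relationship` and the simulated 23 x 15 covariate table are hypothetical.

```python
import numpy as np

def environmental_relationship(Z):
    """Equal-weight relationship matrix from an environments x covariates table."""
    Zs = (Z - Z.mean(axis=0)) / Z.std(axis=0)   # standardize each covariate
    return Zs @ Zs.T / Z.shape[1]               # average cross-product over covariates

rng = np.random.default_rng(3)
Z = rng.normal(size=(23, 15))                   # 23 environments, 15 indices (simulated)
Omega = environmental_relationship(Z)
```

Because every covariate is standardized, the diagonal of `Omega` averages to one. A new environment could in principle be related to the observed ones by appending its covariate row before computing the matrix, and unequal weights would amount to rescaling the columns of `Zs`.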
18 Genomic prediction by penalized regression
Elastic net: combination of Lasso (L1 penalty) and ridge regression (L2 penalty)
Problem: 333k SNPs x 15 environmental indices produce more than 5 million interaction terms
SNPs are chosen on the basis of single-environment GWAS scans, plus 10 marker-based principal components to correct for kinship/population structure = about 500 genetic predictors
$y_{ij} = \sum_{l=1}^{m} X^{G}_{i,l} \beta^{G}_{l} + \sum_{k=1}^{q} X^{E}_{j,k} \beta^{E}_{k} + \sum_{l=1,k=1}^{m,q} (X^{G}_{i,l} X^{E}_{j,k}) \beta^{GE}_{lk} + \varepsilon_{ij}$
Different penalties on $\beta^{G}_{l}$, $\beta^{E}_{k}$, and $\beta^{GE}_{lk}$
Accuracies for leave-one-environment-out:
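To make the structure of this model concrete, the sketch below builds the three predictor blocks (genetic, environmental, and their products) and evaluates an elastic-net objective with a separate penalty weight per block. It only sets up the objective and does not fit it. The helper names, the penalty parameterization (`lam`, `alpha`) and the toy dimensions are assumptions. With about 500 genetic predictors and 15 indices, the interaction block would hold roughly 7,500 columns.

```python
import numpy as np

def build_design(XG, XE, geno_idx, env_idx):
    """Return genetic, environmental and interaction predictors per observation."""
    G = XG[geno_idx]                                   # (n_obs, m)
    E = XE[env_idx]                                    # (n_obs, q)
    # Column l * q + k of GE holds the product X^G_{i,l} * X^E_{j,k}
    GE = (G[:, :, None] * E[:, None, :]).reshape(len(geno_idx), -1)
    return G, E, GE

def elastic_net_penalty(beta, lam, alpha):
    """Mix of L1 (Lasso) and L2 (ridge) penalties; alpha = 1 is pure Lasso."""
    return lam * (alpha * np.abs(beta).sum() + 0.5 * (1.0 - alpha) * (beta ** 2).sum())

def objective(y, G, E, GE, bG, bE, bGE, lams, alpha=0.5):
    """Squared-error loss plus one elastic-net penalty per coefficient block."""
    resid = y - G @ bG - E @ bE - GE @ bGE
    loss = 0.5 * np.mean(resid ** 2)
    return loss + sum(elastic_net_penalty(b, l, alpha) for b, l in zip((bG, bE, bGE), lams))

# Toy data: 6 genotypes x 4 environments, m = 5 genetic and q = 3 environmental predictors
rng = np.random.default_rng(5)
n_geno, n_env, m, q = 6, 4, 5, 3
XG = rng.normal(size=(n_geno, m))
XE = rng.normal(size=(n_env, q))
geno_idx, env_idx = np.divmod(np.arange(n_geno * n_env), n_env)
y = rng.normal(size=n_geno * n_env)
G, E, GE = build_design(XG, XE, geno_idx, env_idx)
value_at_zero = objective(y, G, E, GE, np.zeros(m), np.zeros(q), np.zeros(m * q),
                          lams=(0.1, 0.1, 1.0))
```

Giving the interaction block a larger penalty weight than the main-effect blocks, as in the `lams` tuple here, is one plausible reading of "different penalties"; the weights would normally be tuned by cross-validation.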
19 Some conclusions for DROPS multi-trait GWAS and genomic prediction
- Algorithms for fitting multi-trait mixed models developed and implemented
- Successfully tested for GWAS (power & resolution)
- Faster than ASReml-R/Genstat, more traits than GEMMA, no convergence problems
- Environment-specific QTL effects regressed on environmental characterizations
- Further work necessary for defining environmental relations in multi-environment genomic prediction
- Incorporation of phenotyping platform data
20 Extra slides 1 Fred van Eeuwijk
21 Learning from Nature
Genetic architecture underlying resistance to biotic and abiotic stresses in Arabidopsis: a multi-trait genome-wide association mapping approach
Fred van Eeuwijk, Marcel Dicke, Silvia Coolen, Nelson Davila Olivas, Pingping Huang, Karen Kloth, Manus Thoen, Willem Kruijer, Joost van Heerwaarden
22 Learning From Nature Introduction
Material
Plants exposed to biotic and abiotic stresses, including combined stresses
- Abiotic: drought, salt, osmotic, heat
- Biotic: parasitic plant, phloem-feeding aphid, phloem-feeding whitefly, cell-content-feeding thrips, leaf-chewing caterpillar, necrotrophic fungus
- Combined: fungus & caterpillar, drought & fungus, drought & caterpillar, caterpillar & osmotic
350 Arabidopsis accessions phenotyped in multiple experiments
250k SNPs
23 Methods
Multi-trait mixed model
Genetic correlations: bivariate mixed models for pairs of traits, with structuring of random genotype effects by kinship calculated from the full set of SNPs = covariance for genomic breeding values
QTLs/candidate genes: multi-trait mixed model GWAS
Hypothesis testing within GWAS model context: biotic/abiotic stresses
24 Genomic (upper right) and phenotypic correlations (lower left) Phenotypic correlations low, genetic correlations show an enriched signal
25 Genomic correlations between traits (bivariate G-BLUP)
Structuring of trait/environmental correlations
$y_{it} = \mu_t + G_{it} + \varepsilon_{it}$; $G_{it} = \sum_{m=1}^{M} x_{im} a_{tm}$
26 Multi-trait GWAS
$y_{it} = \mu_t + \sum_{m=1}^{Q} x_{im} a_{tm} + G_{it} + \varepsilon_{it}$
27 Consistent QTLs/genes
$y_{it} = \mu_t + \sum_{m=1}^{Q} x_{im} a_{tm} + G_{it} + \varepsilon_{it}$ with $a_{tm} = a_m$
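The constraint $a_{tm} = a_m$ can be illustrated for a single marker by comparing trait-specific effects with one common effect. The sketch below is a deliberately simplified version: it drops the polygenic term $G_{it}$ and treats residuals as independent across traits, so it shows the nesting of the two models rather than the mixed-model test itself. All data are simulated and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T = 200, 4
x = rng.choice([0.0, 2.0], size=n)                 # marker scores for one SNP (simulated)
a_true = np.array([0.5, 0.5, 0.4, 0.6])            # made-up trait-specific effects
y = x[:, None] * a_true[None, :] + rng.normal(size=(n, T))

# Remove trait means mu_t and centre the marker
xc = x - x.mean()
yc = y - y.mean(axis=0)

# Unconstrained model: one effect a_t per trait
a_free = xc @ yc / (xc @ xc)
rss_free = np.sum((yc - np.outer(xc, a_free)) ** 2)

# Constrained model: a single effect shared by all traits
a_common = a_free.mean()
rss_common = np.sum((yc - xc[:, None] * a_common) ** 2)
```

Because all traits share the same marker column, the least-squares estimate of the common effect equals the average of the trait-specific estimates. The increase from `rss_free` to `rss_common` is the quantity a formal test of consistency would be based on.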
28 Biotic vs abiotic stress
$y_{it} = \mu_t + \sum_{m=1}^{Q} x_{im} a_{tm} + G_{it} + \varepsilon_{it}$ with $a_{tm} = a_{t,\text{abiotic}}$ or $a_{tm} = a_{t,\text{biotic}}$
29 Extra slides 2 Fred van Eeuwijk
30 Crop growth model pepper (integration over time)
yield = f(hi, LAI-rate, LUE; Temperature, Light) + error
$y^{\text{target}}_{ij} = \int f(y^{\text{components}}_{i}; z_j)\, dt + e_{ij}$
$y_{ij} = \sum_{t=t_0}^{t_f} \left(1 - \exp(-K_i\, LAI_{i,j,t})\right) I_{j,t}\, LUE_{i,j} \cdots + \varepsilon_{i,j}$ [the remaining factors, involving $FTF_i$, $W_i$, $FDMC_i$, $T_j$ and $T_{FTF}$, are not recoverable from the transcript]
Genomic prediction for yield from inserting genomic predictions for components into the crop growth model is not better than genomic prediction for yield from yield itself
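The recoverable core of the crop growth model, light interception summed over time and converted to biomass by a light use efficiency, can be sketched as follows. This covers only that core and assumes a Beer's-law interception term. The partitioning to fruit and the temperature terms on the slide are left out because their form is unclear. The function name, parameter values and units are hypothetical.

```python
import numpy as np

def accumulated_biomass(lai, light, lue, k):
    """Sum over time of LUE * intercepted light, with Beer's-law interception."""
    intercepted = (1.0 - np.exp(-k * lai)) * light   # fraction intercepted times incoming light
    return float(np.sum(lue * intercepted))

# Hypothetical 10-day series for one genotype in one environment
lai = np.linspace(0.5, 3.0, 10)      # leaf area index per day
light = np.full(10, 8.0)             # incoming light per day (arbitrary units)
biomass = accumulated_biomass(lai, light, lue=3.0, k=0.7)
```

In a genomic prediction setting, genotype-specific parameters such as `k` or `lue` would be the component traits that are predicted from markers and then inserted into the model.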
31 Accuracy yield predicted from QTLs/GP for components, within environment (CV)
[Table: rows by environment (SP1, SP2) and strategy (Direct, CGM); columns QTL_ST, BVS, BRR, BLA, BLV, QTL_MT; values not preserved]
32 Accuracy yield from QTLs/GP for components, across environments (CV)
[Table: rows SP1::SP2 and SP2::SP1, strategy CGM; columns QTL_ST, BVS, BRR, BLA, BLV, QTL_MT; values not preserved]
33 Causal structure CGM and reconstructions in Spain 1 and 2
Sparse inverse environmental kinship
Crop growth model and causal reconstructions of networks for yield and its components define the inverse environmental kinship/correlation matrix (compare pedigree for genotypes). The result is a sparse representation of the inverse of the environmental kinship, useful for speeding up multi-environment GWAS/genomic prediction (Willem Kruijer)
TASK 6.3 Modelling and data analysis support
Wheat and barley Legacy for Breeding Improvement TASK 6.3 Modelling and data analysis support FP7 European Project Task 6.3: How can statistical models contribute to pre-breeding? Daniela Bustos-Korts
More informationEvolution of phenotypic traits
Quantitative genetics Evolution of phenotypic traits Very few phenotypic traits are controlled by one locus, as in our previous discussion of genetics and evolution Quantitative genetics considers characters
More informationSome models of genomic selection
Munich, December 2013 What is the talk about? Barley! Steptoe x Morex barley mapping population Steptoe x Morex barley mapping population genotyping from Close at al., 2009 and phenotyping from cite http://wheat.pw.usda.gov/ggpages/sxm/
More informationLecture 5: BLUP (Best Linear Unbiased Predictors) of genetic values. Bruce Walsh lecture notes Tucson Winter Institute 9-11 Jan 2013
Lecture 5: BLUP (Best Linear Unbiased Predictors) of genetic values Bruce Walsh lecture notes Tucson Winter Institute 9-11 Jan 013 1 Estimation of Var(A) and Breeding Values in General Pedigrees The classic
More information(Genome-wide) association analysis
(Genome-wide) association analysis 1 Key concepts Mapping QTL by association relies on linkage disequilibrium in the population; LD can be caused by close linkage between a QTL and marker (= good) or by
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2016 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Raied Aljadaany, Shi Zong, Chenchen Zhu Disclaimer: A large
More informationProportional Variance Explained by QLT and Statistical Power. Proportional Variance Explained by QTL and Statistical Power
Proportional Variance Explained by QTL and Statistical Power Partitioning the Genetic Variance We previously focused on obtaining variance components of a quantitative trait to determine the proportion
More informationLecture WS Evolutionary Genetics Part I 1
Quantitative genetics Quantitative genetics is the study of the inheritance of quantitative/continuous phenotypic traits, like human height and body size, grain colour in winter wheat or beak depth in
More informationAssociation Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5
Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative
More informationLecture 9. QTL Mapping 2: Outbred Populations
Lecture 9 QTL Mapping 2: Outbred Populations Bruce Walsh. Aug 2004. Royal Veterinary and Agricultural University, Denmark The major difference between QTL analysis using inbred-line crosses vs. outbred
More informationA mixed model based QTL / AM analysis of interactions (G by G, G by E, G by treatment) for plant breeding
Professur Pflanzenzüchtung Professur Pflanzenzüchtung A mixed model based QTL / AM analysis of interactions (G by G, G by E, G by treatment) for plant breeding Jens Léon 4. November 2014, Oulu Workshop
More informationUvA-DARE (Digital Academic Repository)
UvA-DARE (Digital Academic Repository) Genetic architecture of plant stress resistance: multi-trait genome-wide association mapping Thoen, M.P.M.; Davila Olivas, N.H.; Kloth, K.J.; Coolen, S.; Huang, P.-P.;
More informationGBLUP and G matrices 1
GBLUP and G matrices 1 GBLUP from SNP-BLUP We have defined breeding values as sum of SNP effects:! = #$ To refer breeding values to an average value of 0, we adopt the centered coding for genotypes described
More informationLecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017
Lecture 2: Genetic Association Testing with Quantitative Traits Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 29 Introduction to Quantitative Trait Mapping
More informationG E INTERACTION USING JMP: AN OVERVIEW
G E INTERACTION USING JMP: AN OVERVIEW Sukanta Dash I.A.S.R.I., Library Avenue, New Delhi-110012 sukanta@iasri.res.in 1. Introduction Genotype Environment interaction (G E) is a common phenomenon in agricultural
More informationCS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS
CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS * Some contents are adapted from Dr. Hung Huang and Dr. Chengkai Li at UT Arlington Mingon Kang, Ph.D. Computer Science, Kennesaw State University Problems
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2013 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Meghana Kshirsagar (mkshirsa), Yiwen Chen (yiwenche) 1 Graph
More informationLecture 9 GxE Mixed Models. Lucia Gutierrez Tucson Winter Institute
Lecture 9 GxE Mixed Models Lucia Gutierrez Tucson Winter Institute 1 Genotypic Means GENOTYPIC MEANS: y ik = G i E GE i ε ik The environment includes non-genetic factors that affect the phenotype, and
More informationPrinciples of QTL Mapping. M.Imtiaz
Principles of QTL Mapping M.Imtiaz Introduction Definitions of terminology Reasons for QTL mapping Principles of QTL mapping Requirements For QTL Mapping Demonstration with experimental data Merit of QTL
More informationQTL Mapping I: Overview and using Inbred Lines
QTL Mapping I: Overview and using Inbred Lines Key idea: Looking for marker-trait associations in collections of relatives If (say) the mean trait value for marker genotype MM is statisically different
More informationGenotype Imputation. Biostatistics 666
Genotype Imputation Biostatistics 666 Previously Hidden Markov Models for Relative Pairs Linkage analysis using affected sibling pairs Estimation of pairwise relationships Identity-by-Descent Relatives
More informationSUPPLEMENTARY INFORMATION
doi:10.1038/nature25973 Power Simulations We performed extensive power simulations to demonstrate that the analyses carried out in our study are well powered. Our simulations indicate very high power for
More informationMethods for Cryptic Structure. Methods for Cryptic Structure
Case-Control Association Testing Review Consider testing for association between a disease and a genetic marker Idea is to look for an association by comparing allele/genotype frequencies between the cases
More informationWheat Genetics and Molecular Genetics: Past and Future. Graham Moore
Wheat Genetics and Molecular Genetics: Past and Future Graham Moore 1960s onwards Wheat traits genetically dissected Chromosome pairing and exchange (Ph1) Height (Rht) Vernalisation (Vrn1) Photoperiodism
More informationVariance Component Models for Quantitative Traits. Biostatistics 666
Variance Component Models for Quantitative Traits Biostatistics 666 Today Analysis of quantitative traits Modeling covariance for pairs of individuals estimating heritability Extending the model beyond
More informationLecture 28: BLUP and Genomic Selection. Bruce Walsh lecture notes Synbreed course version 11 July 2013
Lecture 28: BLUP and Genomic Selection Bruce Walsh lecture notes Synbreed course version 11 July 2013 1 BLUP Selection The idea behind BLUP selection is very straightforward: An appropriate mixed-model
More informationLinear Regression. Volker Tresp 2018
Linear Regression Volker Tresp 2018 1 Learning Machine: The Linear Model / ADALINE As with the Perceptron we start with an activation functions that is a linearly weighted sum of the inputs h = M j=0 w
More informationInvestigations into biomass yield in perennial ryegrass (Lolium perenne L.)
Investigations into biomass yield in perennial ryegrass (Lolium perenne L.) Ulrike Anhalt 1,2, Pat Heslop-Harrison 2, Céline Tomaszewski 1,2, Hans-Peter Piepho 3, Oliver Fiehn 4 and Susanne Barth 1 1 2
More informationMobilizing genetic resources and optimizing breeding programs DO NOT COPY. J.-F. Rami UMR AGAP
Mobilizing genetic resources and optimizing breeding programs J.-F. Rami UMR AGAP Genetic Diversity Outline characterization of ex situ Genetic Diversity dynamics of in situ diversity diversity and society
More informationHeritability estimation in modern genetics and connections to some new results for quadratic forms in statistics
Heritability estimation in modern genetics and connections to some new results for quadratic forms in statistics Lee H. Dicker Rutgers University and Amazon, NYC Based on joint work with Ruijun Ma (Rutgers),
More informationINTRODUCTION TO ANIMAL BREEDING. Lecture Nr 3. The genetic evaluation (for a single trait) The Estimated Breeding Values (EBV) The accuracy of EBVs
INTRODUCTION TO ANIMAL BREEDING Lecture Nr 3 The genetic evaluation (for a single trait) The Estimated Breeding Values (EBV) The accuracy of EBVs Etienne Verrier INA Paris-Grignon, Animal Sciences Department
More informationSupplementary Materials for Molecular QTL Discovery Incorporating Genomic Annotations using Bayesian False Discovery Rate Control
Supplementary Materials for Molecular QTL Discovery Incorporating Genomic Annotations using Bayesian False Discovery Rate Control Xiaoquan Wen Department of Biostatistics, University of Michigan A Model
More informationThe Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies
The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies Ian Barnett, Rajarshi Mukherjee & Xihong Lin Harvard University ibarnett@hsph.harvard.edu August 5, 2014 Ian Barnett
More informationTutorial Session 2. MCMC for the analysis of genetic data on pedigrees:
MCMC for the analysis of genetic data on pedigrees: Tutorial Session 2 Elizabeth Thompson University of Washington Genetic mapping and linkage lod scores Monte Carlo likelihood and likelihood ratio estimation
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture16: Population structure and logistic regression I Jason Mezey jgm45@cornell.edu April 11, 2017 (T) 8:40-9:55 Announcements I April
More informationGWAS IV: Bayesian linear (variance component) models
GWAS IV: Bayesian linear (variance component) models Dr. Oliver Stegle Christoh Lippert Prof. Dr. Karsten Borgwardt Max-Planck-Institutes Tübingen, Germany Tübingen Summer 2011 Oliver Stegle GWAS IV: Bayesian
More informationDetecting selection from differentiation between populations: the FLK and hapflk approach.
Detecting selection from differentiation between populations: the FLK and hapflk approach. Bertrand Servin bservin@toulouse.inra.fr Maria-Ines Fariello, Simon Boitard, Claude Chevalet, Magali SanCristobal,
More informationThe Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies
The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies Ian Barnett, Rajarshi Mukherjee & Xihong Lin Harvard University ibarnett@hsph.harvard.edu June 24, 2014 Ian Barnett
More informationExpression Data Exploration: Association, Patterns, Factors & Regression Modelling
Expression Data Exploration: Association, Patterns, Factors & Regression Modelling Exploring gene expression data Scale factors, median chip correlation on gene subsets for crude data quality investigation
More information. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q)
Supplementary information S7 Testing for association at imputed SPs puted SPs Score tests A Score Test needs calculations of the observed data score and information matrix only under the null hypothesis,
More informationMultiple QTL mapping
Multiple QTL mapping Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] 1 Why? Reduce residual variation = increased power
More informationGenotypic variation in biomass allocation in response to field drought has a greater affect on yield than gas exchange or phenology
Edwards et al. BMC Plant Biology (2016) 16:185 DOI 10.1186/s12870-016-0876-3 RESEARCH ARTICLE Open Access Genotypic variation in biomass allocation in response to field drought has a greater affect on
More informationStatistical issues in QTL mapping in mice
Statistical issues in QTL mapping in mice Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman Outline Overview of QTL mapping The X chromosome Mapping
More informationLecture 8 Genomic Selection
Lecture 8 Genomic Selection Guilherme J. M. Rosa University of Wisconsin-Madison Mixed Models in Quantitative Genetics SISG, Seattle 18 0 Setember 018 OUTLINE Marker Assisted Selection Genomic Selection
More informationLinear Regression (1/1/17)
STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression
More informationQTL model selection: key players
Bayesian Interval Mapping. Bayesian strategy -9. Markov chain sampling 0-7. sampling genetic architectures 8-5 4. criteria for model selection 6-44 QTL : Bayes Seattle SISG: Yandell 008 QTL model selection:
More informationGene mapping in model organisms
Gene mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman Goal Identify genes that contribute to common human diseases. 2
More informationEvolutionary Ecology of Senecio
Evolutionary Ecology of Senecio Evolutionary ecology The primary focus of evolutionary ecology is to identify and understand the evolution of key traits, by which plants are adapted to their environment,
More informationOverview. Background
Overview Implementation of robust methods for locating quantitative trait loci in R Introduction to QTL mapping Andreas Baierl and Andreas Futschik Institute of Statistics and Decision Support Systems
More informationRecent advances in statistical methods for DNA-based prediction of complex traits
Recent advances in statistical methods for DNA-based prediction of complex traits Mintu Nath Biomathematics & Statistics Scotland, Edinburgh 1 Outline Background Population genetics Animal model Methodology
More informationLinear regression methods
Linear regression methods Most of our intuition about statistical methods stem from linear regression. For observations i = 1,..., n, the model is Y i = p X ij β j + ε i, j=1 where Y i is the response
More informationLimited dimensionality of genomic information and effective population size
Limited dimensionality of genomic information and effective population size Ivan Pocrnić 1, D.A.L. Lourenco 1, Y. Masuda 1, A. Legarra 2 & I. Misztal 1 1 University of Georgia, USA 2 INRA, France WCGALP,
More informationGENOMIC SELECTION WORKSHOP: Hands on Practical Sessions (BL)
GENOMIC SELECTION WORKSHOP: Hands on Practical Sessions (BL) Paulino Pérez 1 José Crossa 2 1 ColPos-México 2 CIMMyT-México September, 2014. SLU,Sweden GENOMIC SELECTION WORKSHOP:Hands on Practical Sessions
More informationMultiple Change-Point Detection and Analysis of Chromosome Copy Number Variations
Multiple Change-Point Detection and Analysis of Chromosome Copy Number Variations Yale School of Public Health Joint work with Ning Hao, Yue S. Niu presented @Tsinghua University Outline 1 The Problem
More informationMajor Genes, Polygenes, and
Major Genes, Polygenes, and QTLs Major genes --- genes that have a significant effect on the phenotype Polygenes --- a general term of the genes of small effect that influence a trait QTL, quantitative
More informationMultidimensional heritability analysis of neuroanatomical shape. Jingwei Li
Multidimensional heritability analysis of neuroanatomical shape Jingwei Li Brain Imaging Genetics Genetic Variation Behavior Cognition Neuroanatomy Brain Imaging Genetics Genetic Variation Neuroanatomy
More informationQuantile based Permutation Thresholds for QTL Hotspots. Brian S Yandell and Elias Chaibub Neto 17 March 2012
Quantile based Permutation Thresholds for QTL Hotspots Brian S Yandell and Elias Chaibub Neto 17 March 2012 2012 Yandell 1 Fisher on inference We may at once admit that any inference from the particular
More informationMixed-Model Estimation of genetic variances. Bruce Walsh lecture notes Uppsala EQG 2012 course version 28 Jan 2012
Mixed-Model Estimation of genetic variances Bruce Walsh lecture notes Uppsala EQG 01 course version 8 Jan 01 Estimation of Var(A) and Breeding Values in General Pedigrees The above designs (ANOVA, P-O
More informationarxiv: v1 [stat.me] 10 Jun 2018
Lost in translation: On the impact of data coding on penalized regression with interactions arxiv:1806.03729v1 [stat.me] 10 Jun 2018 Johannes W R Martini 1,2 Francisco Rosales 3 Ngoc-Thuy Ha 2 Thomas Kneib
More informationT h e C S E T I P r o j e c t
T h e P r o j e c t T H E P R O J E C T T A B L E O F C O N T E N T S A r t i c l e P a g e C o m p r e h e n s i v e A s s es s m e n t o f t h e U F O / E T I P h e n o m e n o n M a y 1 9 9 1 1 E T
More informationManaging segregating populations
Managing segregating populations Aim of the module At the end of the module, we should be able to: Apply the general principles of managing segregating populations generated from parental crossing; Describe
More informationRégression en grande dimension et épistasie par blocs pour les études d association
Régression en grande dimension et épistasie par blocs pour les études d association V. Stanislas, C. Dalmasso, C. Ambroise Laboratoire de Mathématiques et Modélisation d Évry "Statistique et Génome" 1
More information=, v T =(e f ) e f B =
A Quick Refresher of Basic Matrix Algebra Matrices and vectors and given in boldface type Usually, uppercase is a matrix, lower case a vector (a matrix with only one row or column) a b e A, v c d f The
More informationHERITABILITY ESTIMATION USING A REGULARIZED REGRESSION APPROACH (HERRA)
BIRS 016 1 HERITABILITY ESTIMATION USING A REGULARIZED REGRESSION APPROACH (HERRA) Malka Gorfine, Tel Aviv University, Israel Joint work with Li Hsu, FHCRC, Seattle, USA BIRS 016 The concept of heritability
More informationA General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations
A General Framework for Variable Selection in Linear Mixed Models with Applications to Genetic Studies with Structured Populations Joint work with Karim Oualkacha (UQÀM), Yi Yang (McGill), Celia Greenwood
More information1 Springer. Nan M. Laird Christoph Lange. The Fundamentals of Modern Statistical Genetics
1 Springer Nan M. Laird Christoph Lange The Fundamentals of Modern Statistical Genetics 1 Introduction to Statistical Genetics and Background in Molecular Genetics 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
More informationGenotype-Environment Effects Analysis Using Bayesian Networks
Genotype-Environment Effects Analysis Using Bayesian Networks 1, Alison Bentley 2 and Ian Mackay 2 1 scutari@stats.ox.ac.uk Department of Statistics 2 National Institute for Agricultural Botany (NIAB)
More informationBTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014
BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014 Homework 4 (version 3) - posted October 3 Assigned October 2; Due 11:59PM October 9 Problem 1 (Easy) a. For the genetic regression model: Y
More informationCalculation of IBD probabilities
Calculation of IBD probabilities David Evans and Stacey Cherny University of Oxford Wellcome Trust Centre for Human Genetics This Session IBD vs IBS Why is IBD important? Calculating IBD probabilities
More informationExpression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia
Expression QTLs and Mapping of Complex Trait Loci Paul Schliekelman Statistics Department University of Georgia Definitions: Genes, Loci and Alleles A gene codes for a protein. Proteins due everything.
SoyBase, the USDA-ARS Soybean Genetics and Genomics Database David Grant Victoria Carollo Blake Steven B. Cannon Kevin Feeley Rex T. Nelson Nathan Weeks SoyBase Site Map and Navigation Video Tutorials:
Semi-Penalized Inference with Direct FDR Control
Jian Huang, University of Iowa, April 4, 2016. The problem: Consider the linear regression model $y = \sum_{j=1}^{p} x_j \beta_j + \varepsilon$ (1), where $y \in \mathbb{R}^n$, $x_j \in \mathbb{R}^n$, $\varepsilon \in \mathbb{R}^n$, and $\beta_j$ is the $j$th regression coefficient. Here p
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 18: Introduction to covariates, the QQ plot, and population structure II + minimal GWAS steps Jason Mezey jgm45@cornell.edu April
Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics and Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman [ Teaching Miscellaneous lectures]
Lecture 4: Allelic Effects and Genetic Variances. Bruce Walsh lecture notes, Tucson Winter Institute, 7-9 Jan 2013. Basic model of Quantitative Genetics. Phenotypic value -- we will occasionally also use
RFLP facilitated analysis of tiller and leaf angles in rice (Oryza sativa L.)
Euphytica 109: 79–84, 1999. © 1999 Kluwer Academic Publishers. Printed in the Netherlands. Zhikang Li 1,2,3, Andrew H. Paterson
Introduction to QTL mapping in model organisms Karl Broman Biostatistics and Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman Backcross P 1 P 2 P 1 F 1 BC 4
Nature Genetics: doi: /ng
Supplementary Figure 1. The phenotypes of PI 159925, BR121, and Harosoy under short-day conditions. (a) Plant height. (b) Number of branches. (c) Average internode length. (d) Number of nodes. (e) Pods
Chapter 2 Section 1 discussed the effect of the environment on the phenotype of individuals (light, population ratio, type of soil, temperature). Chapter 2 Section 2: how traits are passed from the parents
Introduction to population genetics & evolution. Course Organization. Exam dates: Feb 19 March 1st. Has everybody registered? Did you get the email with the exam schedule? Summer seminar: Hot topics in Bioinformatics
Efficient Bayesian mixed model analysis increases association power in large cohorts
Methods compared: linear regression; existing mixed model methods; new method: BOLT-LMM. Time: $O(MN)$; $O(MN^2)$; $O(MN^{1.5})$, respectively. Corrects for confounding? Power
Lecture 6: Introduction to Quantitative genetics Bruce Walsh lecture notes Liege May 2011 course version 25 May 2011 Quantitative Genetics The analysis of traits whose variation is determined by both a
UNIT 8 BIOLOGY: Meiosis and Heredity Page 148 CP: CHAPTER 6, Sections 1-6; CHAPTER 7, Sections 1-4; HN: CHAPTER 11, Section 1-5 Standard B-4: The student will demonstrate an understanding of the molecular
Multivariate Analysis
Prof. Dr. J. Franke, All of Statistics. 3.1 Multivariate Analysis. High-dimensional data $X_1, \dots, X_N$, i.i.d. random vectors in $\mathbb{R}^p$. As a data matrix $X$, with objects as rows and the values of the $p$ features as columns: object 1 has entries $X_{11}, X_{12}, \dots, X_{1p}$; object 2
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 11: Quantitative Genomics II. Jason Mezey, jgm45@cornell.edu. March 7, 2019 (Th) 10:10-11:25. Announcements: Homework #5 will be posted by
Latent Variable Methods for the Analysis of Genomic Data
John D. Storey Center for Statistics and Machine Learning & Lewis-Sigler Institute for Integrative Genomics Latent Variable Methods for the Analysis of Genomic Data http://genomine.org/talks/ Data m variables
Quantitative Genetics & Evolutionary Genetics (CHAPTER 24 & 26- Brooker Text) May 14, 2007 BIO 184 Dr. Tom Peavy Quantitative genetics (the study of traits that can be described numerically) is important
Mixed-Models, version 30 October 2011. Mixed models estimate a vector β of fixed effects and one (or more) vectors u of random effects. Both fixed and random effects models always include a vector
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
6. Regularized linear regression
Foundations of Machine Learning, École Centrale Paris, Fall 2015. 6. Regularized linear regression. Chloé-Agathe Azencott, Centre for Computational Biology, Mines ParisTech, chloe-agathe.azencott@mines-paristech.fr
Lecture 3: Introduction on Quantitative Genetics: I. Fisher's Variance Decomposition. Bruce Walsh, Aug 2004, Royal Veterinary and Agricultural University, Denmark. Contribution of a Locus to the Phenotypic
Normal distribution. We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S_yy. After some simplification,
Likelihood. Let P(D | H) be the probability that an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability
MIXED MODELS: THE GENERAL MIXED MODEL
MIXED MODELS This chapter introduces best linear unbiased prediction (BLUP), a general method for predicting random effects, while Chapter 27 is concerned with the estimation of variances by restricted
7.2: Natural Selection and Artificial Selection pg. 305-311 Key Terms: natural selection, selective pressure, fitness, artificial selection, biotechnology, and monoculture. Natural Selection is the process
Genotyping strategy and reference population
GS cattle workshop Genotyping strategy and reference population Effect of size of reference group (Esa Mäntysaari, MTT) Effect of adding females to the reference population (Minna Koivula, MTT) Value of
Introduction to QTL mapping in model organisms. Karl W Broman, Department of Biostatistics, Johns Hopkins University, kbroman@jhsph.edu, www.biostat.jhsph.edu/~kbroman. Outline: Experiments and data, Models, ANOVA
Eiji Yamamoto 1,2, Hiroyoshi Iwata 3, Takanari Tanabata 4, Ritsuko Mizobuchi 1, Jun-ichi Yonemaru 1, Toshio Yamamoto 1* and Masahiro Yano 5,6
Yamamoto et al. BMC Genetics 2014, 15:50 METHODOLOGY ARTICLE Open Access Effect of advanced intercrossing on genome structure and on the power to detect linked quantitative trait loci in a multi-parent
Test for interactions between a genetic marker set and environment in generalized linear models: Supplementary Materials
Biostatistics (2013), pp. 1–31, doi:10.1093/biostatistics/kxt006. XINYI LIN, SEUNGGUEN
Lecture 2: Linear and Mixed Models. Bruce Walsh lecture notes, Introduction to Mixed Models, SISG, Seattle, 18–20 July 2018. Quick Review of the Major Points: The general linear model can be written as y =
A recipe for the perfect salsa tomato
The National Association of Plant Breeders in partnership with the Plant Breeding and Genomics Community of Practice presents A recipe for the perfect salsa tomato David Francis, The Ohio State University