GENOMIC SELECTION WORKSHOP: Hands on Practical Sessions (BL)
Transcription
1 GENOMIC SELECTION WORKSHOP: Hands on Practical Sessions (BL)
Paulino Pérez (1), José Crossa (2)
(1) ColPos-México, (2) CIMMyT-México
September, SLU, Sweden
GENOMIC SELECTION WORKSHOP: Hands on Practical Sessions (BL) 1/29
2 Contents
1 General comments
2 LASSO
3 Application examples
4 Extension of BL to include infinitesimal effect
3 General comments
The linear regression model is given by
$$y_i = \mu + \sum_{j=1}^{p} x_{ij}\beta_j + e_i, \qquad (1)$$
where $e_i \sim N(0, \sigma^2_e)$, $i = 1, \dots, n$.
The key idea is to obtain estimates of $\beta$ and then use them to compute genomic estimated breeding values (GEBVs).
$\hat\beta$ can be obtained using penalized regression methods, for example ridge regression (G-BLUP).
We now review another penalized regression method, the LASSO (Least Absolute Shrinkage and Selection Operator).
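As a rough illustration of model (1), the sketch below simulates a small marker matrix, estimates $\beta$ by ridge regression in closed form, and computes GEBVs as $X\hat\beta$. The simulated data, the helper name `ridge_beta` and the value `lambda = 10` are assumptions made for this example only; they are not taken from the workshop material.

```r
set.seed(123)
n <- 100; p <- 300                               # fewer lines than markers (n < p)
X <- matrix(rbinom(n * p, size = 1, prob = 0.5), n, p)   # markers coded 0/1
beta_true <- rnorm(p, sd = 0.1)
y <- 5 + as.vector(X %*% beta_true) + rnorm(n)

# Center the markers so that the intercept is estimated by the mean of y
Xc <- scale(X, center = TRUE, scale = FALSE)
mu_hat <- mean(y)

# Ridge estimate: (Xc'Xc + lambda * I)^{-1} Xc'(y - mean(y))
ridge_beta <- function(Xc, y, lambda) {
  p <- ncol(Xc)
  solve(crossprod(Xc) + lambda * diag(p), crossprod(Xc, y - mean(y)))
}

beta_hat <- as.vector(ridge_beta(Xc, y, lambda = 10))
gebv <- as.vector(Xc %*% beta_hat)               # GEBVs (centered genetic values)
y_hat <- mu_hat + gebv                           # fitted phenotypes
```

The ridge penalty makes the system solvable even though `p > n`; a larger `lambda` shrinks all entries of `beta_hat` further towards zero.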
4 LASSO
In LASSO, estimates of $\beta$ are obtained by minimizing the augmented sum of squares:
$$\min_{\beta} \Big\{ \Big(y - \sum_{j} X_j\beta_j\Big)'\Big(y - \sum_{j} X_j\beta_j\Big) + \lambda \sum_{j} |\beta_j| \Big\}, \qquad (2)$$
where $\lambda \geq 0$ is a regularization parameter that controls the trade-off between goodness of fit (measured by the sum of squared errors, SSE) and model complexity (measured by $\sum_j |\beta_j|$).
Notes:
1 The value of $\lambda$ can be chosen by cross-validation.
2 Some of the entries in $\hat\beta$ are exactly 0, so LASSO can also be used as a variable selection method.
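To see how criterion (2) produces exact zeros, here is a minimal coordinate-descent sketch in base R. It is one common way to minimize (2), not necessarily the algorithm used in the workshop; the function names `soft_threshold` and `lasso_cd`, the simulated data and `lambda = 40` are assumptions for illustration.

```r
# Soft-thresholding operator: shrinks z towards 0 and sets it to 0 when |z| <= g
soft_threshold <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Coordinate descent for  (y - X b)'(y - X b) + lambda * sum(|b_j|)
lasso_cd <- function(X, y, lambda, n_sweeps = 200) {
  X <- scale(X, center = TRUE, scale = FALSE)   # centering absorbs the intercept
  r <- y - mean(y)                              # current residual (beta starts at 0)
  p <- ncol(X)
  beta <- numeric(p)
  xx <- colSums(X^2)
  for (s in seq_len(n_sweeps)) {
    for (j in seq_len(p)) {
      r <- r + X[, j] * beta[j]                 # partial residual without marker j
      z <- sum(X[, j] * r)
      beta[j] <- soft_threshold(z, lambda / 2) / xx[j]
      r <- r - X[, j] * beta[j]                 # put marker j back
    }
  }
  beta
}

set.seed(1)
n <- 50; p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- 2 * X[, 1] - 1.5 * X[, 2] + rnorm(n)       # only two markers have an effect
beta_hat <- lasso_cd(X, y, lambda = 40)
```

With `lambda = 0` the procedure reduces to ordinary least squares, and with a very large `lambda` every coefficient is set to zero, which is the behaviour described in Note 2.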
5 LASSO (continued)
Problems with LASSO:
1 At most n entries in $\hat\beta$ can be different from 0. This is problematic in GS, where usually n << p (curse of dimensionality).
2 It can be difficult to select the value of $\lambda$.
3 It is difficult to obtain estimates of $\sigma^2_e$.
4 It is difficult to obtain confidence intervals for $\beta_j$, $j = 1, \dots, p$.
Alternatives: Bayesian estimation methods...
6 Bayesian LASSO
9 LASSO (continued): Density function
In ridge regression, $p(\beta_j \mid \sigma^2_\beta) = N(\beta_j \mid 0, \sigma^2_\beta)$, $j = 1, \dots, p$.
In LASSO, $p(\beta_j \mid \sigma^2_e, \lambda) = DE(\beta_j \mid 0, \lambda/\sigma_e)$, where DE denotes the double-exponential density.
Figure 1: Prior in BL and in BRR (density plotted against $\beta$)
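The contrast between the two priors can be reproduced with a few lines of base R. This is only a sketch: the helper `ddexp`, the choice of unit variance for both priors and the plotting range are assumptions, and the result is not the original Figure 1.

```r
# Double-exponential (Laplace) density with location 0 and rate `rate`
ddexp <- function(b, rate) 0.5 * rate * exp(-rate * abs(b))

sd_beta <- 1
rate <- sqrt(2) / sd_beta        # the DE variance is 2 / rate^2, so both priors have variance 1

b <- seq(-5, 5, length.out = 501)
dens_bl  <- ddexp(b, rate)                    # prior in BL
dens_brr <- dnorm(b, mean = 0, sd = sd_beta)  # prior in BRR

plot(b, dens_bl, type = "l", lty = 1, xlab = expression(beta), ylab = "Density")
lines(b, dens_brr, lty = 2)
legend("topright", legend = c("BL (double exponential)", "BRR (normal)"), lty = 1:2)
```

For the same variance, the DE prior puts more mass near zero and in the tails than the normal prior, so small effects are shrunk more strongly and large effects less strongly.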
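10 LASSO: Joint posterior distribution of model unknowns
The joint posterior distribution of the model unknowns is given by:
$$p(\beta, \sigma^2_e, \mu, \lambda^2 \mid \text{data}) \propto \prod_{i=1}^{n} N\Big(y_i \,\Big|\, \mu + \sum_{j=1}^{p} x_{ij}\beta_j,\ \sigma^2_e\Big) \prod_{j=1}^{p} p(\beta_j \mid \omega)\; p(\sigma^2_e)\, p(\mu)\, p(\lambda^2), \qquad (3)$$
where $\omega$ denotes the parameters of the prior on marker effects ($\sigma^2_e$ and $\lambda$, as on the previous slide), $p(\mu) \propto 1$, $p(\sigma^2_e) = \chi^{-2}(\sigma^2_e \mid df, S)$ and $p(\lambda^2) = Gamma(\lambda^2 \mid rate, shape)$.
This model can be implemented using MCMC methods; for more detail see Park and Casella (2008) and de los Campos et al. (2009).
The model is implemented in the package BLR.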
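The MCMC implementation in Park and Casella (2008) relies on writing the double-exponential prior as a scale mixture of normals. A sketch of that identity, using the notation of these slides, is given below; the auxiliary variances $\tau^2_j$ are introduced here for illustration and do not appear on the slide.

```latex
\frac{\lambda}{2\sigma_e}\exp\!\left(-\frac{\lambda\,|\beta_j|}{\sigma_e}\right)
  = \int_0^\infty N\!\left(\beta_j \mid 0,\ \sigma^2_e\tau^2_j\right)\,
    \frac{\lambda^2}{2}\exp\!\left(-\frac{\lambda^2\tau^2_j}{2}\right)\, d\tau^2_j
```

Conditional on the $\tau^2_j$, the marker effects have a normal prior, which is what makes Gibbs sampling convenient; it also suggests why the prior is placed on $\lambda^2$ rather than on $\lambda$.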
11 Contents
1 General comments
2 LASSO
3 Application examples
   Example 1: Barley dataset
   Example 2: Wheat dataset (CIMMyT)
4 Extension of BL to include infinitesimal effect
12 Example 1: Barley dataset
This example comes from Yi and Xu (2008).
Doubled haploid (DH) population with n = 145 lines, each line tested in 25 environments.
The response variable is grain yield.
We have p = 127 molecular markers covering 7 chromosomes.
BL model fitted using the BLR package in R with B = 20,000 iterations, burn-in = 10,000, thin = 10.
13 Example 1: Barley dataset
Figure 2: Point estimates for $\beta$ (B = 20,000, burn-in = 10,000, a = b = 0.1; $\beta_j$ plotted against j)
14 Example 1: Barley dataset
Figure 3: Posterior distributions for selected $\beta$'s
15 Example 1: Barley dataset
Figure 4: Posterior distributions for selected $\beta$'s
17 Example 2: Wheat dataset (CIMMyT)
Data for n = 599 wheat lines evaluated in 4 environments, from the wheat improvement program at CIMMyT.
The dataset includes p = 1279 molecular markers ($x_{ij}$, $i = 1, \dots, n$, $j = 1, \dots, p$), coded as 0/1.
The pedigree information is also available.
Let's load the dataset in R:
1 Load R
2 Install the BGLR package (if not yet installed)
3 Load the package
4 Load the data
18 Example 2: Wheat dataset (CIMMyT) (continued)
Let's assume that we want to predict grain yield for environment 1 using the Bayesian LASSO (BL). We do not know the values of $\sigma^2_e$ and $\lambda$, so we estimate them from the data. We will use the function BGLR. The R code below fits the BL model using a Bayesian approach with non-informative priors for $\sigma^2_e$ and $\lambda$:
    rm(list=ls())
    library(BGLR)
    data(wheat)      #Loads wheat.X (markers), wheat.Y (phenotypes), wheat.A (pedigree)
    Y=wheat.Y
    X=wheat.X
    y=Y[,1]          #Grain yield in environment 1
    setwd("/tmp/")   #BGLR writes its sample files to the working directory
    #Linear predictor
    ETA=list(list(X=X,model="BL"))
    #Runs the Gibbs sampler
    fmL<-BGLR(y=y,ETA=ETA,nIter=10000,burnIn=5000,thin=10)
    #Observed vs predicted grain yield
    plot(fmL$yHat,Y[,1])
19 Example 2: Wheat dataset (CIMMyT) (continued)
Figure: observed (Y[, 1]) vs predicted (fmL$yHat) grain yield.
Predictions $\hat{y} = \hat\mu + X\hat\beta$ and estimates of $\sigma^2_e$ and $\lambda$ can be obtained easily in R:
    > fmL$yHat
    > fmL$varE
    > fmL$ETA[[1]]$lambda
20 Example 2: Wheat dataset (CIMMyT) (continued)
Figure: two panels, "Predicted Marker effects" and "Predicted Genetic Values", each comparing Bayesian LASSO with Bayesian Ridge Regression.
21 Example 2: Wheat dataset (CIMMyT) (continued)
The GEBVs can be obtained easily in R:
    #GEBVs
    #option 1
    X%*%fmL$ETA[[1]]$b
    #option 2
    fmL$yHat-fmL$mu
Exercise: Let's assume that we want to predict the grain yield for some wheat lines for which we have only the genotypic information. Write the R code for fitting a BL model.
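One possible way to approach this exercise, offered as a sketch rather than the official solution: BGLR treats entries of `y` set to `NA` as missing and returns predictions for them in `yHat`, so lines with only genotypes can be represented by masking their phenotypes. The random choice of 100 lines, the seed, the object names and the deliberately short chain are assumptions for illustration; a real analysis would need a much longer chain.

```r
# Runs only if the BGLR package is available
if (requireNamespace("BGLR", quietly = TRUE)) {
  library(BGLR)
  data(wheat)                               # loads wheat.X, wheat.Y, wheat.A
  X <- wheat.X
  y <- wheat.Y[, 1]                         # grain yield, environment 1

  set.seed(1)
  tst <- sample(seq_along(y), size = 100)   # hypothetical lines without phenotypes
  y_na <- y
  y_na[tst] <- NA                           # only genotypes are used for these lines

  ETA <- list(list(X = X, model = "BL"))
  fm <- BGLR(y = y_na, ETA = ETA, nIter = 1200, burnIn = 200, thin = 10,
             saveAt = file.path(tempdir(), "bl_"), verbose = FALSE)

  gebv_tst <- fm$yHat[tst] - fm$mu          # predicted genetic values of masked lines
  cor_tst <- cor(fm$yHat[tst], y[tst])      # correlation with the held-back phenotypes
}
```

Because the phenotypes of the masked lines are known here, `cor_tst` gives a rough idea of predictive ability; with truly unphenotyped lines only `gebv_tst` would be available.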
22 Extension of BL to include infinitesimal effect
de los Campos et al. (2009) extended the basic BL model to include an infinitesimal effect, that is:
$$y_i = \mu + \sum_{j=1}^{p} x_{ij}\beta_j + u_i + e_i, \qquad (4)$$
where $u \sim N(0, \sigma^2_u A)$ and A is the pedigree (additive relationship) matrix.
The model can be implemented using Bayesian methods.
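To make the role of $u$ in model (4) concrete, the sketch below simulates data from it in base R. Everything in it is assumed for illustration: the toy relationship matrix (20 families of 3 lines sharing a coefficient of 0.5), the variance components and the marker effects.

```r
set.seed(42)
n <- 60; p <- 200
X <- matrix(rbinom(n * p, size = 1, prob = 0.5), n, p)   # markers coded 0/1
beta <- rnorm(p, sd = 0.05)                              # marker effects

# Toy relationship matrix: 1 on the diagonal, 0.5 within family, 0 otherwise
fam <- rep(1:20, each = 3)
A <- 0.5 * outer(fam, fam, "==") + 0.5 * diag(n)

sigma2_u <- 0.3; sigma2_e <- 0.5; mu <- 10

# Draw u ~ N(0, sigma2_u * A) using the Cholesky factor A = L L'
L <- t(chol(A))
u <- sqrt(sigma2_u) * as.vector(L %*% rnorm(n))
e <- rnorm(n, sd = sqrt(sigma2_e))

y <- mu + as.vector(X %*% beta) + u + e                  # model (4)
```

Lines in the same family share part of `u`, so their phenotypes are correlated beyond what the markers explain; that extra covariance is what the infinitesimal effect is meant to capture.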
23 Example 3: Including an infinitesimal effect
In this example we continue with the analysis of the wheat dataset, and we include an infinitesimal effect in the model.
    rm(list=ls())
    setwd("/tmp")
    library(BGLR)
    data(wheat)   #Loads the wheat dataset
    X=wheat.X
    A=wheat.A     #Pedigree-based relationship matrix
    Y=wheat.Y
    y=Y[,1]
    #Linear predictor: markers (BL) plus infinitesimal effect with covariance A
    ETA=list(list(X=X,model="BL"),
             list(K=A,model="RKHS"))
    ### Runs the Gibbs sampler
    fm<-BGLR(y=y,ETA=ETA,
             nIter=30000,burnIn=5000,thin=10)
24 Extension of BL to include infinitesimal effect
Figure 5: Posterior distribution for $\sigma^2_e$ and $\sigma^2_u$ (for each parameter, a density panel and a panel of samples against iteration)
25 Extension of BL to include infinitesimal effect
Figure 6: Marker effects ($\beta_j$ plotted against marker)
26 Extension of BL to include infinitesimal effect
Narrow-sense heritability calculated according to Yi and Xu (2008):
$$h^2_j = \frac{V_j\,\hat\beta^2_j}{V_y},$$
where $V_y$ is the phenotypic variance and $V_j$ is the sample variance of $x_{ij}$, $i = 1, \dots, n$.
Figure 7: Heritability ($h^2$ plotted against marker)
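The per-marker heritability formula is a one-line computation once marker effects are available. The sketch below uses simulated markers and a made-up vector `beta_hat` standing in for posterior means; with a fitted BGLR model one would presumably use `fm$ETA[[1]]$b` instead.

```r
set.seed(7)
n <- 200; p <- 50
X <- matrix(rbinom(n * p, size = 1, prob = 0.5), n, p)   # markers coded 0/1
beta_hat <- c(0.8, -0.5, rep(0, p - 2))    # stand-in for estimated marker effects
y <- as.vector(X %*% beta_hat) + rnorm(n)  # simulated phenotype

V_j <- apply(X, 2, var)                    # sample variance of each marker
V_y <- var(y)                              # phenotypic variance
h2_j <- V_j * beta_hat^2 / V_y             # heritability attributed to marker j

plot(seq_len(p), h2_j, type = "h", xlab = "Marker", ylab = expression(h^2))
```

Markers with a zero estimated effect contribute nothing, so the plot highlights the few markers that explain most of the phenotypic variance.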
27 Extension of BL to include infinitesimal effect
Figure 8: Observed vs predicted values (phenotype against predicted genetic value)
28 Questions?
29 References
Park, T. and Casella, G. (2008). The Bayesian Lasso. Journal of the American Statistical Association, 103.
Yi, N. and Xu, S. (2008). Bayesian Lasso for Quantitative Trait Loci Mapping. Genetics, 179.
de los Campos, G., H. Naya, D. Gianola, J. Crossa, A. Legarra, E. Manfredi, K. Weigel and J. Cotes (2009). Predicting Quantitative Traits with Regression Models for Dense Molecular Markers and Pedigree. Genetics, 182.
Pérez-Rodríguez, P., G. de los Campos, J. Crossa and D. Gianola (2010). Genomic-enabled prediction based on molecular markers and pedigree using the BLR package in R. The Plant Genome, 3(2).
More informationLimited dimensionality of genomic information and effective population size
Limited dimensionality of genomic information and effective population size Ivan Pocrnić 1, D.A.L. Lourenco 1, Y. Masuda 1, A. Legarra 2 & I. Misztal 1 1 University of Georgia, USA 2 INRA, France WCGALP,
More informationCase-Control Association Testing. Case-Control Association Testing
Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits. Technological advances have made it feasible to perform case-control association studies
More informationRegularization Path Algorithms for Detecting Gene Interactions
Regularization Path Algorithms for Detecting Gene Interactions Mee Young Park Trevor Hastie July 16, 2006 Abstract In this study, we consider several regularization path algorithms with grouped variable
More informationChris Fraley and Daniel Percival. August 22, 2008, revised May 14, 2010
Model-Averaged l 1 Regularization using Markov Chain Monte Carlo Model Composition Technical Report No. 541 Department of Statistics, University of Washington Chris Fraley and Daniel Percival August 22,
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationThe Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA
The Adaptive Lasso and Its Oracle Properties Hui Zou (2006), JASA Presented by Dongjun Chung March 12, 2010 Introduction Definition Oracle Properties Computations Relationship: Nonnegative Garrote Extensions:
More informationLinear Model Selection and Regularization
Linear Model Selection and Regularization Chapter 6 October 18, 2016 Chapter 6 October 18, 2016 1 / 80 1 Subset selection 2 Shrinkage methods 3 Dimension reduction methods (using derived inputs) 4 High
More informationPackage brnn. R topics documented: January 26, Version 0.6 Date
Version 0.6 Date 2016-01-26 Package brnn January 26, 2016 Title Bayesian Regularization for Feed-Forward Neural Networks Author Paulino Perez Rodriguez, Daniel Gianola Maintainer Paulino Perez Rodriguez
More informationEmpirical Bayesian LASSO-logistic regression for multiple binary trait locus mapping
Huang et al. BMC Genetics 2013, 14:5 METHODOLOGY ARTICLE Open Access Empirical Bayesian LASSO-logistic regression for multiple binary trait locus mapping Anhui Huang 1, Shizhong Xu 2 and Xiaodong Cai 1*
More informationStatistics 203: Introduction to Regression and Analysis of Variance Penalized models
Statistics 203: Introduction to Regression and Analysis of Variance Penalized models Jonathan Taylor - p. 1/15 Today s class Bias-Variance tradeoff. Penalized regression. Cross-validation. - p. 2/15 Bias-variance
More informationHeritability estimation in modern genetics and connections to some new results for quadratic forms in statistics
Heritability estimation in modern genetics and connections to some new results for quadratic forms in statistics Lee H. Dicker Rutgers University and Amazon, NYC Based on joint work with Ruijun Ma (Rutgers),
More informationThe Pennsylvania State University The Graduate School THE BAYESIAN LASSO, BAYESIAN SCAD AND BAYESIAN GROUP LASSO WITH APPLICATIONS TO GENOME-WIDE
The Pennsylvania State University The Graduate School THE BAYESIAN LASSO, BAYESIAN SCAD AND BAYESIAN GROUP LASSO WITH APPLICATIONS TO GENOME-WIDE ASSOCIATION STUDIES A Dissertation in Statistics by Jiahan
More informationAn Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models
Proceedings 59th ISI World Statistics Congress, 25-30 August 2013, Hong Kong (Session CPS023) p.3938 An Algorithm for Bayesian Variable Selection in High-dimensional Generalized Linear Models Vitara Pungpapong
More informationModule 4: Bayesian Methods Lecture 9 A: Default prior selection. Outline
Module 4: Bayesian Methods Lecture 9 A: Default prior selection Peter Ho Departments of Statistics and Biostatistics University of Washington Outline Je reys prior Unit information priors Empirical Bayes
More information