
DERIVATIONS OF THE FULLY CONDITIONAL POSTERIOR DENSITIES

H. P. Kärkkäinen and M. J. Sillanpää, SI

The joint posterior distribution of the unknown parameters and hidden variables, given the data, is proportional to the product of the joint prior and the likelihood. The fully conditional posterior of each parameter is obtained by selecting from the joint posterior the terms that include the parameter in question. For simplicity, $\cdot$ denotes the data and the parameters except the one in question.

The fully conditional posterior distribution of the population intercept $\beta_0$ is proportional to the likelihood, since the prior of $\beta_0$ is proportional to one, so

$$p(\beta_0 \mid \cdot) \propto \prod_{i=1}^{n} \exp\left\{ -\frac{1}{2\sigma_0^2} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \gamma_j \beta_j x_{ij} - u_i \Big)^2 \right\}. \tag{S1}$$

This is a product of $n$ kernels of normal distributions, with a common variance $\sigma_0^2$ and means $y_i - \sum_{j=1}^{p} \gamma_j \beta_j x_{ij} - u_i$, $i = 1, \dots, n$. The set of Gaussian functions is closed under multiplication, i.e. the product of normal densities is also a normal density, with the mean and the variance of the product density given by

$$\mu = \frac{\sum_i \mu_i / \sigma_i^2}{\sum_i 1 / \sigma_i^2} \quad \text{and} \quad \sigma^2 = \frac{1}{\sum_i 1 / \sigma_i^2},$$

respectively. Hence, since in this case the variance of the factors in the product is constant, the mean of the product density reduces to the sum of the means of the individual distributions divided by $n$, while the variance of the product density is simply the variance of the individual distributions divided by $n$.

The fully conditional posterior distribution of the regression coefficient $\beta_j$ is proportional to the product of the likelihood and the conditional prior $p(\beta_j \mid \sigma_j^2)$,

$$p(\beta_j \mid \cdot) \propto \prod_{i=1}^{n} \exp\left\{ -\frac{1}{2\sigma_0^2} \Big( y_i - \beta_0 - \sum_{l} \sum_{m} \gamma_l \beta_{lm} x_{ilm} - u_i \Big)^2 \right\} \exp\left\{ -\frac{\beta_j^2}{2\sigma_j^2} \right\}, \tag{S2}$$

that is, a product of two types of kernels of normal distributions. One distribution comes from the prior; it has mean 0 and variance $\sigma_j^2$. The other part comes from the likelihood.
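The multiplication rule for normal kernels stated above can be checked numerically. The following is a minimal sketch, not part of the original supplement; the function name `product_of_normals` and the example numbers are illustrative assumptions.

```python
def product_of_normals(means, variances):
    """Mean and variance of the renormalised product of univariate normal densities.

    Hypothetical helper: implements mu = sum(mu_i / s_i^2) / sum(1 / s_i^2)
    and sigma^2 = 1 / sum(1 / s_i^2).
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    mean = sum(m * w for m, w in zip(means, precisions)) / total_precision
    return mean, 1.0 / total_precision


# With a common variance the rule reduces to the plain average of the means
# and the common variance divided by n, as in the beta_0 case.
mean, var = product_of_normals([1.0, 2.0, 6.0], [4.0, 4.0, 4.0])
```

With three factors of variance 4, the product mean is the average of the three means and the product variance is 4/3.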

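Regarding $\beta_j$, the likelihood contributes $n$ kernels of normal distributions with means

$$\frac{y_i - \beta_0 - \sum_{(l,m) \neq (j,k)} \gamma_l \beta_{lm} x_{ilm} - u_i}{\gamma_j x_{ij}}, \quad i = 1, \dots, n,$$

and variances $\sigma_0^2 / (\gamma_j x_{ij})^2$, $i = 1, \dots, n$. Hence we get as a product a normal distribution with mean

$$\frac{\sum_{i=1}^{n} \gamma_j x_{ij} \Big( y_i - \beta_0 - \sum_{(l,m) \neq (j,k)} \gamma_l \beta_{lm} x_{ilm} - u_i \Big) \Big/ \sigma_0^2}{\sum_{i=1}^{n} (\gamma_j x_{ij})^2 / \sigma_0^2 + 1 / \sigma_j^2}$$

and variance

$$\frac{1}{\sum_{i=1}^{n} (\gamma_j x_{ij})^2 / \sigma_0^2 + 1 / \sigma_j^2},$$

which equals the variance given in the Appendix when the numerator and the denominator are both multiplied by $\sigma_0^2$.

The conditional posterior distribution of the polygenic effect $u$ is multivariate normal, as it is a product of the multivariate normal prior distribution, $u \mid \sigma_u^2 \sim N_n(0, A\sigma_u^2)$, and a normal likelihood,

$$p(u \mid \cdot) \propto p(u \mid \sigma_u^2)\, p(y \mid \cdot) \propto \exp\left\{ -\frac{1}{2} u^\top (A\sigma_u^2)^{-1} u \right\} \prod_{i=1}^{n} \exp\left\{ -\frac{1}{2\sigma_0^2} \Big( y_i - \beta_0 - \sum_{l} \sum_{m} \gamma_l \beta_{lm} x_{ilm} - u_i \Big)^2 \right\}.$$

The kernel of the likelihood part of the posterior can also be interpreted as the kernel of a multivariate normal density regarding the polygenic effect, $u \sim N_n(y - \beta_0 - X\Gamma\beta,\; I_n \sigma_0^2)$, so we can simply use the same multiplication rule of normal densities as in the previous cases. We therefore get the conditional posterior mean

$$\left( \frac{I_n}{\sigma_0^2} + \frac{A^{-1}}{\sigma_u^2} \right)^{-1} \frac{1}{\sigma_0^2} \big( y - \beta_0 - X\Gamma\beta \big) \tag{S3}$$

and covariance

$$\left( \frac{I_n}{\sigma_0^2} + \frac{A^{-1}}{\sigma_u^2} \right)^{-1}.$$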
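As an illustration of how the normal conditional of a single regression coefficient could be used in a Gibbs sampler, here is a minimal sketch. It assumes a single marker index $j$, and that `resid` already holds $y_i$ minus the intercept, all other marker terms and $u_i$. The function names and example values are hypothetical and do not come from the original supplement.

```python
import math
import random


def beta_j_conditional(resid, x_j, gamma_j, sigma0_sq, sigmaj_sq):
    """Mean and variance of the normal full conditional of beta_j.

    resid[i] is assumed to be y_i minus every model term except gamma_j*beta_j*x_ij.
    """
    # Precision: likelihood part plus prior part (1 / sigma_j^2).
    precision = sum((gamma_j * x) ** 2 for x in x_j) / sigma0_sq + 1.0 / sigmaj_sq
    # Precision-weighted sum of the kernel means.
    weighted = sum(gamma_j * x * r for x, r in zip(x_j, resid)) / sigma0_sq
    return weighted / precision, 1.0 / precision


def draw_beta_j(resid, x_j, gamma_j, sigma0_sq, sigmaj_sq, rng=random):
    """One Gibbs draw of beta_j from its full conditional."""
    mean, var = beta_j_conditional(resid, x_j, gamma_j, sigma0_sq, sigmaj_sq)
    return rng.gauss(mean, math.sqrt(var))


# Illustrative numbers only.
mean, var = beta_j_conditional([1.0, 2.0, 0.0], [1.0, 2.0, 1.0], 1, 1.0, 1.0)
```

When `gamma_j` is 0 the likelihood part vanishes and the draw comes from the prior, with mean 0 and variance `sigmaj_sq`.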

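The variance parameters of the model have inverse-$\chi^2$ priors which, due to conjugacy, lead to inverse-$\chi^2$ posteriors. The prior of the residual variance $\sigma_0^2$ is proportional to $1/\sigma_0^2$, which leads to the posterior density

$$p(\sigma_0^2 \mid \cdot) \propto (\sigma_0^2)^{-\left(1 + \frac{n}{2}\right)} \exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \gamma_j \beta_j x_{ij} - u_i \Big)^2 \right\}. \tag{S4}$$

Regarding $\sigma_0^2$, this is an unnormalized probability density function of an inverse-$\chi^2$ distribution with $n$ degrees of freedom and scale parameter equal to $\sum_{i=1}^{n} \big( y_i - \beta_0 - \sum_{j=1}^{p} \gamma_j \beta_j x_{ij} - u_i \big)^2$. The inverse-$\chi^2$ posteriors of the other variance parameters, $\sigma_j^2$ and $\sigma_u^2$, are derived with identical logic.

In the case of the Laplace$(0, \lambda)$ prior for the effect size, the effect variance $\sigma_j^2 \sim \text{Exp}(\lambda^2/2)$, and hence the fully conditional posterior of $\sigma_j^2$ is proportional to the product of the exponential prior and the normal conditional prior of the effect size $\beta_j \mid \sigma_j^2$,

$$p(\sigma_j^2 \mid \cdot) \propto p(\beta_j \mid \sigma_j^2)\, p(\sigma_j^2 \mid \lambda^2) \propto (\sigma_j^2)^{-1/2} \exp\left\{ -\frac{\beta_j^2}{2\sigma_j^2} \right\} \exp\left\{ -\frac{\lambda^2 \sigma_j^2}{2} \right\}.$$

Since the exponential density is not conjugate to the normal density, we need to consider the inverse of the variance,

$$p\!\left( \frac{1}{\sigma_j^2} \,\Big|\, \cdot \right) \propto (\sigma_j^2)^{-1/2} \exp\left\{ -\frac{\beta_j^2}{2\sigma_j^2} - \frac{\lambda^2 \sigma_j^2}{2} \right\} (\sigma_j^2)^2,$$

where the last term is the Jacobian of the transformation $\sigma_j^2 \to 1/\sigma_j^2$. By rearranging the terms, we get

$$p\!\left( \frac{1}{\sigma_j^2} \,\Big|\, \cdot \right) \propto \left( \frac{1}{\sigma_j^2} \right)^{-3/2} \exp\left\{ -\frac{\lambda^2 \big( \beta_j^2/\lambda^2 + \sigma_j^4 \big)}{2\sigma_j^2} \right\},$$

and, completing the numerator of the exponent into a square,

$$p\!\left( \frac{1}{\sigma_j^2} \,\Big|\, \cdot \right) \propto \left( \frac{1}{\sigma_j^2} \right)^{-3/2} \exp\left\{ -\frac{\lambda^2 \Big[ \big( \sigma_j^2 - |\beta_j|/\lambda \big)^2 + 2\sigma_j^2 |\beta_j|/\lambda \Big]}{2\sigma_j^2} \right\}.$$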
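The inverse-$\chi^2$ conditional of the residual variance in (S4) can be sampled by dividing the scale parameter (the residual sum of squares) by a $\chi^2_n$ draw. The following is a minimal sketch under that reading; the function name `draw_residual_variance` is a hypothetical helper, and `residuals` is assumed to hold $y_i$ minus all fitted terms.

```python
import random


def draw_residual_variance(residuals, rng=random):
    """One draw of sigma_0^2 from the inverse chi-square conditional in (S4).

    Assumes residuals[i] = y_i - beta_0 - sum_j gamma_j*beta_j*x_ij - u_i.
    """
    n = len(residuals)
    scale = sum(r * r for r in residuals)       # residual sum of squares
    # A chi-square variable with n degrees of freedom is Gamma(shape=n/2, scale=2).
    chi2_n = rng.gammavariate(n / 2.0, 2.0)
    return scale / chi2_n


# Illustrative use with a seeded generator and made-up residuals.
rng = random.Random(0)
draws = [draw_residual_variance([1.0] * 10, rng) for _ in range(20000)]
```

For $n > 2$ the mean of this distribution is the scale divided by $n - 2$, which gives a simple sanity check on the draws.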

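Here $\sigma_j^2$ cancels out of the last term of the exponent, so that term is constant and can be left out. Multiplying the numerator and the denominator of the exponent by $\lambda^2 / (\beta_j^2 \sigma_j^4)$ then gives

$$p\!\left( \frac{1}{\sigma_j^2} \,\Big|\, \cdot \right) \propto \left( \frac{1}{\sigma_j^2} \right)^{-3/2} \exp\left\{ -\frac{\lambda^2 \big( \sigma_j^2 - |\beta_j|/\lambda \big)^2}{2\sigma_j^2} \right\} = \left( \frac{1}{\sigma_j^2} \right)^{-3/2} \exp\left\{ -\frac{\lambda^2 \big( 1/\sigma_j^2 - \lambda/|\beta_j| \big)^2}{2\, (\lambda^2/\beta_j^2)\, (1/\sigma_j^2)} \right\}, \tag{S5}$$

that is, an inverse-Gaussian probability density function with mean $\mu'$ and shape $\lambda'$, where $\mu' = \lambda / |\beta_j|$ and $\lambda' = \lambda^2$, the parametrization of the inverse-Gaussian density being

$$f(x) \propto x^{-3/2} \exp\left\{ -\frac{\lambda' (x - \mu')^2}{2 \mu'^2 x} \right\}.$$

The fully conditional posterior distribution of the indicator $\gamma_j$ is Bernoulli. Directly from Bayes' formula we get

$$p(\gamma_j \mid y, \cdot) = \frac{p(\gamma_j)\, p(y \mid \gamma_j, \cdot)}{p(\gamma_j = 0)\, p(y \mid \gamma_j = 0, \cdot) + p(\gamma_j = 1)\, p(y \mid \gamma_j = 1, \cdot)} = \left( \frac{\pi R_j}{(1 - \pi) + \pi R_j} \right)^{\gamma_j} \left( \frac{1 - \pi}{(1 - \pi) + \pi R_j} \right)^{1 - \gamma_j},$$

where $\pi = p(\gamma_j = 1) = 1 - p(\gamma_j = 0)$, and

$$R_j = \frac{p(y \mid \gamma_j = 1, \cdot)}{p(y \mid \gamma_j = 0, \cdot)} = \frac{\exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} \big( y_i - \beta_0 - \sum_{h \neq j} \gamma_h \beta_h x_{ih} - \beta_j x_{ij} \big)^2 \right\}}{\exp\left\{ -\frac{1}{2\sigma_0^2} \sum_{i=1}^{n} \big( y_i - \beta_0 - \sum_{h \neq j} \gamma_h \beta_h x_{ih} \big)^2 \right\}} = \exp\left\{ \frac{1}{2\sigma_0^2} \sum_{i=1}^{n} \beta_j x_{ij} \Big( 2 \big( y_i - \beta_0 - \sum_{h \neq j} \gamma_h \beta_h x_{ih} \big) - \beta_j x_{ij} \Big) \right\}. \tag{S6}$$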
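The two conditionals above can be turned into sampling steps. The following is a minimal sketch, not the authors' implementation. The inverse-Gaussian draw uses the Michael-Schucany-Haas transformation method, which is a swapped-in standard technique rather than something stated in the supplement. All function names are hypothetical, and `resid_without_j` is assumed to hold $y_i$ minus every model term except that of marker $j$.

```python
import math
import random


def draw_inverse_gaussian(mu, shape, rng=random):
    """Inverse-Gaussian draw (mean mu, shape parameter `shape`), Michael-Schucany-Haas method."""
    nu = rng.gauss(0.0, 1.0)
    y = nu * nu
    x = (mu + mu * mu * y / (2.0 * shape)
         - mu / (2.0 * shape) * math.sqrt(4.0 * mu * shape * y + mu * mu * y * y))
    # Choose between the two roots of the transformation.
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x


def draw_inverse_effect_variance(beta_j, lam, rng=random):
    """Draw 1/sigma_j^2 as in (S5): mean lam/|beta_j|, shape lam^2. Assumes beta_j != 0."""
    return draw_inverse_gaussian(lam / abs(beta_j), lam * lam, rng)


def indicator_probability(resid_without_j, x_j, beta_j, sigma0_sq, pi):
    """P(gamma_j = 1 | y, rest) = pi*R_j / ((1 - pi) + pi*R_j), with R_j as in (S6)."""
    log_r = sum(beta_j * x * (2.0 * r - beta_j * x)
                for r, x in zip(resid_without_j, x_j)) / (2.0 * sigma0_sq)
    # Work on the log-odds scale so a very large or very small R_j cannot overflow.
    log_odds = math.log(pi) - math.log(1.0 - pi) + log_r
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def draw_indicator(prob, rng=random):
    """Bernoulli draw of gamma_j given its conditional probability."""
    return 1 if rng.random() < prob else 0
```

Computing $R_j$ through its logarithm is a design choice for numerical stability; it is algebraically the same quantity as in (S6).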

Table S: Estimated hyperparameter values of λ and π in the original QTL-MAS data, in addition to the mean, maximum, minimum and standard deviation of the estimates in the analyses of the 00 replicated data sets. The nqtl and ξ denote alternative hyperprior parameter values for the prior of the indicator, π ∼ Beta(nqtl, p − nqtl), and the rate parameter, λ ∼ Gamma( , ξ) under the hierarchical Laplace model and λ ∼ Gamma( , ξ) under the non-hierarchical Laplace model, with and without the indicator variable. denotes the optional Beta( , ) prior of the indicator.

Table layout: columns nqtl, ξ, Hierarchical Laplace (with indicator, no indicator), Non-hierarchical Laplace (with indicator, no indicator); rows Original, Mean, Max, Min and Std for λ, followed by the same five rows for π.
