Combining dependent tests for linkage or association across multiple phenotypic traits
Biostatistics (2003), 4, 2. Printed in Great Britain

XIN XU
Program for Population Genetics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA

LU TIAN, L. J. WEI
Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA

SUMMARY

A robust statistical method to detect linkage or association between a genetic marker and a set of distinct phenotypic traits is to combine univariate trait-specific test statistics for a more powerful overall test. This procedure does not need complex modeling assumptions, can easily handle partially missing trait values, and is applicable to a mixture of qualitative and quantitative traits. In this note, we propose a simple test procedure along this line, and show its advantages over the standard combination tests for linkage or association in the literature through a data set from Genetic Analysis Workshop 12 (GAW12) and an extensive simulation study.

Keywords: Combining tests; Linkage analysis; Random effects models; Tests for association.

1. INTRODUCTION

For studies of linkage and allelic association between phenotypic traits and genotypic markers, several distinct traits, which may be closely related to a complex disease of interest, are often available from each study subject. The key question is how to utilize such multivariate data efficiently in the analysis. Three different statistical approaches have been taken to handle this problem in the literature.

The first one is to use the so-called random effects models to deal with the correlation among the phenotypic variables (see, for example, Laird and Ware, 1982; Korol et al., 1995; Jiang and Zeng, 1995; Williams et al., 1999; Hackett et al., 2001; Iturria and Blangero, 2000; Calinski et al., 2000; McCulloch and Searle, 2001).
This approach, however, depends heavily on the model assumptions, and does not have an efficient way to handle the case when the trait values are partially missing.

The second approach is to create a single phenotypic variable as a linear combination of all the trait values from each subject, and then perform standard univariate analyses (Amos et al., 1990; Amos and Laing, 1993). To locate the optimal linear combination, however, one needs rather computing-intensive grid search methods. It is difficult, if not impossible, to obtain the bona fide p-value for the resulting optimal test for linkage or association. Moreover, like the previous one, this approach may not be able to handle missing observations efficiently.

To whom correspondence should be addressed. © Oxford University Press (2003)

The third approach is to combine individual test statistics or estimators obtained from the trait-specific univariate analyses for a global assessment of linkage or association (O'Brien, 1984; Wei and Johnson, 1985; Liang and Zeger, 1986; Xu et al., 2001). The resulting procedures are nonparametric and do not depend on complex modeling. Furthermore, they can easily handle the case in which some trait values are missing completely at random and the case with a mixture of quantitative and qualitative traits.

In this note, we consider the third approach for combining information across the multivariate traits, and present a simple modification of the existing test procedures for linkage or association. We use a data set from GAW12 to illustrate the new proposal. We also conducted an extensive numerical study to demonstrate that the modification greatly improves the power for detecting linkage or association between the phenotypic and genotypic markers over the conventional linear combination procedures.

2. COMBINING TEST STATISTICS

Let $T = (T_1, \ldots, T_K)'$ be a vector of $K$ possibly correlated statistics, each obtained from a trait-specific univariate analysis. For example, $T_k$ is a test statistic for linkage or association based solely on the data from the $k$th trait, $k = 1, \ldots, K$. For all the cases we encountered in practice, $T$ is asymptotically normal with mean $\beta = (\beta_1, \ldots, \beta_K)'$ and known (or consistently estimated) covariance matrix $\Sigma$. Suppose that we are interested in testing the hypothesis $H_0: \beta = 0$ against a general alternative hypothesis $H_1: \beta_k \ge 0$, $k = 1, \ldots, K$, with at least one $\beta_k > 0$. For this one-sided alternative, one may consider a class of linear combinations $a'T$ as test statistics, where $a = (a_1, \ldots, a_K)'$ is a vector of possibly data-dependent weights.
If $\beta_1 = \cdots = \beta_K$, the linear combination with $a = \Sigma^{-1} e$ is the most powerful test for testing $H_0$ against $H_1$, where $e = (1, \ldots, 1)'$ (O'Brien, 1984; Wei and Johnson, 1985). The corresponding test statistic is

$$e' \Sigma^{-1} T. \tag{1}$$

A large value of (1) suggests that $H_0$ does not hold. When $T_1, \ldots, T_K$ are mutually independent, (1) is simply $\sum_{k=1}^{K} \sigma_k^{-2} T_k$, a well-known procedure for combining information across $K$ independent studies in meta-analysis, where $\sigma_k$ is the standard error of $T_k$. The test based on (1), however, may have low power against alternatives in which the $\beta_k$ are not clustered together.

Now, consider the above testing problem for a simple alternative hypothesis that $\beta = \beta_0 = (\beta_{10}, \ldots, \beta_{K0})'$, a given vector in $H_1$. It is straightforward to show that the weight $a$ of the best linear combination of the $T_k$ for testing $H_0$ against this simple alternative hypothesis is $\Sigma^{-1} \beta_0$. The resulting test statistic is

$$\beta_0' \Sigma^{-1} T. \tag{2}$$

Since we are interested in a test which is powerful against any $\beta$ in $H_1$ (not just against a specific $\beta_0$) and $T$ is expected to be a good estimate for $\beta$, it seems natural to replace $\beta_0$ in (2) with $T$. This gives us a test statistic

$$T' \Sigma^{-1} T, \tag{3}$$

which is chi-square distributed with $K$ degrees of freedom under $H_0$. Although this test is omnibus with respect to $H_1$, it is not very powerful against specific alternatives due to the heavy tail of the chi-square distribution. The problem with replacing $\beta_0$ in (2) with $T$ is that, under the null hypothesis $H_0$, $T$ converges to $\beta = 0$; therefore, the test statistic (3) is no longer a linear combination of $T$.

The optimal test statistic (2) for testing against a general $\beta$ can be rewritten as

$$\left( \frac{\beta_1}{\sigma_1}, \ldots, \frac{\beta_K}{\sigma_K} \right) \Gamma^{-1} Z, \tag{4}$$

where $Z = (Z_1, \ldots, Z_K)'$, $Z_k = T_k / \sigma_k$, and $\Gamma$ is the corresponding correlation matrix of $T$. Ideally, we would like to have a test statistic which, under $H_0$, behaves like a linear combination of $Z$, but whose $k$th weight under the alternative $H_1$ is $\beta_k / \sigma_k$. This motivates us to consider the following simple test statistics:

$$W(c) = (Z_{1c}, \ldots, Z_{Kc}) \, \Gamma^{-1} Z, \tag{5}$$

where $Z_{kc} = \max\{Z_k, c\}$, $k = 1, \ldots, K$, and $c$ is some given non-negative constant. For example, when $c = 2$, under $H_0$ each $Z_k$ is standard normal, therefore $Z_{kc} \approx 2$ and $W(c)$ is approximately a linear combination of $Z$. On the other hand, for a relatively large $\beta_k > 0$ under the alternative hypothesis, $Z_{kc}$ is most likely to be $Z_k \approx \beta_k / \sigma_k$. Note that under $H_0$, the distribution of $W(0)$ has a rather long tail, and the corresponding test performs like the omnibus test (3). On the other hand, $W(4) \approx e' \Sigma^{-1} T$, and its operating characteristics are similar to those of test (1). Therefore, there is no single $c$ which would make the test $W(c)$ optimal for a broad class of alternatives.

Here, we present a simple procedure for testing $H_0$ against $H_1$, which chooses $c$ in (5) automatically and objectively from a reasonably large interval $[0, \tau]$. Note that $W(c)$ is not normally distributed under $H_0$, but its null distribution can be estimated easily by generating $M$ (a large number) independent $Z = (Z_1, \ldots, Z_K)'$ from the normal distribution with mean 0 and variance-covariance matrix $\Gamma$. For each realized $Z$, we compute $W(c)$ as a function of $c$. This establishes a reference set $D$ consisting of $M$ realized $\{W(c);\ 0 \le c \le \tau\}$ under the null hypothesis. Now, let $w(c)$ be the value of $W(c)$ for the observed data; then its p-value $p(c) = \mathrm{pr}(W(c) > w(c))$ can be estimated based on the reference set $D$. Let $p_m = \min_{0 \le c \le \tau} p(c)$. A small $p_m$ suggests a rejection of $H_0$.
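As a rough illustration, the statistic (5), the Monte Carlo reference set $D$ and the minimum p-value $p_m$ can be sketched in Python with NumPy. The sketch also calibrates $p_m$ against fresh null draws, because a minimum taken over $c$ is not by itself a valid p-value. The function names (`w_stat`, `tail_prob`, `adaptive_test`), the grid step of 0.1, the default `tau`, and the sizes `M` and `N` are assumptions made for this example; none of this is code from the paper.

```python
import numpy as np

def w_stat(z, gamma_inv, c_grid):
    """W(c) = (max(Z_1,c), ..., max(Z_K,c)) Gamma^{-1} Z for every c in c_grid.

    z has shape (n, K) or (K,); the result has shape (n, len(c_grid)).
    """
    z = np.atleast_2d(z)
    u = z @ gamma_inv  # rows are Z' Gamma^{-1} (Gamma is symmetric)
    zc = np.maximum(z[:, None, :], c_grid[None, :, None])  # (n, G, K)
    return np.einsum("ngk,nk->ng", zc, u)

def tail_prob(ref_sorted, w):
    """Estimate pr(W(c) > w(c)) at each grid point from a reference set.

    ref_sorted: (M, G), each column sorted ascending; w: (n, G).
    """
    m, g = ref_sorted.shape
    p = np.empty(w.shape)
    for j in range(g):
        n_le = np.searchsorted(ref_sorted[:, j], w[:, j], side="right")
        p[:, j] = (m - n_le) / m  # share of reference values strictly above w
    return p

def adaptive_test(z_obs, gamma, tau=4.0, step=0.1, M=20000, N=5000, seed=0):
    """Return (p_m, calibrated p-value, c attaining the minimum)."""
    rng = np.random.default_rng(seed)
    c_grid = np.arange(0.0, tau + step / 2, step)  # hypothetical grid on [0, tau]
    gamma_inv = np.linalg.inv(gamma)
    zero = np.zeros(gamma.shape[0])

    # Reference set D: M null realizations of {W(c)} on the grid.
    ref = w_stat(rng.multivariate_normal(zero, gamma, size=M), gamma_inv, c_grid)
    ref.sort(axis=0)

    # Observed p(c) and its minimum p_m.
    p_c = tail_prob(ref, w_stat(z_obs, gamma_inv, c_grid))[0]
    p_m = p_c.min()

    # Null distribution of the minimum, from N fresh draws scored against D.
    fresh = w_stat(rng.multivariate_normal(zero, gamma, size=N), gamma_inv, c_grid)
    P_m = tail_prob(ref, fresh).min(axis=1)

    # "<=" instead of "<" is a conservative choice of this sketch, because
    # finite M produces ties among the estimated p-values.
    return p_m, float(np.mean(P_m <= p_m)), float(c_grid[p_c.argmin()])
```

Calling `adaptive_test(z, gamma)` with the observed standardized statistics and their correlation matrix returns the uncalibrated minimum, the calibrated p-value, and the grid value of $c$ at which the minimum occurred. Sorting the reference set once and using `searchsorted` keeps the cost of scoring each fresh draw logarithmic in `M`.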
Note that we essentially choose the test which gives the smallest p-value among all the tests $\{W(c),\ 0 \le c \le \tau\}$. Naturally, $p_m$ is not the correct p-value for such a test. To establish the cut-off points of our test procedure, let $P(c)$ and $P_m$ be the random counterparts of $p(c)$ and $p_m$, respectively. The null distribution of $P_m$ can be estimated by generating $N$ (a large number) fresh independent $Z$ from $N(0, \Gamma)$. For each realized $Z$, we compute $\{W(c),\ 0 \le c \le \tau\}$, and use the reference set $D$ to obtain the corresponding $P(c)$ and $P_m$. The null distribution of $P_m$ can be estimated using those $N$ realizations, and the bona fide p-value $\mathrm{pr}(P_m < p_m)$ of the test can then be estimated accordingly.

Note that there is a parameter $\tau$ in the above proposal. Through an extensive numerical study, we find that the operating characteristics of the test are quite stable with $\tau \ge 4$. In practice, we recommend the test with $\tau = 4$.

3. EXAMPLE

Let us use an example of a linkage study to illustrate the above test procedure. Suppose that we are interested in detecting a linkage between a genetic marker and a complex disease trait with $K$ intermediate phenotypic variables from each study subject. Haseman and Elston (1972) proposed a simple linear regression model for testing linkage with sib-pair observations and a single phenotypic variable per subject. Specifically, for the $k$th trait and a typical sib-pair, let $X_k$ and $\pi$ be the squared trait difference and the mean genetic sharing identical by descent at the marker, respectively, $k = 1, \ldots, K$. Then

$$E(X_k) = \alpha_{1k} - \beta_k \pi, \tag{6}$$

where $E(X)$ is the expected value of $X$. A large value of the estimate for $\beta_k$ suggests a linkage between the $k$th trait and the marker. Drigalenko (1998) and Elston et al. (2000) suggested replacing the response variable in the Haseman and Elston model with the squared mean-corrected trait sum $Y_k$ for the sib-pair. That is,

$$E(Y_k) = \alpha_{2k} + \beta_k \pi. \tag{7}$$
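To make regressions (6) and (7) concrete, the following Python sketch simulates sib pairs for a single trait and fits both models by ordinary least squares. The data-generating model is a toy assumption of this example, not the GAW12 model: each sib has a standardized trait, and the sib-sib correlation is taken to be linear in $\pi$ with slope `h`, which implies $\beta_k = 2h$ in both (6) and (7). The function names and parameter values are hypothetical.

```python
import numpy as np

def simulate_sib_pairs(n, h=0.3, rho0=0.3, seed=0):
    """Toy sib-pair data: IBD sharing pi and two standardized trait values."""
    rng = np.random.default_rng(seed)
    # Mean IBD sharing at the marker: 0, 1/2 or 1 with probabilities 1/4, 1/2, 1/4.
    pi = rng.choice([0.0, 0.5, 1.0], size=n, p=[0.25, 0.5, 0.25])
    rho = rho0 + h * (pi - 0.5)  # assumed sib correlation, increasing in pi
    e1 = rng.standard_normal(n)
    e2 = rng.standard_normal(n)
    y1 = e1
    y2 = rho * e1 + np.sqrt(1.0 - rho**2) * e2  # corr(y1, y2 | pi) = rho
    return pi, y1, y2

def he_slopes(pi, y1, y2):
    """Estimates of beta_k from model (6) and from model (7)."""
    x = (y1 - y2) ** 2                    # squared trait difference X_k
    m = np.mean(np.concatenate([y1, y2]))
    y = (y1 + y2 - 2.0 * m) ** 2          # squared mean-corrected trait sum Y_k
    slope_x = np.polyfit(pi, x, 1)[0]     # estimates -beta_k in (6)
    slope_y = np.polyfit(pi, y, 1)[0]     # estimates +beta_k in (7)
    return -slope_x, slope_y

pi, y1, y2 = simulate_sib_pairs(50000)
b_diff, b_sum = he_slopes(pi, y1, y2)
print(f"beta from (6): {b_diff:.3f}, beta from (7): {b_sum:.3f}")
```

Under this toy model both estimates target the same value, $2h$, which is the sense in which the two regressions share one coefficient. The sketch stops at the two separate estimates and does not attempt to reproduce how Xu et al. (2000) weight them.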
Note that these two models share the same regression coefficient $\beta_k$. Recently, Xu et al. (2000) used the idea of Wei and Johnson (1985) and proposed a simple, unified estimation procedure for $\beta_k$ by linearly combining the regression coefficient estimates from the above two models. Let the resulting estimator be denoted by $T_k$, $k = 1, \ldots, K$. The covariance matrix estimate $\Sigma$ for $T$ can be obtained easily through some elementary probability arguments (Xu et al., 2001).

Now, we apply our new test procedure for linkage to Problem 2 of GAW12 (Wijsman et al., 2001). The data for this problem were simulated for a common oligogenic disease with five intermediate quantitative traits Q1-5, which were generated through a random effects model with various combinations of five specific genetic loci MG1-5, and two environmental factors, gender and age. The relative contributions of MG1-5 to the total variance of Q1-5 are given in Table 1. Note that there is no single marker which regulates all five traits.

[Table 1. Relative genetic variance components of traits Q1-5 for GAW12 simulated data set. Columns: Q1, Q2, Q3, Q4, Q5; rows: MG1 to MG5. Cell values are not available in this transcription.]

For this simulated GAW study, there are 50 replicates of 23 large pedigrees from a general population. For illustration of our proposal, we randomly selected 500 independent sibling pairs from the above population. Prior to our linkage analysis using models (6) and (7), the trait values Q1-5 were adjusted for the two environmental factors, age and gender. Moreover, each adjusted trait value was standardized by subtracting its empirical mean and dividing by its sample standard deviation.

In Table 2, we report p-values for all the univariate tests based on $T_k$ for linkage (see Xu et al. (2000)), the standard linear combination (SLC) test based on (1) (O'Brien, 1984; Wei and Johnson, 1985; Xu et al., 2001) and our new test at these five genetic loci. Compared with the SLC test, the new test gives substantially smaller p-values for all the markers MG1-5. Moreover, for MG3, which regulates three phenotypic traits, the p-value of the new test is uniformly smaller than those based on the univariate tests. All the p-values in this example are estimated based on $M = N = 10^6$ realizations with $\tau = 4$. Each $p_m$ was obtained by minimizing $p(c)$ over a grid of equally spaced values on the interval $[0, 4]$.

[Table 2. p-values for univariate and combination tests at MG1-5 with GAW12 data. Columns: Q1, Q2, Q3, Q4, Q5, SLC, NEW; rows: MG1 to MG5. SLC: standard linear combination test. Cell values are not available in this transcription.]

4. OPERATING CHARACTERISTICS FOR THE NEW TEST

We conducted an extensive numerical study to examine whether our proposal is more powerful than the most commonly used SLC test (1). To include a broad class of alternatives in our comparisons, we let $\beta$ be a vector of independent copies of the uniform random variable defined on the interval $(0, 3)$. For each given $\beta$, we generate an observed vector $T$ of $K$ test statistics with a given covariance matrix $\Sigma$. The reference set $D$ and the estimated null distribution of $P_m$ are based on $M = N = 10^6$ independent $Z$. The p-value of the SLC test is computed from the normal approximation. For all the cases we studied, the new test is almost uniformly better than its standard counterpart.

In Figure 1, we present results for two scenarios with $K = 5$ and $\Sigma$ being a correlation matrix with equal correlation $\eta$. The horizontal axis of the plot is the log-transformed p-value for the SLC test, and the vertical axis is the counterpart for the new test. Each dot in the figure represents a pair of log-transformed p-values for a specific alternative $\beta$ generated as described above. Most dots are below the 45° line, indicating that the new test is more powerful than the standard one. For many cases, the improvements are quite substantial. For cases in which the new test is not better than the standard one with respect to their p-values, the corresponding dots in the figure are quite close to the 45° line, indicating that there is no practical difference between these two tests.

[Fig. 1. Scatter diagrams of 500 pairs of log-transformed p-values for the standard linear combination (SLC) and the new tests. (a) Correlation $\eta = 0.3$; (b) $\eta = 0.7$.]

5. REMARKS

The proposed test procedure is an effective screening device for linkage and/or association studies in the presence of multiple traits which are known to relate to a complex trait of interest. For a given genetic marker, it is very likely that not every trait is related to the marker. Our new test procedure objectively and automatically puts more weight on those traits which have large observed correlations with the marker of interest.
On the other hand, the standard linear combination test advocated by O'Brien (1984) and Wei and Johnson (1985) does not utilize such valuable information for an overall evaluation of linkage or association.

Our test can be generalized to the case with two-sided alternative hypotheses. For example, one possible modification is to replace $Z_k$ and $\Gamma$ in (5) by $|Z_k|$ and the identity matrix, respectively; this results in a test statistic $\sum_k \max\{|Z_k|, c\}\,|Z_k|$. When $c = 0$ in (5), the resulting test is very similar to a chi-square test proposed by Lange et al. (2002) for the transmission/disequilibrium test with multiple phenotypic variables. Specifically, they considered the standard quadratic form $T' \Sigma^{-1} T$ as the test statistic. This Wald-type test is omnibus and is designed for testing against a general two-sided alternative.

ACKNOWLEDGEMENTS

We are grateful to an Associate Editor for insightful comments on the paper. GAW12 is supported by grant GM.

REFERENCES

AMOS, C. I., ELSTON, R. C., BONNEY, G. E., KEATS, B. J. AND BERENSON, G. (1990). A multivariate method for detecting genetic linkage, with application to a pedigree with an adverse lipoprotein phenotype. American Journal of Human Genetics 47.

AMOS, C. I. AND LAING, A. E. (1993). A comparison of univariate and multivariate tests for genetic linkage. Genetic Epidemiology 10.

CALINSKI, T., KACZMAREK, Z., KRAJEWSKI, P., FROVA, C. AND SARI-GORLA, M. (2000). A multivariate approach to the problem of QTL localization. Heredity 84.

DRIGALENKO, E. (1998). How sib pairs reveal linkage. American Journal of Human Genetics 63.

ELSTON, R. C., BUXBAUM, S., JACOBS, K. B. AND OLSON, J. M. (2000). Haseman and Elston revisited. Genetic Epidemiology 19.

HACKETT, C. A., MEYER, R. C. AND THOMAS, W. T. (2001). Multi-trait QTL mapping in barley using multivariate regression. Genetics Research 77.

HASEMAN, J. K. AND ELSTON, R. C. (1972). The investigation of linkage between a quantitative trait and a marker locus. Behavior Genetics 2.

ITURRIA, S. J. AND BLANGERO, J. (2000). An EM algorithm for obtaining maximum likelihood estimates in the multi-phenotype variance components linkage model. Annals of Human Genetics 64.

JIANG, C. AND ZENG, Z. B. (1995). Multiple trait analysis of genetic mapping for quantitative trait loci. Genetics 140.

KOROL, A. B., RONIN, Y. I. AND KIRZHNER, V. M. (1995). Interval mapping of quantitative trait loci employing correlated trait complexes. Genetics 140.

LAIRD, N. M. AND WARE, J. H. (1982). Random-effects models for longitudinal data. Biometrics 38.

LANGE, C., SILVERMAN, E., WEISS, S., XU, X. AND LAIRD, N. M. (2002). A multivariate transmission disequilibrium test. Biostatistics, in press.

LIANG, K. Y. AND ZEGER, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73.

MCCULLOCH, C. E. AND SEARLE, S. R. (2001). Generalized, Linear, and Mixed Models. New York: Wiley.

O'BRIEN, P. C. (1984). Procedures for comparing samples with multiple endpoints. Biometrics 40.

WEI, L. J. AND JOHNSON, W. E. (1985). Combining dependent tests with incomplete repeated measurements. Biometrika 72.

WIJSMAN, E. M. et al. (2001). Analysis of complex genetic traits: Applications to asthma and simulated data. Genetic Epidemiology Supp. 1, S1-S853.

WILLIAMS, J. T., VAN EERDEWEGH, P., ALMASY, L. AND BLANGERO, J. (1999). Joint multipoint linkage analysis of multivariate qualitative and quantitative traits. I. Likelihood formulation and simulation results. American Journal of Human Genetics 65.

XU, X., PALMER, L. J., HORVATH, S. AND WEI, L. J. (2001). Combining multiple phenotypic traits optimally for detecting linkage with sib-pair observations. Genetic Epidemiology Supp. 1, S148-S153.

XU, X., WEISS, S., XU, X. AND WEI, L. J. (2000). A unified Haseman-Elston method for testing linkage with quantitative traits. American Journal of Human Genetics 67.

[Received March 5, 2002; first revision March 28, 2002; second revision June 18, 2002; accepted for publication June 23, 2002]
Sample Size and Power Considerations for Longitudinal Studies Outline Quantities required to determine the sample size in longitudinal studies Review of type I error, type II error, and power For continuous
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture16: Population structure and logistic regression I Jason Mezey jgm45@cornell.edu April 11, 2017 (T) 8:40-9:55 Announcements I April
More informationIndependent Increments in Group Sequential Tests: A Review
Independent Increments in Group Sequential Tests: A Review KyungMann Kim kmkim@biostat.wisc.edu University of Wisconsin-Madison, Madison, WI, USA July 13, 2013 Outline Early Sequential Analysis Independent
More informationA new simple method for improving QTL mapping under selective genotyping
Genetics: Early Online, published on September 22, 2014 as 10.1534/genetics.114.168385 A new simple method for improving QTL mapping under selective genotyping Hsin-I Lee a, Hsiang-An Ho a and Chen-Hung
More informationMultipoint Quantitative-Trait Linkage Analysis in General Pedigrees
Am. J. Hum. Genet. 6:9, 99 Multipoint Quantitative-Trait Linkage Analysis in General Pedigrees Laura Almasy and John Blangero Department of Genetics, Southwest Foundation for Biomedical Research, San Antonio
More informationSNP Association Studies with Case-Parent Trios
SNP Association Studies with Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health September 3, 2009 Population-based Association Studies Balding (2006). Nature
More informationSemiparametric Mixed Effects Models with Flexible Random Effects Distribution
Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,
More informationAn Approximate Test for Homogeneity of Correlated Correlation Coefficients
Quality & Quantity 37: 99 110, 2003. 2003 Kluwer Academic Publishers. Printed in the Netherlands. 99 Research Note An Approximate Test for Homogeneity of Correlated Correlation Coefficients TRIVELLORE
More informationPower and sample size calculations for designing rare variant sequencing association studies.
Power and sample size calculations for designing rare variant sequencing association studies. Seunggeun Lee 1, Michael C. Wu 2, Tianxi Cai 1, Yun Li 2,3, Michael Boehnke 4 and Xihong Lin 1 1 Department
More informationChapter 2. Review of basic Statistical methods 1 Distribution, conditional distribution and moments
Chapter 2. Review of basic Statistical methods 1 Distribution, conditional distribution and moments We consider two kinds of random variables: discrete and continuous random variables. For discrete random
More informationA Robust Test for Two-Stage Design in Genome-Wide Association Studies
Biometrics Supplementary Materials A Robust Test for Two-Stage Design in Genome-Wide Association Studies Minjung Kwak, Jungnam Joo and Gang Zheng Appendix A: Calculations of the thresholds D 1 and D The
More informationRepeated ordinal measurements: a generalised estimating equation approach
Repeated ordinal measurements: a generalised estimating equation approach David Clayton MRC Biostatistics Unit 5, Shaftesbury Road Cambridge CB2 2BW April 7, 1992 Abstract Cumulative logit and related
More informationParametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1
Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson
More informationMore Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction
Sankhyā : The Indian Journal of Statistics 2007, Volume 69, Part 4, pp. 700-716 c 2007, Indian Statistical Institute More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order
More informationSample size calculations for logistic and Poisson regression models
Biometrika (2), 88, 4, pp. 93 99 2 Biometrika Trust Printed in Great Britain Sample size calculations for logistic and Poisson regression models BY GWOWEN SHIEH Department of Management Science, National
More informationTutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis. Acknowledgements:
Tutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis Anna E. Barón, Keith E. Muller, Sarah M. Kreidler, and Deborah H. Glueck Acknowledgements: The project was supported
More informationThe equivalence of the Maximum Likelihood and a modified Least Squares for a case of Generalized Linear Model
Applied and Computational Mathematics 2014; 3(5): 268-272 Published online November 10, 2014 (http://www.sciencepublishinggroup.com/j/acm) doi: 10.11648/j.acm.20140305.22 ISSN: 2328-5605 (Print); ISSN:
More informationIntroduction to QTL mapping in model organisms
Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA
More informationHeritability estimation in modern genetics and connections to some new results for quadratic forms in statistics
Heritability estimation in modern genetics and connections to some new results for quadratic forms in statistics Lee H. Dicker Rutgers University and Amazon, NYC Based on joint work with Ruijun Ma (Rutgers),
More informationTHE data in the QTL mapping study are usually composed. A New Simple Method for Improving QTL Mapping Under Selective Genotyping INVESTIGATION
INVESTIGATION A New Simple Method for Improving QTL Mapping Under Selective Genotyping Hsin-I Lee,* Hsiang-An Ho,* and Chen-Hung Kao*,,1 *Institute of Statistical Science, Academia Sinica, Taipei 11529,
More informationMARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES
REVSTAT Statistical Journal Volume 13, Number 3, November 2015, 233 243 MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES Authors: Serpil Aktas Department of
More informationLikelihood ratio testing for zero variance components in linear mixed models
Likelihood ratio testing for zero variance components in linear mixed models Sonja Greven 1,3, Ciprian Crainiceanu 2, Annette Peters 3 and Helmut Küchenhoff 1 1 Department of Statistics, LMU Munich University,
More informationMarginal Screening and Post-Selection Inference
Marginal Screening and Post-Selection Inference Ian McKeague August 13, 2017 Ian McKeague (Columbia University) Marginal Screening August 13, 2017 1 / 29 Outline 1 Background on Marginal Screening 2 2
More informationLecture WS Evolutionary Genetics Part I 1
Quantitative genetics Quantitative genetics is the study of the inheritance of quantitative/continuous phenotypic traits, like human height and body size, grain colour in winter wheat or beak depth in
More informationCase-Control Association Testing. Case-Control Association Testing
Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits. Technological advances have made it feasible to perform case-control association studies
More informationGOODNESS-OF-FIT FOR GEE: AN EXAMPLE WITH MENTAL HEALTH SERVICE UTILIZATION
STATISTICS IN MEDICINE GOODNESS-OF-FIT FOR GEE: AN EXAMPLE WITH MENTAL HEALTH SERVICE UTILIZATION NICHOLAS J. HORTON*, JUDITH D. BEBCHUK, CHERYL L. JONES, STUART R. LIPSITZ, PAUL J. CATALANO, GWENDOLYN
More informationSTAT 536: Genetic Statistics
STAT 536: Genetic Statistics Tests for Hardy Weinberg Equilibrium Karin S. Dorman Department of Statistics Iowa State University September 7, 2006 Statistical Hypothesis Testing Identify a hypothesis,
More informationA Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data
A Multivariate Two-Sample Mean Test for Small Sample Size and Missing Data Yujun Wu, Marc G. Genton, 1 and Leonard A. Stefanski 2 Department of Biostatistics, School of Public Health, University of Medicine
More informationExpression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia
Expression QTLs and Mapping of Complex Trait Loci Paul Schliekelman Statistics Department University of Georgia Definitions: Genes, Loci and Alleles A gene codes for a protein. Proteins due everything.
More informationTesting Homogeneity Of A Large Data Set By Bootstrapping
Testing Homogeneity Of A Large Data Set By Bootstrapping 1 Morimune, K and 2 Hoshino, Y 1 Graduate School of Economics, Kyoto University Yoshida Honcho Sakyo Kyoto 606-8501, Japan. E-Mail: morimune@econ.kyoto-u.ac.jp
More informationTesting for Homogeneity in Genetic Linkage Analysis
Testing for Homogeneity in Genetic Linkage Analysis Yuejiao Fu, 1, Jiahua Chen 2 and John D. Kalbfleisch 3 1 Department of Mathematics and Statistics, York University Toronto, ON, M3J 1P3, Canada 2 Department
More informationLatent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent
Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary
More informationShu Yang and Jae Kwang Kim. Harvard University and Iowa State University
Statistica Sinica 27 (2017), 000-000 doi:https://doi.org/10.5705/ss.202016.0155 DISCUSSION: DISSECTING MULTIPLE IMPUTATION FROM A MULTI-PHASE INFERENCE PERSPECTIVE: WHAT HAPPENS WHEN GOD S, IMPUTER S AND
More information25 : Graphical induced structured input/output models
10-708: Probabilistic Graphical Models 10-708, Spring 2013 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Meghana Kshirsagar (mkshirsa), Yiwen Chen (yiwenche) 1 Graph
More informationModeling IBD for Pairs of Relatives. Biostatistics 666 Lecture 17
Modeling IBD for Pairs of Relatives Biostatistics 666 Lecture 7 Previously Linkage Analysis of Relative Pairs IBS Methods Compare observed and expected sharing IBD Methods Account for frequency of shared
More informationAnalysis of Longitudinal Data. Patrick J. Heagerty PhD Department of Biostatistics University of Washington
Analysis of Longitudinal Data Patrick J Heagerty PhD Department of Biostatistics University of Washington Auckland 8 Session One Outline Examples of longitudinal data Scientific motivation Opportunities
More informationOptimal Allele-Sharing Statistics for Genetic Mapping Using Affected Relatives
Genetic Epidemiology 16:225 249 (1999) Optimal Allele-Sharing Statistics for Genetic Mapping Using Affected Relatives Mary Sara McPeek* Department of Statistics, University of Chicago, Chicago, Illinois
More informationApproximate Test for Comparing Parameters of Several Inverse Hypergeometric Distributions
Approximate Test for Comparing Parameters of Several Inverse Hypergeometric Distributions Lei Zhang 1, Hongmei Han 2, Dachuan Zhang 3, and William D. Johnson 2 1. Mississippi State Department of Health,
More informationBiometrika Trust. Biometrika Trust is collaborating with JSTOR to digitize, preserve and extend access to Biometrika.
Biometrika Trust An Improved Bonferroni Procedure for Multiple Tests of Significance Author(s): R. J. Simes Source: Biometrika, Vol. 73, No. 3 (Dec., 1986), pp. 751-754 Published by: Biometrika Trust Stable
More informationBTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014
BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014 Homework 4 (version 3) - posted October 3 Assigned October 2; Due 11:59PM October 9 Problem 1 (Easy) a. For the genetic regression model: Y
More informationApplication of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption
Application of Parametric Homogeneity of Variances Tests under Violation of Classical Assumption Alisa A. Gorbunova and Boris Yu. Lemeshko Novosibirsk State Technical University Department of Applied Mathematics,
More informationLecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012
Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed
More informationA note on profile likelihood for exponential tilt mixture models
Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential
More informationPrevious lecture. Single variant association. Use genome-wide SNPs to account for confounding (population substructure)
Previous lecture Single variant association Use genome-wide SNPs to account for confounding (population substructure) Estimation of effect size and winner s curse Meta-Analysis Today s outline P-value
More informationA COMPARISON OF PRINCIPLE COMPONENT ANALYSIS AND FACTOR ANALYSIS FOR QUANTITATIVE PHENOTYPES ON FAMILY DATA. Xiaojing Wang
A COMPARISON OF PRINCIPLE COMPONENT ANALYSIS AND FACTOR ANALYSIS FOR QUANTITATIVE PHENOTYPES ON FAMILY DATA by Xiaojing Wang BS, Huanzhong University of Science and Technology, China 1998 MS, College of
More informationMajor Genes, Polygenes, and
Major Genes, Polygenes, and QTLs Major genes --- genes that have a significant effect on the phenotype Polygenes --- a general term of the genes of small effect that influence a trait QTL, quantitative
More informationIntroduction to QTL mapping in model organisms
Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics and Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman [ Teaching Miscellaneous lectures]
More informationComputational Systems Biology: Biology X
Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#7:(Mar-23-2010) Genome Wide Association Studies 1 The law of causality... is a relic of a bygone age, surviving, like the monarchy,
More information