High-dimensional regression and block-wise epistasis for association studies

1 High-dimensional regression and block-wise epistasis for association studies
V. Stanislas, C. Dalmasso, C. Ambroise
Laboratoire de Mathématiques et Modélisation d'Évry, "Statistique et Génome"

2 Summary
1 GWAS and blocks of linkage disequilibrium: Genome-Wide Association Studies; Blocks of linkage disequilibrium; Hierarchical Clustering with Adjacency Constraints; How to improve?; Some computation times
2 Epistasis: The Gene-Gene Eigen Epistasis; Modeling approach; Simulations
3 Application: Ankylosing Spondylitis; First results


4 Single-Nucleotide Polymorphism (SNP) Data
SNPs account for 90% of human genetic variation.
In the human genome, SNPs with an allele frequency greater than 1% occur every 300 base pairs on average.
Two SNPs out of three substitute cytosine with thymine.
Figure: SNP (Wikipedia)

5 SNP Data
Four possible C/T configurations, coded as $X_{ij} \in \{0, 1, 2\}$ (individual i at locus j):
             Mother C   Mother T
  Father C      0          1
  Father T      1          2
High dimension: n individuals and p markers give the genotype matrix $X = (X_{ij})_{1 \le i \le n,\ 1 \le j \le p}$.
Up to a few million SNPs can be genotyped, so the number of variables far exceeds the number of individuals ($p \gg n$).

6 Genome-Wide Association Studies
GWAS characteristics:
Objective: find associations between genetic markers ($\mathrm{SNP}_{ij} \in \{0, 1, 2\}$) and a phenotypic trait ($Y_i \in \{0, 1\}$ or $Y_i \in \mathbb{R}$).
Figure: genetic markers (SNPs). Image credit: © Pasieka, Science Photo Library

7 Genome-Wide Association Studies
Generalized Linear Model:
$g(E[Y_i \mid x_i]) = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}, \quad i = 1, \dots, n$
n: number of individuals
p: number of covariates
$Y_i$: response for individual i
$x_{.j}$: observations for covariate j (coded 0, 1 or 2)
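As an illustration of the generalized linear model above, here is a minimal sketch for a binary phenotype, assuming a logit link for g and a plain Newton-Raphson fit. The function name `fit_logistic` and the simulated data are hypothetical and are not taken from the slides.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit g(E[Y|x]) = beta_0 + sum_j beta_j x_j with g = logit (Newton-Raphson)."""
    A = np.column_stack([np.ones(len(y)), X])   # add the intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(A @ beta)))  # fitted probabilities
        w = mu * (1.0 - mu)                     # variance weights
        grad = A.T @ (y - mu)                   # score vector
        hess = A.T @ (A * w[:, None])           # Fisher information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated example (assumption): one SNP coded 0/1/2 with a true effect of 0.8.
rng = np.random.default_rng(0)
x = rng.binomial(2, 0.3, size=5000).astype(float)
prob = 1.0 / (1.0 + np.exp(-(-1.0 + 0.8 * x)))
y = rng.binomial(1, prob).astype(float)
beta_hat = fit_logistic(x[:, None], y)
print(beta_hat)
```

With p much larger than n, as in a real GWAS, this unpenalized fit is not usable on all markers at once, which is the motivation for the penalized and block-wise approaches in the rest of the talk.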

8 Genome-Wide Association Studies
SNP analysis: differences between cases and controls at a specific SNP.
GWAS limits: reproducibility, heritability.
Data characteristics: structured data, high dimension ($p \gg n$), small effects.

9 The LD measures
Linkage disequilibrium: non-random association of alleles at two or more loci. It depends on the difference between the observed allelic frequencies and those expected under a model where alleles are distributed independently at random.
Computation: let $Z_j$ be the indicator of the presence of the minor allele for SNP j, with $Z_j \sim \mathcal{B}(p_j)$.
$D(j, k) = p_{jk} - p_j p_k = E[Z_j Z_k] - p_j p_k = \mathrm{cov}(Z_j, Z_k)$
$r^2(j, k) = \mathrm{corr}(Z_j, Z_k)^2$ or $D'(j, k) = D(j, k) / D_{\max}(j, k)$
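The three LD measures can be sketched as follows, assuming the allele indicators $Z_j$ and $Z_k$ are observed directly on phased haplotypes (which is not the case with genotype data). The function name `ld_measures` and the sign-preserving convention for D' are assumptions made for this example.

```python
import numpy as np

def ld_measures(z_j, z_k):
    """Return (D, r^2, D') from two 0/1 minor-allele indicator vectors."""
    z_j = np.asarray(z_j, dtype=float)
    z_k = np.asarray(z_k, dtype=float)
    p_j, p_k = z_j.mean(), z_k.mean()
    p_jk = np.mean(z_j * z_k)              # frequency of carrying both minor alleles
    d = p_jk - p_j * p_k                   # D = cov(Z_j, Z_k)
    denom = p_j * (1 - p_j) * p_k * (1 - p_k)
    r2 = d ** 2 / denom if denom > 0 else float("nan")
    # D_max depends on the sign of D (Lewontin's normalisation).
    if d >= 0:
        d_max = min(p_j * (1 - p_k), (1 - p_j) * p_k)
    else:
        d_max = min(p_j * p_k, (1 - p_j) * (1 - p_k))
    d_prime = d / d_max if d_max > 0 else float("nan")
    return d, r2, d_prime

print(ld_measures([1, 1, 0, 0], [1, 1, 0, 0]))  # two perfectly associated loci
print(ld_measures([1, 1, 0, 0], [1, 0, 1, 0]))  # two independent loci
```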

10 How to estimate LD?
Observed genotype counts for two SNPs:
         vv   vV   VV
  uu      a    b    c
  uU      d    e    f
  UU      g    h    i
Unobserved haplotype counts:
          v    V
  u       α    β
  U       γ    δ
Only the genotype table is observed. The counts α, β, γ, δ are estimated by solving a system of equations, e.g. $\alpha = 2a + b + d + pe$, with p the "probability" of the haplotype pair (uv, UV).
Procedure: estimate p, then (α, β, γ, δ), and finally $D = p_{UV} - p_U p_V$.
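One common way to solve this system is an EM-style fixed-point iteration on p. The sketch below assumes random mating, so that the two phases of a double heterozygote are weighted by products of haplotype frequencies. The function name `estimate_haplotype_ld` and the starting value p = 0.5 are assumptions, and the slides do not say which solver the authors use.

```python
def estimate_haplotype_ld(counts, n_iter=100, tol=1e-12):
    """Estimate haplotype counts and D from a 3x3 genotype table.

    counts[row][col]: rows are uu, uU, UU; columns are vv, vV, VV.
    """
    (a, b, c), (d, e, f), (g, h, k) = counts
    n_hap = 2 * (a + b + c + d + e + f + g + h + k)
    # Contributions of the genotypes whose phase is unambiguous.
    base_uv, base_uV = 2 * a + b + d, 2 * c + b + f
    base_Uv, base_UV = 2 * g + d + h, 2 * k + f + h
    p = 0.5  # probability that a double heterozygote carries (uv, UV)
    for _ in range(n_iter):
        alpha, delta = base_uv + p * e, base_UV + p * e
        beta, gamma = base_uV + (1 - p) * e, base_Uv + (1 - p) * e
        cis = (alpha / n_hap) * (delta / n_hap)   # weight of phase (uv, UV)
        trans = (beta / n_hap) * (gamma / n_hap)  # weight of phase (uV, Uv)
        new_p = cis / (cis + trans) if cis + trans > 0 else 0.5
        converged = abs(new_p - p) < tol
        p = new_p
        if converged:
            break
    alpha, delta = base_uv + p * e, base_UV + p * e
    beta, gamma = base_uV + (1 - p) * e, base_Uv + (1 - p) * e
    p_U = (gamma + delta) / n_hap
    p_V = (beta + delta) / n_hap
    ld = delta / n_hap - p_U * p_V                # D = p_UV - p_U p_V
    return (alpha, beta, gamma, delta), ld

hap_counts, ld = estimate_haplotype_ld([[10, 0, 0], [0, 20, 0], [0, 0, 10]])
print(hap_counts, ld)
```

The four estimated counts always sum to twice the number of individuals, which is a quick sanity check on the bookkeeping.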

11 The LD block structure
Figure: the $r^2$ coefficients among the first 50 SNPs of chromosome 22 (Dalmasso et al. 2008), displayed as a matrix (axes: row, column). LD is structured in blocks.

12 Hierarchical Clustering with Adjacency Constraints
Figure: cluster dendrogram (vertical axis: height); panel labels: corr, LD.

13 Block-Wise Approach using Linkage Disequilibrium (BALD)
1 Hierarchical clustering of the SNPs with adjacency constraint, using the LD similarity.
2 Estimation of the optimal number of groups using the Gap statistic (Tibshirani et al., 2001).

14 The h-band
All similarity coefficients outside the band of width h are zero. This leads to:
a $p \times h$ similarity matrix,
a hierarchical clustering with adjacency constraint.

15 A pseudocode
Data: $X \in \{0, 1, 2\}^{n \times p}$, Sim
C ← {C_i = {X_{.i}}, i ∈ 1, ..., p}   /* clusters = singletons */
D ← {1 − Sim(X_{.i}, X_{.(i+1)}), i ∈ 1, ..., p − 1}
for step = 1 to p − 1 do
    i* ← argmin over i ∈ {1, ..., p − step} of D(C_i, C_{i+1})
    C ← C \ {C_{i*}, C_{i*+1}} ∪ {C_{i*} ∪ C_{i*+1}}
    d_1 ← D(C_{i*−1}, C_{i*} ∪ C_{i*+1})
    d_2 ← D(C_{i*} ∪ C_{i*+1}, C_{i*+2})
    D ← D \ {D(C_{i*−1}, C_{i*}), D(C_{i*}, C_{i*+1}), D(C_{i*+1}, C_{i*+2})} ∪ {d_1, d_2}
end
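The loop above can be sketched in Python as follows. This is a naive version intended only to show the adjacency constraint: it assumes an average-linkage dissimilarity (one minus the mean similarity between two clusters) rather than the Ward distance, and it scans the list of distances instead of using a heap. The function name `adjacent_clustering` and the `n_clusters` stopping rule are hypothetical.

```python
import numpy as np

def adjacent_clustering(sim, n_clusters=1):
    """Agglomerative clustering where only neighbouring clusters may merge."""
    p = sim.shape[0]
    clusters = [[i] for i in range(p)]          # clusters = singletons

    def dist(a, b):
        # Assumed linkage: 1 - average similarity between the two clusters.
        return 1.0 - sim[np.ix_(a, b)].mean()

    # d[i] is the dissimilarity between clusters i and i + 1.
    d = [dist(clusters[i], clusters[i + 1]) for i in range(p - 1)]
    while len(clusters) > n_clusters:
        i = int(np.argmin(d))                   # closest pair of neighbours
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
        del d[i]                                # the merged pair no longer exists
        if i > 0:                               # d_1: left neighbour vs merged cluster
            d[i - 1] = dist(clusters[i - 1], clusters[i])
        if i < len(clusters) - 1:               # d_2: merged cluster vs right neighbour
            d[i] = dist(clusters[i], clusters[i + 1])
    return clusters

# Toy similarity matrix (assumption): two blocks of three adjacent SNPs.
sim = np.full((6, 6), 0.1)
sim[:3, :3] = 0.9
sim[3:, 3:] = 0.9
np.fill_diagonal(sim, 1.0)
print(adjacent_clustering(sim, n_clusters=2))
```

Each merge touches at most two neighbouring distances, which is the property that the heap-based implementation described in the later slides relies on.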

16 The Ward's distance
Ward Constrained Hierarchical Clustering:
$d(A, B) = \frac{n_A n_B}{n_A + n_B} \left( \frac{1}{n_A^2} S_{A,A} + \frac{1}{n_B^2} S_{B,B} - \frac{2}{n_A n_B} S_{A,B} \right)$
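A short numerical sketch of this formula, assuming $S_{A,B}$ denotes the sum of the similarities between the elements of A and the elements of B. The function name `ward_distance` is hypothetical. With a linear kernel as similarity, the value can be compared with the usual centroid form of Ward's criterion.

```python
import numpy as np

def ward_distance(sim, A, B):
    """Ward distance between clusters A and B computed from similarity sums."""
    n_a, n_b = len(A), len(B)
    s_aa = sim[np.ix_(A, A)].sum()   # S_{A,A}
    s_bb = sim[np.ix_(B, B)].sum()   # S_{B,B}
    s_ab = sim[np.ix_(A, B)].sum()   # S_{A,B}
    return (n_a * n_b / (n_a + n_b)) * (
        s_aa / n_a ** 2 + s_bb / n_b ** 2 - 2.0 * s_ab / (n_a * n_b)
    )

# Check against the centroid form on random points (assumption: linear kernel).
rng = np.random.default_rng(0)
pts = rng.normal(size=(5, 3))
kernel = pts @ pts.T
A, B = [0, 1], [2, 3, 4]
centroid_form = (2 * 3 / 5) * np.sum((pts[A].mean(axis=0) - pts[B].mean(axis=0)) ** 2)
print(ward_distance(kernel, A, B), centroid_form)
```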

17 The pencils trick: calculating $S_{A,A}$ and $S_{A,B}$

18 The pencils
Assessing $S_{A,A}$, $S_{B,B}$ and $S_{A,B}$ requires sums of LD measures within pencil-shaped areas defined by:
direction: right or left
depth: hloc
end point: lim
Two arrays of size $p \times h$ store the pencil sums.

19 The binary min-heap
Every node is less than or equal to each of its children.
The heap is uniquely represented by storing its level-order traversal in an array.
Given a position i: Parent(i) = ⌊i/2⌋, Left(i) = 2i, Right(i) = 2i + 1

20 DeleteMin
Time complexity: O(log(p))

21 InsertHeap
Time complexity: O(log(p))

22 BuildHeap
Data: an array A
Result: a min-heap H
for i = ⌊length(A)/2⌋ down to 1 do
    PercolDown(A, i)
end
Time complexity: O(p log(p)) as a simple upper bound (the bottom-up construction is in fact O(p)).
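The three heap operations can be sketched as below, keeping the 1-based indexing of the slides so that Parent(i) = i // 2, Left(i) = 2i and Right(i) = 2i + 1. Position 0 of the list is left unused, and the function names are hypothetical.

```python
def percolate_down(heap, i, size):
    """Move heap[i] down until it is no larger than both of its children."""
    while 2 * i <= size:
        child = 2 * i                               # left child
        if child + 1 <= size and heap[child + 1] < heap[child]:
            child += 1                              # right child is smaller
        if heap[i] <= heap[child]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child

def build_heap(values):
    """Bottom-up construction: percolate down from the last internal node."""
    heap = [None] + list(values)                    # index 0 is unused
    size = len(values)
    for i in range(size // 2, 0, -1):
        percolate_down(heap, i, size)
    return heap

def insert_heap(heap, value):
    """Append the value, then move it up while it is smaller than its parent."""
    heap.append(value)
    i = len(heap) - 1
    while i > 1 and heap[i] < heap[i // 2]:
        heap[i], heap[i // 2] = heap[i // 2], heap[i]
        i //= 2

def delete_min(heap):
    """Remove and return the root, then restore the heap property."""
    size = len(heap) - 1
    smallest = heap[1]
    heap[1] = heap[size]
    heap.pop()
    percolate_down(heap, 1, size - 1)
    return smallest

heap = build_heap([5, 3, 8, 1, 9, 2])
insert_heap(heap, 0)
drained = [delete_min(heap) for _ in range(7)]
print(drained)
```

In the clustering loop, the heap would hold the distances between neighbouring clusters, so that the closest pair is found in constant time and each update costs O(log(p)).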

23 Time complexity of some operations
                    findmin   insert      deletemin
  unordered array   O(p)      O(1)        O(p)
  binary heap       O(1)      O(log(p))   O(log(p))

24 cward in seconds...
Figure: The mean computation time t (in seconds) versus the number of markers p (from 0 to 5e+05) for the cward algorithm applied to randomly sampled SNP matrices. N = 100, h = 30 and t is averaged across 50 simulation runs.

25 Compared to a former implementation
Figure: The mean computation time t (in seconds) versus the number of markers p for the cward algorithm and an implementation without heaps. t is averaged across 20 simulation runs. Legend labels as transcribed: "pencils -heaps", "+pencils +heaps".

26 Scalable Hierarchical Clustering with pencils and binary heap
To sum up:
1 An $O(p^2)$ algorithm does not scale for GWA studies.
2 The Ward distance can be written in a simple way.
3 Space complexity of $O(ph)$ by using the pencils trick.
4 Time complexity of $O(ph) + O(p \log(p))$: $O(ph)$ for the two $p \times h$ arrays of pencil sums, and $O(p \log(p))$ for building the heap and for the insert/delete heap operations within the loop.
Ongoing work:
Currently implemented with a genotype matrix as input.
Can be generalized to any band similarity matrix.

27 Summary, part 2: Epistasis (The Gene-Gene Eigen Epistasis; Modeling approach; Simulations)

28 Epistasis
Definition: interaction of allele effects from different markers.
Existing methods: mainly SNP x SNP; some at the block (gene) scale.
Advantages of gene (or block) scale approaches:
results are biologically interpretable,
genetic effects may be easier to detect,
the number of variables is reduced.


30 Epistasis - Gene scale methods
Existing gene scale methods:
For two or a few genes:
PCA + logistic regression (He et al. 2011, Li et al. 2009, Zhang et al. 2008)
PLS + logistic regression (Wang T et al. 2009)
For a larger number of genes:
PCA + LASSO (D'Angelo et al. 2009)
PCA + pathway-guided penalized regression (Wang X et al. 2014)
Objectives: to develop a new gene scale method that
considers a more accurate definition of interaction variables,
is applicable to a large number of genes,
takes into account the group structure.


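32 Group modeling approach
Data layout: one row per individual (Ind 1, ..., Ind i, ...) and one column per SNP, followed by the phenotype column (Pheno). The columns are grouped by gene: $\mathrm{SNP}_{1,1}, \dots, \mathrm{SNP}_{1,p_1}$ for gene 1, up to $\mathrm{SNP}_{G,1}, \dots, \mathrm{SNP}_{G,p_G}$ for gene G.
Notation: $\mathrm{SNP}_{1,1} = X_{1,1}$; r and s denote two genes.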

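33 Group modeling approach
Model (main effects only):
$Y_i = \beta_0 + \underbrace{\sum_{g} \left( \sum_{k=1}^{p_g} \beta_{g,k} X^{g}_{ik} \right)}_{\text{Main effects}} + \epsilon_i$
$\beta = (\underbrace{\beta_{1,1}, \beta_{1,2}, \dots, \beta_{1,p_1}}_{\text{gene 1}}, \dots, \underbrace{\beta_{G,1}, \dots, \beta_{G,p_G}}_{\text{gene G}})^T$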

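34 Group modeling approach
Model (main effects and interactions):
$Y_i = \beta_0 + \underbrace{\sum_{g} \left( \sum_{k=1}^{p_g} \beta_{g,k} X^{g}_{ik} \right)}_{\text{Main effects}} + \underbrace{\sum_{r,s} \gamma_{r,s} Z^{r,s}_{i}}_{\text{Interaction effects}} + \epsilon_i$
$\beta = (\underbrace{\beta_{1,1}, \beta_{1,2}, \dots, \beta_{1,p_1}}_{\text{gene 1}}, \dots, \underbrace{\beta_{G,1}, \dots, \beta_{G,p_G}}_{\text{gene G}})^T$
$\gamma = (\gamma_{12}, \dots, \underbrace{\gamma_{1G}}_{\gamma_{1G,1}, \dots, \gamma_{1G,q}}, \dots, \gamma_{(G-1)G})^T$
q: number of interaction variables for a gene pair.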

35 Interaction variable: Gene-Gene Eigen Epistasis (G-GEE)
We consider $f_u(X^r_i, X^s_i)$ to represent the interaction between genes r and s.
Eigen Epistasis:
$\hat{u} = \arg\max_{u, \|u\| = 1} \mathrm{cor}(y, f_u(X^r, X^s))$
$f_u(X^r, X^s) = W^{rs} u$ with $W^{rs} = \{X^r_{ij} X^s_{ik}\}$, $i = 1, \dots, n$; $j = 1, \dots, p_r$; $k = 1, \dots, p_s$
$\max_{u, \|u\| = 1} \widehat{\mathrm{cor}}[W^{rs} u, y]^2 = \max_{u, \|u\| = 1} u^T W^{rs\,T} y y^T W^{rs} u$
u is the only eigenvector of $W^{rs\,T} y y^T W^{rs}$ associated with a nonzero eigenvalue, since this matrix has rank one.
For each pair (r, s): $Z^{rs} = W^{rs} u$
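Because $W^{rs\,T} y y^T W^{rs}$ has rank one, the maximizer of the quadratic form is proportional to $W^{rs\,T} y$ and no eigendecomposition is needed. The sketch below builds $W^{rs}$ from all pairwise SNP products and returns $Z^{rs}$. Centering the columns of W and the phenotype is an assumption made here, and the function name `ggee_interaction` is hypothetical.

```python
import numpy as np

def ggee_interaction(X_r, X_s, y):
    """Return (Z_rs, u, W_rs) for two gene blocks X_r (n x p_r) and X_s (n x p_s)."""
    n = X_r.shape[0]
    # All products X^r_ij * X^s_ik, one column per SNP pair (j, k).
    W = (X_r[:, :, None] * X_s[:, None, :]).reshape(n, -1)
    W = W - W.mean(axis=0)            # centering (assumption)
    y_c = y - y.mean()
    v = W.T @ y_c                     # spans the range of W^T y y^T W
    norm = np.linalg.norm(v)
    if norm == 0:
        raise ValueError("W^T y is zero: the interaction direction is undefined")
    u = v / norm                      # unit-norm eigenvector
    return W @ u, u, W

rng = np.random.default_rng(1)
X_r = rng.integers(0, 3, size=(50, 3)).astype(float)
X_s = rng.integers(0, 3, size=(50, 2)).astype(float)
y = rng.normal(size=50)
Z, u, W = ggee_interaction(X_r, X_s, y)
print(Z.shape, u.shape)
```

Each gene pair then contributes a single interaction variable $Z^{rs}$ to the regression, instead of $p_r \times p_s$ product terms.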


37 Coefficients estimation
Group LASSO regression:
$(\hat{\beta}, \hat{\gamma}) = \arg\min_{\beta, \gamma} \sum_i (y_i - X_i \beta - Z_i \gamma)^2 + \lambda \left( \sum_g \sqrt{p_g}\, \|\beta^g\|_2 + \sum_{rs} \sqrt{p_r p_s}\, \|\gamma^{rs}\|_2 \right)$
Limits of the group LASSO regression: it is difficult to compute a p-value or a confidence interval.
Adaptive-Ridge Cleaning (Becu JM, 2015):
Use of a specific penalty for the group LASSO.
Permutation test based on a Fisher test approach for each group:
$P_k = \frac{1}{B} \#\{F_k^{*} \ge F_k\}$
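The permutation p-value $P_k$ can be illustrated with a generic sketch. It assumes an ordinary least-squares F statistic for the group and permutes the rows of the group's columns to obtain the $F_k^{*}$ values. This is not the adaptive-ridge cleaning procedure itself, whose penalty is not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def f_statistic(y, X, cols):
    """F statistic of the full OLS fit versus the fit without the columns `cols`."""
    n, p = X.shape

    def rss(design):
        A = np.column_stack([np.ones(n), design])
        resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
        return resid @ resid

    keep = [j for j in range(p) if j not in cols]
    rss_full, rss_reduced = rss(X), rss(X[:, keep])
    return ((rss_reduced - rss_full) / len(cols)) / (rss_full / (n - p - 1))

def permutation_pvalue(y, X, cols, n_perm=200, seed=0):
    """P_k = #{F*_k >= F_k} / B, permuting the rows of the group's columns."""
    rng = np.random.default_rng(seed)
    f_obs = f_statistic(y, X, cols)
    f_perm = np.empty(n_perm)
    for b in range(n_perm):
        X_perm = X.copy()
        X_perm[:, cols] = X[rng.permutation(len(y))][:, cols]
        f_perm[b] = f_statistic(y, X_perm, cols)
    return float(np.mean(f_perm >= f_obs))

# Toy data (assumption): only the first group of columns carries signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 2.0 * X[:, 0] + rng.normal(size=100)
p_signal = permutation_pvalue(y, X, [0, 1], n_perm=100)
p_null = permutation_pvalue(y, X, [2, 3], n_perm=100)
print(p_signal, p_null)
```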

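38 Interaction variable modeling approaches comparison
  Method   Criterion                              Interaction term $Z^{rs\,T}_i \gamma^{rs}$
  G-GEE    $\mathrm{cor}(y, f_u(X^r, X^s))$       $W^{rs} u \gamma^{rs}$
  PCA      $\mathrm{var}(G^r v)$ and $\mathrm{var}(G^s v)$   $\sum_{j=1}^{q} \sum_{k=1}^{q} \gamma^{rs}_{jk} C^r_j C^s_k$
  CCA      $\mathrm{cor}(G^r a, G^s b)$           $\sum_{j=1}^{q} \gamma^{rs}_j A^r_j B^s_j$
  PLS      $\mathrm{cov}(y G^r c, G^s w)$         $\sum_{j=1}^{q} \gamma^{rs}_j T^{rs}_j$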

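39 Simulation design
Genotype:
$X_i \sim \mathcal{N}_p(0, \Sigma)$ with $\Sigma$ a block-diagonal correlation matrix ($\rho = 0.8$ for two SNPs in the same gene)
$\mathrm{MAF}_j \sim \mathcal{U}[0.05, 0.5]$, with $\mathrm{MAF}_j = 0.2$ if j is a causal SNP
Continuous phenotype simulated under two different schemes (C: set of causal SNPs):
Model from Wang X et al., 2014:
$Y_i = \beta_0 + \sum_g \beta_g \left( \sum_{k \in C} X^g_{ik} \right) + \sum_{rs} \gamma_{rs} \left( \sum_{(j,k) \in C^2} X^r_{ij} X^s_{ik} \right) + \epsilon_i \quad (1)$
PCA model:
$Y_i = \beta_0 + \sum_g \beta_g \left( \sum_{k \in C} X^g_{ik} \right) + \sum_{rs} \gamma_{rs} C^r_{i1} C^s_{i1} + \epsilon_i \quad (2)$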
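The genotype part of this design can be sketched as follows. The slides do not say how the Gaussian vector is turned into 0/1/2 codes, so the sketch assumes thresholds chosen from each MAF under Hardy-Weinberg proportions. It also draws every MAF from the uniform distribution and omits the special value 0.2 for causal SNPs. The function name `simulate_genotypes` is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def simulate_genotypes(n, n_genes, snps_per_gene, rho=0.8, seed=0):
    """Draw correlated latent Gaussians per gene and cut them into 0/1/2 codes."""
    rng = np.random.default_rng(seed)
    p = n_genes * snps_per_gene
    block = np.full((snps_per_gene, snps_per_gene), rho)
    np.fill_diagonal(block, 1.0)
    sigma = np.kron(np.eye(n_genes), block)        # block-diagonal correlation
    latent = rng.multivariate_normal(np.zeros(p), sigma, size=n)
    maf = rng.uniform(0.05, 0.5, size=p)
    # Assumed thresholds: P(X = 0) = (1 - maf)^2 and P(X <= 1) = 1 - maf^2.
    nd = NormalDist()
    t1 = np.array([nd.inv_cdf((1 - m) ** 2) for m in maf])
    t2 = np.array([nd.inv_cdf(1 - m ** 2) for m in maf])
    geno = (latent > t1).astype(int) + (latent > t2).astype(int)
    return geno, maf

# 600 subjects and 6 SNPs per gene, as in the first scenario on 6 genes.
geno, maf = simulate_genotypes(n=600, n_genes=6, snps_per_gene=6)
print(geno.shape, geno.min(), geno.max())
```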

40 Simulation design
Scenarios: we consider 600 subjects and 6 SNPs per gene.
First scenario on 6 genes, two settings:
same genes for main and interaction effects,
different genes for main and interaction effects.
Second scenario on 25 genes, one setting:
different genes for main and interaction effects.

41 Simulation results - First scenario on 6 genes
Figure: gene pair power versus $r^2$ for the G-GEE, PLS, CCA and PCA methods, under the Wang X et al. model and under the PCA model. Top panels: main effects on gene 1 and gene 2, interaction effect gene 1 x gene 2. Bottom panels: main effects on gene 1 and gene 2, interaction effect gene 3 x gene 4.

42 Simulation results - First scenario on 6 genes
Figure: results per variable for the G-GEE, CCA, PCA and PLS methods, under the Wang X et al. model and under the PCA model. The variables are the six main effects (Genes1 to Genes6) and the fifteen pairwise interactions (X.Genes1.Genes2 to X.Genes5.Genes6). Top panels: main effects on gene 1 and gene 2, interaction effect gene 1 x gene 2. Bottom panels: main effects on gene 1 and gene 2, interaction effect gene 3 x gene 4. Panel $r^2$ values as transcribed: 0.7, 0.7 and 0.6.

43 Simulation results - Second scenario on 25 genes
Figure: Wang X et al. model; results for the G-GEE, CCA, PCA and PLS methods on the main effects (Gene1 to Gene10) and on the interactions Gene1.Gene2, Gene3.Gene4, Gene5.Gene6, Gene7.Gene8 and Gene9.Gene10. Main effects: gene 1 and gene 2. Interaction effects: gene 3 x gene 4, gene 5 x gene 6, gene 7 x gene 8, gene 9 x gene 10.

44 Summary, part 3: Application (Ankylosing Spondylitis; First results)

45 Ankylosing Spondylitis
Chronic inflammatory disease of the axial skeleton.
Epidemiology:
Age at first symptoms: [range missing] years
Sex: predominance in men (sex ratio 2M:1W)
Prevalence: depends on the population (0.1% - 1.4%)
Exact etiology unknown: environmental factors? Genetic factors? Importance of the HLA complex.
HLA complex:
Located on chromosome 6
Groups about 200 genes
Codes for the immune system
Antigen HLA-B27: associated with ankylosing spondylitis (SPA)
Effect from other genes in the HLA group? Outside the HLA group?

46 Known genes
Tsui et al., 2014: "The genetic basis of ankylosing spondylitis: new insights into disease pathogenesis", The Application of Clinical Genetics. Susceptibility genes identified by GWAS.

47 Results
  Method   Significant results
  G-GEE    HLA-B x SULT1A1; IL23R x ERAP2
  PLS      HLA-B; EOMES x BACH2
  PCA      HLA-B
  CCA      -

48 Conclusions and perspectives
The G-GEE method:
Takes into account the gene structure of the data
Can be applied to a large number of genes
Uses a specific interaction modeling approach
Ankylosing Spondylitis:
Identification of potential interactions, to be discussed with physicians
HLA-B effect
Perspectives:
Explore new definitions of $f_u(X^r_i, X^s_i)$
Additional simulations on larger data sets
New applications to other data sets

49 Thank you for your attention!

50 Adaptive-Ridge Cleaning λ specific penalty for group LASSO : k(j) ˆθ m k(j) m 2


More information

Heritability estimation in modern genetics and connections to some new results for quadratic forms in statistics

Heritability estimation in modern genetics and connections to some new results for quadratic forms in statistics Heritability estimation in modern genetics and connections to some new results for quadratic forms in statistics Lee H. Dicker Rutgers University and Amazon, NYC Based on joint work with Ruijun Ma (Rutgers),

More information

A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction

A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction Sangseob Leem, Hye-Young Jung, Sungyoung Lee and Taesung Park Bioinformatics and Biostatistics lab

More information

Power and sample size calculations for designing rare variant sequencing association studies.

Power and sample size calculations for designing rare variant sequencing association studies. Power and sample size calculations for designing rare variant sequencing association studies. Seunggeun Lee 1, Michael C. Wu 2, Tianxi Cai 1, Yun Li 2,3, Michael Boehnke 4 and Xihong Lin 1 1 Department

More information

Causal inference in biomedical sciences: causal models involving genotypes. Mendelian randomization genes as Instrumental Variables

Causal inference in biomedical sciences: causal models involving genotypes. Mendelian randomization genes as Instrumental Variables Causal inference in biomedical sciences: causal models involving genotypes Causal models for observational data Instrumental variables estimation and Mendelian randomization Krista Fischer Estonian Genome

More information

Department of Forensic Psychiatry, School of Medicine & Forensics, Xi'an Jiaotong University, Xi'an, China;

Department of Forensic Psychiatry, School of Medicine & Forensics, Xi'an Jiaotong University, Xi'an, China; Title: Evaluation of genetic susceptibility of common variants in CACNA1D with schizophrenia in Han Chinese Author names and affiliations: Fanglin Guan a,e, Lu Li b, Chuchu Qiao b, Gang Chen b, Tinglin

More information

Computational Approaches to Statistical Genetics

Computational Approaches to Statistical Genetics Computational Approaches to Statistical Genetics GWAS I: Concepts and Probability Theory Christoph Lippert Dr. Oliver Stegle Prof. Dr. Karsten Borgwardt Max-Planck-Institutes Tübingen, Germany Tübingen

More information

Lecture 9. Short-Term Selection Response: Breeder s equation. Bruce Walsh lecture notes Synbreed course version 3 July 2013

Lecture 9. Short-Term Selection Response: Breeder s equation. Bruce Walsh lecture notes Synbreed course version 3 July 2013 Lecture 9 Short-Term Selection Response: Breeder s equation Bruce Walsh lecture notes Synbreed course version 3 July 2013 1 Response to Selection Selection can change the distribution of phenotypes, and

More information

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 18: Introduction to covariates, the QQ plot, and population structure II + minimal GWAS steps Jason Mezey jgm45@cornell.edu April

More information

Lecture 3. Introduction on Quantitative Genetics: I. Fisher s Variance Decomposition

Lecture 3. Introduction on Quantitative Genetics: I. Fisher s Variance Decomposition Lecture 3 Introduction on Quantitative Genetics: I Fisher s Variance Decomposition Bruce Walsh. Aug 004. Royal Veterinary and Agricultural University, Denmark Contribution of a Locus to the Phenotypic

More information

Supplementary Figures

Supplementary Figures Supplementary Figures Supplementary Figure 1 Principal components analysis (PCA) of all samples analyzed in the discovery phase. Colors represent the phenotype of study populations. a) The first sample

More information

Learning interactions via hierarchical group-lasso regularization

Learning interactions via hierarchical group-lasso regularization Learning interactions via hierarchical group-lasso regularization Michael Lim Trevor Hastie August 9, 013 Abstract We introduce a method for learning pairwise interactions in a linear model in a manner

More information

Affected Sibling Pairs. Biostatistics 666

Affected Sibling Pairs. Biostatistics 666 Affected Sibling airs Biostatistics 666 Today Discussion of linkage analysis using affected sibling pairs Our exploration will include several components we have seen before: A simple disease model IBD

More information

Multiple QTL mapping

Multiple QTL mapping Multiple QTL mapping Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] 1 Why? Reduce residual variation = increased power

More information

1. Understand the methods for analyzing population structure in genomes

1. Understand the methods for analyzing population structure in genomes MSCBIO 2070/02-710: Computational Genomics, Spring 2016 HW3: Population Genetics Due: 24:00 EST, April 4, 2016 by autolab Your goals in this assignment are to 1. Understand the methods for analyzing population

More information

How to analyze many contingency tables simultaneously?

How to analyze many contingency tables simultaneously? How to analyze many contingency tables simultaneously? Thorsten Dickhaus Humboldt-Universität zu Berlin Beuth Hochschule für Technik Berlin, 31.10.2012 Outline Motivation: Genetic association studies Statistical

More information

Linear Regression. Volker Tresp 2018

Linear Regression. Volker Tresp 2018 Linear Regression Volker Tresp 2018 1 Learning Machine: The Linear Model / ADALINE As with the Perceptron we start with an activation functions that is a linearly weighted sum of the inputs h = M j=0 w

More information

Heterogeneous Multitask Learning with Joint Sparsity Constraints

Heterogeneous Multitask Learning with Joint Sparsity Constraints Heterogeneous ultitas Learning with Joint Sparsity Constraints Xiaolin Yang Department of Statistics Carnegie ellon University Pittsburgh, PA 23 xyang@stat.cmu.edu Seyoung Kim achine Learning Department

More information

Learning Your Identity and Disease from Research Papers: Information Leaks in Genome-Wide Association Study

Learning Your Identity and Disease from Research Papers: Information Leaks in Genome-Wide Association Study Learning Your Identity and Disease from Research Papers: Information Leaks in Genome-Wide Association Study Rui Wang, Yong Li, XiaoFeng Wang, Haixu Tang and Xiaoyong Zhou Indiana University at Bloomington

More information

arxiv: v1 [stat.ml] 13 Nov 2008

arxiv: v1 [stat.ml] 13 Nov 2008 Submitted to The American Journal of Human Genetics A Multivariate Regression Approach to Association Analysis of Quantitative Trait Network arxiv:8.226v [stat.ml] 3 Nov 28 Seyoung Kim Kyung-Ah Sohn Eric

More information

contents: BreedeR: a R-package implementing statistical models specifically suited for forest genetic resources analysts

contents: BreedeR: a R-package implementing statistical models specifically suited for forest genetic resources analysts contents: definitions components of phenotypic correlations causal components of genetic correlations pleiotropy versus LD scenarios of correlation computing genetic correlations why genetic correlations

More information

Lecture 2. Fisher s Variance Decomposition

Lecture 2. Fisher s Variance Decomposition Lecture Fisher s Variance Decomposition Bruce Walsh. June 008. Summer Institute on Statistical Genetics, Seattle Covariances and Regressions Quantitative genetics requires measures of variation and association.

More information

The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies

The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies Ian Barnett, Rajarshi Mukherjee & Xihong Lin Harvard University ibarnett@hsph.harvard.edu June 24, 2014 Ian Barnett

More information

Statistical Methods in Mapping Complex Diseases

Statistical Methods in Mapping Complex Diseases University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations Summer 8-12-2011 Statistical Methods in Mapping Complex Diseases Jing He University of Pennsylvania, jinghe@mail.med.upenn.edu

More information

A Least Squares Formulation for Canonical Correlation Analysis

A Least Squares Formulation for Canonical Correlation Analysis A Least Squares Formulation for Canonical Correlation Analysis Liang Sun, Shuiwang Ji, and Jieping Ye Department of Computer Science and Engineering Arizona State University Motivation Canonical Correlation

More information

Expression Data Exploration: Association, Patterns, Factors & Regression Modelling

Expression Data Exploration: Association, Patterns, Factors & Regression Modelling Expression Data Exploration: Association, Patterns, Factors & Regression Modelling Exploring gene expression data Scale factors, median chip correlation on gene subsets for crude data quality investigation

More information

Model-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate

Model-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate Model-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate Lucas Janson, Stanford Department of Statistics WADAPT Workshop, NIPS, December 2016 Collaborators: Emmanuel

More information

The universal validity of the possible triangle constraint for Affected-Sib-Pairs

The universal validity of the possible triangle constraint for Affected-Sib-Pairs The Canadian Journal of Statistics Vol. 31, No.?, 2003, Pages???-??? La revue canadienne de statistique The universal validity of the possible triangle constraint for Affected-Sib-Pairs Zeny Z. Feng, Jiahua

More information

Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium. November 12, 2012

Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium. November 12, 2012 Lecture 22: Signatures of Selection and Introduction to Linkage Disequilibrium November 12, 2012 Last Time Sequence data and quantification of variation Infinite sites model Nucleotide diversity (π) Sequence-based

More information

Friday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo

Friday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo Friday Harbor 2017 From Genetics to GWAS (Genome-wide Association Study) Sept 7 2017 David Fardo Purpose: prepare for tomorrow s tutorial Genetic Variants Quality Control Imputation Association Visualization

More information

Latent Variable Methods for the Analysis of Genomic Data

Latent Variable Methods for the Analysis of Genomic Data John D. Storey Center for Statistics and Machine Learning & Lewis-Sigler Institute for Integrative Genomics Latent Variable Methods for the Analysis of Genomic Data http://genomine.org/talks/ Data m variables

More information

Genetics Studies of Comorbidity

Genetics Studies of Comorbidity Genetics Studies of Comorbidity Heping Zhang Department of Epidemiology and Public Health Yale University School of Medicine Presented at Science at the Edge Michigan State University January 27, 2012

More information

The Quantitative TDT

The Quantitative TDT The Quantitative TDT (Quantitative Transmission Disequilibrium Test) Warren J. Ewens NUS, Singapore 10 June, 2009 The initial aim of the (QUALITATIVE) TDT was to test for linkage between a marker locus

More information

Genotype Imputation. Biostatistics 666

Genotype Imputation. Biostatistics 666 Genotype Imputation Biostatistics 666 Previously Hidden Markov Models for Relative Pairs Linkage analysis using affected sibling pairs Estimation of pairwise relationships Identity-by-Descent Relatives

More information

Penalized Methods for Multiple Outcome Data in Genome-Wide Association Studies

Penalized Methods for Multiple Outcome Data in Genome-Wide Association Studies Penalized Methods for Multiple Outcome Data in Genome-Wide Association Studies Jin Liu 1, Shuangge Ma 1, and Jian Huang 2 1 Division of Biostatistics, School of Public Health, Yale University 2 Department

More information

GENETICS - CLUTCH CH.1 INTRODUCTION TO GENETICS.

GENETICS - CLUTCH CH.1 INTRODUCTION TO GENETICS. !! www.clutchprep.com CONCEPT: HISTORY OF GENETICS The earliest use of genetics was through of plants and animals (8000-1000 B.C.) Selective breeding (artificial selection) is the process of breeding organisms

More information

The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies

The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies The Generalized Higher Criticism for Testing SNP-sets in Genetic Association Studies Ian Barnett, Rajarshi Mukherjee & Xihong Lin Harvard University ibarnett@hsph.harvard.edu August 5, 2014 Ian Barnett

More information

Genetic Studies of Multivariate Traits

Genetic Studies of Multivariate Traits Genetic Studies of Multivariate Traits Heping Zhang Collaborative Center for Statistics in Science Yale University School of Public Health Presented at DIMACS, University of Rutgers May 17, 2013 Heping

More information

QTL model selection: key players

QTL model selection: key players Bayesian Interval Mapping. Bayesian strategy -9. Markov chain sampling 0-7. sampling genetic architectures 8-5 4. criteria for model selection 6-44 QTL : Bayes Seattle SISG: Yandell 008 QTL model selection:

More information

Linkage and Linkage Disequilibrium

Linkage and Linkage Disequilibrium Linkage and Linkage Disequilibrium Summer Institute in Statistical Genetics 2014 Module 10 Topic 3 Linkage in a simple genetic cross Linkage In the early 1900 s Bateson and Punnet conducted genetic studies

More information

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q)

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q) Supplementary information S7 Testing for association at imputed SPs puted SPs Score tests A Score Test needs calculations of the observed data score and information matrix only under the null hypothesis,

More information

Lesson 4: Understanding Genetics

Lesson 4: Understanding Genetics Lesson 4: Understanding Genetics 1 Terms Alleles Chromosome Co dominance Crossover Deoxyribonucleic acid DNA Dominant Genetic code Genome Genotype Heredity Heritability Heritability estimate Heterozygous

More information

Inferring Causal Phenotype Networks from Segregating Populat

Inferring Causal Phenotype Networks from Segregating Populat Inferring Causal Phenotype Networks from Segregating Populations Elias Chaibub Neto chaibub@stat.wisc.edu Statistics Department, University of Wisconsin - Madison July 15, 2008 Overview Introduction Description

More information

Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test

Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test The Harvard community has made this article openly available Please share how this access

More information

BAYESIAN STATISTICAL METHODS IN GENE-ENVIRONMENT AND GENE-GENE INTERACTION STUDIES

BAYESIAN STATISTICAL METHODS IN GENE-ENVIRONMENT AND GENE-GENE INTERACTION STUDIES Texas Medical Center Library DigitalCommons@TMC UT GSBS Dissertations and Theses (Open Access) Graduate School of Biomedical Sciences 8-2013 BAYESIAN STATISTICAL METHODS IN GENE-ENVIRONMENT AND GENE-GENE

More information

A Short Introduction to the Lasso Methodology

A Short Introduction to the Lasso Methodology A Short Introduction to the Lasso Methodology Michael Gutmann sites.google.com/site/michaelgutmann University of Helsinki Aalto University Helsinki Institute for Information Technology March 9, 2016 Michael

More information

A multivariate regression approach to association analysis of a quantitative trait network

A multivariate regression approach to association analysis of a quantitative trait network BIOINFORMATICS Vol. 25 ISMB 2009, pages i204 i212 doi:10.1093/bioinformatics/btp218 A multivariate regression approach to association analysis of a quantitative trait networ Seyoung Kim, Kyung-Ah Sohn

More information

Evolution of phenotypic traits

Evolution of phenotypic traits Quantitative genetics Evolution of phenotypic traits Very few phenotypic traits are controlled by one locus, as in our previous discussion of genetics and evolution Quantitative genetics considers characters

More information