SNP-SNP Interactions in Case-Parent Trios
|
|
- Polly Webster
- 5 years ago
- Views:
Transcription
1 Detection of SNP-SNP Interactions in Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health June 2, 2009 Karyotypes
2 Single Nucleotide Polymphisms urgi.versailles.inra.fr Haplotype Blocks Kim Dionne (2007)
3 SNP-SNP Interactions Many statistical approaches have been used in the literature f the detection of SNP-SNP interactions: Regression with Higher Order Interactions Logic (Boolean) Regression Monte Carlo Logic Regression Multifact Dimensionality Reduction (MDR) Classification Regression Trees (CART) Rom Fests Boosting Multivariate Adaptive Regression Splines (MARS) Neural Netwks References: Heidema AG et al (2006). The Challenge f Genetic Epidemiologists: How to Analyze... BMC Genetics 7: 23. McKinney BA et al (2006). Machine Learning f Detecting Gene-Gene Interactions... Appl Bioinf 5(2): Musani SK (2007). Detection of Gene x Gene Interactions in... Hum Hered 63(2): Biological Statistical Interactions BB Bb bb AA Aa aa (SNPA D SNPB R ) (SNPB D SNPA R )
4 Biological Statistical Interactions BB Bb bb AA Aa aa (SNPA D SNPB R ) (SNPB D SNPA R ) Statistical interaction: Deviation from additivity in a linear statistical model. Epistasis: Masking of phenotype expressed by one gene by the effects of another gene. Reference: Moe JH (2005). A Global View of Epistasis. Nature Genetics, 37: Introduction Current methods f analyzing complex traits include analyzing localizing disease loci one at a time. However, complex traits can be caused by the interaction of many loci, each with varying effect....patterns of interactions between several loci, f example, disease phenotype caused by locus A locus B, A but not B, A (B C), clearly make identification of the involved loci me difficult. While the simultaneous analysis of every single two-way pair of markers can be feasible, it becomes overwhelmingly computationally burdensome to analyze all 3-way, 4-way to N-way patterns, patterns, combinations of loci. Reference: Lucek PR, Ott J (1997). Neural Netwk Analysis of Complex Traits. Genet Epi 14(6):
5 Introduction Assume we have a cidate gene study with 1000 cases of a certain disease, 1000 controls. We typed 100 SNPs of interest, we were lucky - the two causal SNPs were typed. Also assume that a double penetrance model describes the relationship between SNPs phenotype: subjects with two variant alleles of either SNP 24 SNP 80 at least one variant allele of the other have higher odds of disease: logit(p) =α + β Ind{(SNP24 D SNP80 R ) (SNP24 R SNP80 D )} How could we detect this signal? Introduction !2!4!6 SNP1 SNP2 SNP3 SNP4 SNP5 SNP6 SNP7 SNP8 SNP9 SNP10 SNP11 SNP12 SNP13 SNP14 SNP15 SNP16 SNP17 SNP18 SNP19 SNP20 SNP21 SNP22 SNP23 SNP24 SNP25 SNP26 SNP27 SNP28 SNP29 SNP30 SNP31 SNP32 SNP33 SNP34 SNP35 SNP36 SNP37 SNP38 SNP39 SNP40 SNP41 SNP42 SNP43 SNP44 SNP45 SNP46 SNP47 SNP48 SNP49 SNP50 SNP51 SNP52 SNP53 SNP54 SNP55 SNP56 SNP57 SNP58 SNP59 SNP60 SNP61 SNP62 SNP63 SNP64 SNP65 SNP66 SNP67 SNP68 SNP69 SNP70 SNP71 SNP72 SNP73 SNP74 SNP75 SNP76 SNP77 SNP78 SNP79 SNP80 SNP81 SNP82 SNP83 SNP84 SNP85 SNP86 SNP87 SNP88 SNP89 SNP90 SNP91 SNP92 SNP93 SNP94 SNP95 SNP96 SNP97 SNP98 SNP99 SNP100
6 Introduction !2!4!6 SNP1 SNP2 SNP3 SNP4 SNP5 SNP6 SNP7 SNP8 SNP9 SNP10 SNP11 SNP12 SNP13 SNP14 SNP15 SNP16 SNP17 SNP18 SNP19 SNP20 SNP21 SNP22 SNP23 SNP24 SNP25 SNP26 SNP27 SNP28 SNP29 SNP30 SNP31 SNP32 SNP33 SNP34 SNP35 SNP36 SNP37 SNP38 SNP39 SNP40 SNP41 SNP42 SNP43 SNP44 SNP45 SNP46 SNP47 SNP48 SNP49 SNP50 SNP51 SNP52 SNP53 SNP54 SNP55 SNP56 SNP57 SNP58 SNP59 SNP60 SNP61 SNP62 SNP63 SNP64 SNP65 SNP66 SNP67 SNP68 SNP69 SNP70 SNP71 SNP72 SNP73 SNP74 SNP75 SNP76 SNP77 SNP78 SNP79 SNP80 SNP81 SNP82 SNP83 SNP84 SNP85 SNP86 SNP87 SNP88 SNP89 SNP90 SNP91 SNP92 SNP93 SNP94 SNP95 SNP96 SNP97 SNP98 SNP99 SNP100 Logic Regression The predicts are the SNPs in dominant recessive coding. SNP X X.R X.D AA 0 0 AT 0 1 TT 1 1 Let X 1,...,X k be binary (0/1) predicts, Y a response variable. Fit a model g(e(y)) = b 0 + t b j L j, j=1 where L j is a Boolean combination of the covariates, e.g. L j = (X 1 X 2 ) X c 4. Determine the logic terms L j estimate the b j simultaneously. Reference: Ruczinski et al (2003). Logic Regression. J of Comp Graph Stat, 12(3):
7 Logic Regression Alternate Leaf Alternate Operat Grow Branch Possible Moves Initial Tree Prune Branch Split Leaf Delete Leaf Logic Regression We try to fit the model g(e(y)) = b 0 + t j=1 b j L j. Select a scing function (RSS, log-likelihood,...). Pick the maximum number of Logic Trees. Pick the maximum number of leaves in a tree. Initialize the model with L j = 0 f all j. Carry out the Simulated Annealing Algithm: Propose a move. Accept reject the move depending on the sces the temperature.
8 Logic Regression Logic Regression SNP80.R
9 Logic Regression SNP80.R SNP24.R SNP80.R Logic Regression SNP80.R SNP24.R SNP80.R SNP80.R SNP24.R SNP80.D
10 Logic Regression SNP80.R SNP24.R SNP80.R SNP80.R SNP24.R SNP80.D SNP80.R SNP24.R SNP80.D SNP73.R Logic Regression SNP80.R SNP24.R SNP80.R SNP80.R SNP24.R SNP80.D SNP80.R SNP24.R SNP86.R SNP30.R SNP80.D SNP73.R SNP80.R SNP24.R SNP80.D
11 Logic Regression We implemented two flavs f the required model selection. Both approaches require a definition of model size. 1 Cross-validation This is most applicable when prediction is the main objective, i. e. not necessarily the best option in SNP association studies. 2 Permutation tests This is a test f association, i. e. the preferred test in SNP association studies. The model size is chosen via a sequence of hypothesis tests. Schizophrenia Diagnosis classification of Schizophrenia (SZ) depends on the observation of disease related symptoms: delusion, hallucination, bizarre behavi, disganized speech,... About 1% prevalence wldwide, equal in males females. Schizoaffective disder (SZA) is closely related to SZ. Over 2000 papers on linkage association studies. About 150 genes implicated, most on chromosomes 1, 6, 22. Most findings are somewhat inconclusive, though. Most studies are single marker analysis only, very few address SNP-SNP interactions.
12 Schizophrenia Study Case-parent trios of Ashkenazi Jewish descents. Diagnosis of SZ SZA based on DSM IV. Dense coverage of 64 cidate genes. 375 SNPs on 11 chromosomes genotyped f 312 trios. Original analysis through single marker TDT. Goal: Exple SNP-SNP interactions f association with SZ SZA. Reference: Fallin et al (2005). Bipolar I disder schizophrenia... Am J Hum Genet 77(6): TDT - Allelic The transmission disequilibrium test measures the over-transmission of an allele from parents to affected offsprings. F a set of n parents with alleles 1 2 at a genetic locus, each parent can be summarized by the transmitted the non-transmitted allele: Non-TA P a b a + b TA 2 c d c + d P a + c b + d 2n Under the null of no association, Only the heterozygous parents contribute infmation! (b c)2 b + c χ 2 1
13 TDT - Genotypic Assume that at a certain locus the father has alleles 11 the mother has alleles 12. The four Mendelian children thus have alleles 11, 12, 11, 12. Assume the affected prob has genotype 11. The three Pseudo controls then have the genotypes 11, 12, 12. Y X Affected prob 1 11 Pseudo control # Pseudo control # Pseudo control # We can use coonditional logistic regression to analyze the data! Outline 1 We extended the logic regression methodology to analyze trios with affected probs. 2 We developed an R package to a) impute missing genotypes, using a haplotype-based approach that employs mating tables, to b) simulate case-parent trios with disease risk dependent on SNP-SNP interactions. 3 We carried out simulation studies to verify the validity of the methodology the software. 4 We analyzed the data from the study of Schizophrenia among Ashkenazi Jewish families.
14 Trio Logic Regression The rough idea is as follows: Pseudo controls are generated from the trio data, taking the LD block structure into account. Missing data are hled using haplotype-based imputation. The conditional logistic likelihood is used in logic regression to assess differences in cases pseudo controls. Trio Logic Regression
15 Trio Logic Regression The steps in me detail: 1 Estimate the haplotype blocks the haplotype frequencies using the parents genotypes. 2 F each block each trio, sample haplotype pairs f the parents the offspring consistent with the observed genotypes in the trio, allowing f missing data. 3 Generate the probs genotype data from the haplotypes that were passed from the parents. 4 F each block each trio, generate genotypes f three pseudo-controls (PC1, PC2, PC3) using the parents haplotypes that were not passed to the prob. The assignment to PC1, PC2, PC3 is rom. 5 Assemble three pseudo-controls f each trio by augmenting the genotypes from the blocks. 6 F each locus, translate the genotype data into two binary variables in dominant recessive coding. Trio Logic Regression The conditional logistic likelihood is L(β) = ( ) N exp(x i0 β) i=1 exp(x i0 β)+σ 3 m=1 exp(x imβ) In this setting, i refers to a trio, X i0 refers to the exposure of the prob in trio i, (X i1, X i2, X i3 ) are the exposures of the 3 pseudo-controls in trio i. Does the likelihood always have a maximum?
16 Trio Logic Regression Number of trios where k pseudo-controls match the prob k X i0 = 0 a 0 b 0 c 0 d 0 X i0 = 1 a 1 b 1 c 1 d 1 Likelihood contributions f X i0 = 0 a 0 b 0 c 0 d 0 X i X i X i X i L i (β) exp(β) exp(β) 1 3+exp(β) Likelihood contributions f X i0 = 1 a 1 b 1 c 1 d 1 X i X i X i X i L i (β) exp(β) exp(β)+3 exp(β) 2 exp(β)+2 exp(β) 3 exp(β) Trio Logic Regression The log-likelihood is: log(l(β)) = a 1 {β log(exp(β)+3)} + b 1 {β log(2 exp(β)+2)} + c 1 {β log(3 exp(β)+1)} + d 1 4 a 0 log{1 + 3 exp(β)} b 0 log{2 + 2 exp(β)} c 0 log{3 + exp(β)} d 0 4
17 Trio Logic Regression The first derivative of the log-likelihood is: log(l(β)) β = a 1 (a 1 + c 0 ) exp(β) exp(β)+3 + b 1 (b 1 + b 0 ) c 1 (c 1 + a 0 ) 2 exp(β) 2 exp(β) exp(β) 3 exp(β)+1 The log-likelihood is monotonically decreasing in β. Further: lim β log(l(β)) β = a 1 + b 1 + c 1 0 lim β + log(l(β)) β = (a 0 + b 0 + c 0 ) 0 Trio Logic Regression Since the derivative of the log likelihood function is a continuous function in β, it follows that the likelihood function has a maximum unless a 1 + b 1 + c 1 = 0 a 0 + b 0 + c 0 = 0. In other wds, f either the exposure the non-exposure groups, all pseudo-controls are equal to the respective probs. This might occur in a particular sample where the number of trios is small the ( some of the) SNPs contributing to the exposure definition have extremely low min allele frequency, however, it is not plausible that such a separation would exist in the population in truth.
18 Missing Data Simulation G 1 SNP R SNP R 4 2 SNPR (SNP R 1 1 SNPD 6 2 ) SNPR (SNP R 2 3 SNPD 5 2 ) SNPR (SNP D 4 1 SNPR 8 2 ) (SNPR 5 1 SNPD 6 1 ) 6 ((SNP R 4 2 SNPR 7 2 ) SNPR 8 3 ) (SNPR 9 2 SNPR 6 1 ) 7 (SNP R 3 11 SNPR 12 2 ) (SNPR 5 4 SNPR 15 1 ) (SNPR 9 2 SNPR 8 3 ) P(G) Haplotypes / block # of haplotypes # mating table rows , , 3, , 5, , 5, 5, , 8, 5, 3, 3 2, , 4, 5, 5, 3, 5 4,
19 Simulation L1/11 L2/22 L1/11 Haplotype Frequencies population parents prob Haplotype Frequencies Simulation When using conditional logistic regression to compare cases pseudo-controls, the expected value of the parameter estimates is not the logs odds ratio β, but the log relative risk (Schaid 1996). RR = P(D I G = 1) P(D I G = 0) = exp(α + β)/(1 + exp(α + β)) exp(α)/(1 + exp(α)) = exp(β) ( ) exp(α + β) 1 + exp(α) Therefe log(rr) = β log ( ) 1 + exp(α + β) 1 + exp(α)
20 Simulation log(rr) =β log! 1 + exp(α + β) 1 + exp(α) 4 3 2!^ 1 0 "!!5!3!1!5!3!1!5!3!1!5!3!1!5!3! Simulation log(rr) =β log! 1 + exp(α + β) 1 + exp(α) 4 3 2!^ 1 0 "!!5!3!1!5!3!1!5!3!1!5!3!1!5!3!
21 Software 1 The trio logic regression methods are implemented as an augmentation in the logic regression R package LogicReg. 2 The R package trio contains functions to generate logic regression input from pedigree genotype files, to check f Mendelian errs, to impute missing data, to simulate case-parent trios. 3 A software vignette is also available. Results Logic model exp( ˆβ) I n o 302 D I n o 302 D 166 D I n o 302 D 166 D 148 D I n o 302 D 166 D 148 D 368 R 3.65 SNP 302 Chromosome 12 NOS SNP 166 Chromosome 8 CHRNB SNP 148 Chromosome 8 PNOC SNP 368 Chromosome 22 COMT
22 Results Logic model exp( ˆβ) I n o 302 D I n o 302 D 166 D I n o 302 D 166 D 148 D I n o 302 D 166 D 148 D 368 R 3.65 SNP 302 Chromosome 12 NOS SNP 166 Chromosome 8 CHRNB SNP 148 Chromosome 8 PNOC SNP 368 Chromosome 22 COMT Results exp( ˆβ) ˆβ (se) z p Marginal 302 D (0.16) e D (0.22) Logic 302 D 166 D (0.18) e-06 Additive 302 D (0.16) e D (0.22) D (0.19) e-06 Additive 166 D (0.33) D : 166 D (0.34)
23 Results exp( ˆβ) ˆβ (se) z p Marginal 302 D (0.16) e D (0.22) Logic 302 D 166 D (0.18) e-06 Additive 302 D (0.16) e D (0.22) D (0.19) e-06 Additive 166 D (0.33) D : 166 D (0.34) Results exp( ˆβ) ˆβ (se) z p Marginal 302 D (0.16) e D (0.22) Logic 302 D 166 D (0.18) e-06 Additive 302 D (0.16) e D (0.22) D (0.19) e-06 Additive 166 D (0.33) D : 166 D (0.34)
24 Results 0.67 I n 302 Do I {166D } 302 D 302 D D main effects only Results 0.90 I n 302 Do I {166D } 1.09 In 302 D :166 Do 302 D 302 D D main effects + interaction
25 Results 0.89 I n 302 D 166 Do 302 D 302 D D logic regression Results 302 D 302 D 302 D 302 D 302 D 302 D D 166 D 166 D main effects only main effects + interaction logic regression
26 Results exp( ˆβ) ˆβ(se) z p 302 D (0.16) e-05 Marginal 166 D (0.22) D (0.25) Logic 302 D 166 D 148 D (0.20) e-08 Acknowledgments Methods: Dani Fallin, Qing Li, Tom Louis. Data: Ann Pulver. Computing suppt: Marvin Newhouse, Jiong Yang. Funding: NIH R01 DK061662, GM083084, HL090577, a CTSA grant to the Johns Hopkins Medical Institutions.
27 iruczins/
SNP Association Studies with Case-Parent Trios
SNP Association Studies with Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health September 3, 2009 Population-based Association Studies Balding (2006). Nature
More informationSNP Association Studies with Case-Parent Trios
SNP Association Studies with Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health July, 00 Acknowledgments Collaborators: Qing Li, Rob Scharpf, Holger Schwender,
More informationMissing Data. On Missing Data and Interactions in SNP Association Studies. Missing Data - Approaches. Missing Data - Approaches. Multiple Imputation
October, 6 @ U of British Columbia Missing Data On Missing Data Interactions in SNP Association Studies 5 SNPs Ingo Ruczinski Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
More informationSome New Methods for Family-Based Association Studies
Some New Methods for Family-Based Association Studies Ingo Ruczinski Department of Biostatistics Johns Hopkins Bloomberg School of Public Health April 8, 20 http: //biostat.jhsph.edu/ iruczins/ Topics
More informationPreparing Case-Parent Trio Data and Detecting Disease-Associated SNPs, SNP Interactions, and Gene-Environment Interactions with trio
Preparing Case-Parent Trio Data and Detecting Disease-Associated SNPs, SNP Interactions, and Gene-Environment Interactions with trio Holger Schwender, Qing Li, and Ingo Ruczinski Contents 1 Introduction
More informationThe Quantitative TDT
The Quantitative TDT (Quantitative Transmission Disequilibrium Test) Warren J. Ewens NUS, Singapore 10 June, 2009 The initial aim of the (QUALITATIVE) TDT was to test for linkage between a marker locus
More information1 Springer. Nan M. Laird Christoph Lange. The Fundamentals of Modern Statistical Genetics
1 Springer Nan M. Laird Christoph Lange The Fundamentals of Modern Statistical Genetics 1 Introduction to Statistical Genetics and Background in Molecular Genetics 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
More informationBayesian Inference of Interactions and Associations
Bayesian Inference of Interactions and Associations Jun Liu Department of Statistics Harvard University http://www.fas.harvard.edu/~junliu Based on collaborations with Yu Zhang, Jing Zhang, Yuan Yuan,
More informationLecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017
Lecture 2: Genetic Association Testing with Quantitative Traits Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 29 Introduction to Quantitative Trait Mapping
More informationLecture 1: Case-Control Association Testing. Summer Institute in Statistical Genetics 2015
Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2015 1 / 1 Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits.
More informationLinear Regression (1/1/17)
STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression
More informationFriday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo
Friday Harbor 2017 From Genetics to GWAS (Genome-wide Association Study) Sept 7 2017 David Fardo Purpose: prepare for tomorrow s tutorial Genetic Variants Quality Control Imputation Association Visualization
More informationAssociation Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5
Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative
More informationSolutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin
Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin CHAPTER 1 1.2 The expected homozygosity, given allele
More information1.5.1 ESTIMATION OF HAPLOTYPE FREQUENCIES:
.5. ESTIMATION OF HAPLOTYPE FREQUENCIES: Chapter - 8 For SNPs, alleles A j,b j at locus j there are 4 haplotypes: A A, A B, B A and B B frequencies q,q,q 3,q 4. Assume HWE at haplotype level. Only the
More informationMODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES
MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES Saurabh Ghosh Human Genetics Unit Indian Statistical Institute, Kolkata Most common diseases are caused by
More informationCase-Control Association Testing. Case-Control Association Testing
Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits. Technological advances have made it feasible to perform case-control association studies
More informationGenetics (patterns of inheritance)
MENDELIAN GENETICS branch of biology that studies how genetic characteristics are inherited MENDELIAN GENETICS Gregory Mendel, an Augustinian monk (1822-1884), was the first who systematically studied
More informationThe E-M Algorithm in Genetics. Biostatistics 666 Lecture 8
The E-M Algorithm in Genetics Biostatistics 666 Lecture 8 Maximum Likelihood Estimation of Allele Frequencies Find parameter estimates which make observed data most likely General approach, as long as
More information2. Map genetic distance between markers
Chapter 5. Linkage Analysis Linkage is an important tool for the mapping of genetic loci and a method for mapping disease loci. With the availability of numerous DNA markers throughout the human genome,
More informationProportional Variance Explained by QLT and Statistical Power. Proportional Variance Explained by QTL and Statistical Power
Proportional Variance Explained by QTL and Statistical Power Partitioning the Genetic Variance We previously focused on obtaining variance components of a quantitative trait to determine the proportion
More informationLecture 7: Interaction Analysis. Summer Institute in Statistical Genetics 2017
Lecture 7: Interaction Analysis Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 39 Lecture Outline Beyond main SNP effects Introduction to Concept of Statistical Interaction
More informationFor 5% confidence χ 2 with 1 degree of freedom should exceed 3.841, so there is clear evidence for disequilibrium between S and M.
STAT 550 Howework 6 Anton Amirov 1. This question relates to the same study you saw in Homework-4, by Dr. Arno Motulsky and coworkers, and published in Thompson et al. (1988; Am.J.Hum.Genet, 42, 113-124).
More informationOn the limiting distribution of the likelihood ratio test in nucleotide mapping of complex disease
On the limiting distribution of the likelihood ratio test in nucleotide mapping of complex disease Yuehua Cui 1 and Dong-Yun Kim 2 1 Department of Statistics and Probability, Michigan State University,
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationPopulation Genetics. with implications for Linkage Disequilibrium. Chiara Sabatti, Human Genetics 6357a Gonda
1 Population Genetics with implications for Linkage Disequilibrium Chiara Sabatti, Human Genetics 6357a Gonda csabatti@mednet.ucla.edu 2 Hardy-Weinberg Hypotheses: infinite populations; no inbreeding;
More informationA novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction
A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction Sangseob Leem, Hye-Young Jung, Sungyoung Lee and Taesung Park Bioinformatics and Biostatistics lab
More informationComplexity and Approximation of the Minimum Recombination Haplotype Configuration Problem
Complexity and Approximation of the Minimum Recombination Haplotype Configuration Problem Lan Liu 1, Xi Chen 3, Jing Xiao 3, and Tao Jiang 1,2 1 Department of Computer Science and Engineering, University
More informationCalculation of IBD probabilities
Calculation of IBD probabilities David Evans University of Bristol This Session Identity by Descent (IBD) vs Identity by state (IBS) Why is IBD important? Calculating IBD probabilities Lander-Green Algorithm
More informationModeling IBD for Pairs of Relatives. Biostatistics 666 Lecture 17
Modeling IBD for Pairs of Relatives Biostatistics 666 Lecture 7 Previously Linkage Analysis of Relative Pairs IBS Methods Compare observed and expected sharing IBD Methods Account for frequency of shared
More informationDNA polymorphisms such as SNP and familial effects (additive genetic, common environment) to
1 1 1 1 1 1 1 1 0 SUPPLEMENTARY MATERIALS, B. BIVARIATE PEDIGREE-BASED ASSOCIATION ANALYSIS Introduction We propose here a statistical method of bivariate genetic analysis, designed to evaluate contribution
More informationp(d g A,g B )p(g B ), g B
Supplementary Note Marginal effects for two-locus models Here we derive the marginal effect size of the three models given in Figure 1 of the main text. For each model we assume the two loci (A and B)
More informationTutorial Session 2. MCMC for the analysis of genetic data on pedigrees:
MCMC for the analysis of genetic data on pedigrees: Tutorial Session 2 Elizabeth Thompson University of Washington Genetic mapping and linkage lod scores Monte Carlo likelihood and likelihood ratio estimation
More informationRégression en grande dimension et épistasie par blocs pour les études d association
Régression en grande dimension et épistasie par blocs pour les études d association V. Stanislas, C. Dalmasso, C. Ambroise Laboratoire de Mathématiques et Modélisation d Évry "Statistique et Génome" 1
More informationLecture 9. QTL Mapping 2: Outbred Populations
Lecture 9 QTL Mapping 2: Outbred Populations Bruce Walsh. Aug 2004. Royal Veterinary and Agricultural University, Denmark The major difference between QTL analysis using inbred-line crosses vs. outbred
More informationAffected Sibling Pairs. Biostatistics 666
Affected Sibling airs Biostatistics 666 Today Discussion of linkage analysis using affected sibling pairs Our exploration will include several components we have seen before: A simple disease model IBD
More informationOutline. P o purple % x white & white % x purple& F 1 all purple all purple. F purple, 224 white 781 purple, 263 white
Outline - segregation of alleles in single trait crosses - independent assortment of alleles - using probability to predict outcomes - statistical analysis of hypotheses - conditional probability in multi-generation
More informationLesson 4: Understanding Genetics
Lesson 4: Understanding Genetics 1 Terms Alleles Chromosome Co dominance Crossover Deoxyribonucleic acid DNA Dominant Genetic code Genome Genotype Heredity Heritability Heritability estimate Heterozygous
More informationLogic Regression. Ingo Ruczinski. Department of Biostatistics Johns Hopkins University.
Logic Regression Ingo Ruczinski Department of Biostatistics Jons Hopkins University Email: ingo@ju.edu ttp://biosun.biostat.jsp.edu/ iruczins Wit Carles Kooperberg Micael LeBlanc, FHCRC Introduction Motivation
More informationHomework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:
Homework Assignment, Evolutionary Systems Biology, Spring 2009. Homework Part I: Phylogenetics: Introduction. The objective of this assignment is to understand the basics of phylogenetic relationships
More informationCalculation of IBD probabilities
Calculation of IBD probabilities David Evans and Stacey Cherny University of Oxford Wellcome Trust Centre for Human Genetics This Session IBD vs IBS Why is IBD important? Calculating IBD probabilities
More informationMultiple QTL mapping
Multiple QTL mapping Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] 1 Why? Reduce residual variation = increased power
More informationLecture 11: Multiple trait models for QTL analysis
Lecture 11: Multiple trait models for QTL analysis Julius van der Werf Multiple trait mapping of QTL...99 Increased power of QTL detection...99 Testing for linked QTL vs pleiotropic QTL...100 Multiple
More informationOutline for today s lecture (Ch. 14, Part I)
Outline for today s lecture (Ch. 14, Part I) Ploidy vs. DNA content The basis of heredity ca. 1850s Mendel s Experiments and Theory Law of Segregation Law of Independent Assortment Introduction to Probability
More informationBIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS
016064 BIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS TEST CODE: 016064 Directions: Each of the questions or incomplete statements below is followed by five suggested answers or completions.
More informationGenotype Imputation. Biostatistics 666
Genotype Imputation Biostatistics 666 Previously Hidden Markov Models for Relative Pairs Linkage analysis using affected sibling pairs Estimation of pairwise relationships Identity-by-Descent Relatives
More informationUNIT 8 BIOLOGY: Meiosis and Heredity Page 148
UNIT 8 BIOLOGY: Meiosis and Heredity Page 148 CP: CHAPTER 6, Sections 1-6; CHAPTER 7, Sections 1-4; HN: CHAPTER 11, Section 1-5 Standard B-4: The student will demonstrate an understanding of the molecular
More informationLecture 9. Short-Term Selection Response: Breeder s equation. Bruce Walsh lecture notes Synbreed course version 3 July 2013
Lecture 9 Short-Term Selection Response: Breeder s equation Bruce Walsh lecture notes Synbreed course version 3 July 2013 1 Response to Selection Selection can change the distribution of phenotypes, and
More informationLinkage and Linkage Disequilibrium
Linkage and Linkage Disequilibrium Summer Institute in Statistical Genetics 2014 Module 10 Topic 3 Linkage in a simple genetic cross Linkage In the early 1900 s Bateson and Punnet conducted genetic studies
More informationLecture WS Evolutionary Genetics Part I 1
Quantitative genetics Quantitative genetics is the study of the inheritance of quantitative/continuous phenotypic traits, like human height and body size, grain colour in winter wheat or beak depth in
More informationStatistical issues in QTL mapping in mice
Statistical issues in QTL mapping in mice Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman Outline Overview of QTL mapping The X chromosome Mapping
More informationGenetic Association Studies in the Presence of Population Structure and Admixture
Genetic Association Studies in the Presence of Population Structure and Admixture Purushottam W. Laud and Nicholas M. Pajewski Division of Biostatistics Department of Population Health Medical College
More informationMapping multiple QTL in experimental crosses
Mapping multiple QTL in experimental crosses Karl W Broman Department of Biostatistics and Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman [ Teaching Miscellaneous lectures]
More informationThe universal validity of the possible triangle constraint for Affected-Sib-Pairs
The Canadian Journal of Statistics Vol. 31, No.?, 2003, Pages???-??? La revue canadienne de statistique The universal validity of the possible triangle constraint for Affected-Sib-Pairs Zeny Z. Feng, Jiahua
More informationIntroduction to QTL mapping in model organisms
Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics and Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman [ Teaching Miscellaneous lectures]
More informationGenetics Studies of Multivariate Traits
Genetics Studies of Multivariate Traits Heping Zhang Department of Epidemiology and Public Health Yale University School of Medicine Presented at Southern Regional Council on Statistics Summer Research
More informationMCMC IN THE ANALYSIS OF GENETIC DATA ON PEDIGREES
MCMC IN THE ANALYSIS OF GENETIC DATA ON PEDIGREES Elizabeth A. Thompson Department of Statistics, University of Washington Box 354322, Seattle, WA 98195-4322, USA Email: thompson@stat.washington.edu This
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture16: Population structure and logistic regression I Jason Mezey jgm45@cornell.edu April 11, 2017 (T) 8:40-9:55 Announcements I April
More informationGuided Notes Unit 6: Classical Genetics
Name: Date: Block: Chapter 6: Meiosis and Mendel I. Concept 6.1: Chromosomes and Meiosis Guided Notes Unit 6: Classical Genetics a. Meiosis: i. (In animals, meiosis occurs in the sex organs the testes
More informationTheoretical and computational aspects of association tests: application in case-control genome-wide association studies.
Theoretical and computational aspects of association tests: application in case-control genome-wide association studies Mathieu Emily November 18, 2014 Caen mathieu.emily@agrocampus-ouest.fr - Agrocampus
More informationGene mapping in model organisms
Gene mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University http://www.biostat.jhsph.edu/~kbroman Goal Identify genes that contribute to common human diseases. 2
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 20: Epistasis and Alternative Tests in GWAS Jason Mezey jgm45@cornell.edu April 16, 2016 (Th) 8:40-9:55 None Announcements Summary
More informationChapter 6 Linkage Disequilibrium & Gene Mapping (Recombination)
12/5/14 Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination) Linkage Disequilibrium Genealogical Interpretation of LD Association Mapping 1 Linkage and Recombination v linkage equilibrium ²
More informationA General and Accurate Approach for Computing the Statistical Power of the Transmission Disequilibrium Test for Complex Disease Genes
Genetic Epidemiology 2:53 67 (200) A General and Accurate Approach for Computing the Statistical Power of the Transmission Disequilibrium Test for Complex Disease Genes Wei-Min Chen and Hong-Wen Deng,2
More informationThe Lander-Green Algorithm. Biostatistics 666 Lecture 22
The Lander-Green Algorithm Biostatistics 666 Lecture Last Lecture Relationship Inferrence Likelihood of genotype data Adapt calculation to different relationships Siblings Half-Siblings Unrelated individuals
More informationComputational Systems Biology: Biology X
Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#7:(Mar-23-2010) Genome Wide Association Studies 1 The law of causality... is a relic of a bygone age, surviving, like the monarchy,
More informationIntroduction to QTL mapping in model organisms
Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University kbroman@jhsph.edu www.biostat.jhsph.edu/ kbroman Outline Experiments and data Models ANOVA
More informationPatterns of inheritance
Patterns of inheritance Learning goals By the end of this material you would have learnt about: How traits and characteristics are passed on from one generation to another The different patterns of inheritance
More informationIntroduction to Linkage Disequilibrium
Introduction to September 10, 2014 Suppose we have two genes on a single chromosome gene A and gene B such that each gene has only two alleles Aalleles : A 1 and A 2 Balleles : B 1 and B 2 Suppose we have
More information1. Understand the methods for analyzing population structure in genomes
MSCBIO 2070/02-710: Computational Genomics, Spring 2016 HW3: Population Genetics Due: 24:00 EST, April 4, 2016 by autolab Your goals in this assignment are to 1. Understand the methods for analyzing population
More informationLecture 6: Introduction to Quantitative genetics. Bruce Walsh lecture notes Liege May 2011 course version 25 May 2011
Lecture 6: Introduction to Quantitative genetics Bruce Walsh lecture notes Liege May 2011 course version 25 May 2011 Quantitative Genetics The analysis of traits whose variation is determined by both a
More informationDepartment of Forensic Psychiatry, School of Medicine & Forensics, Xi'an Jiaotong University, Xi'an, China;
Title: Evaluation of genetic susceptibility of common variants in CACNA1D with schizophrenia in Han Chinese Author names and affiliations: Fanglin Guan a,e, Lu Li b, Chuchu Qiao b, Gang Chen b, Tinglin
More informationVariance Component Models for Quantitative Traits. Biostatistics 666
Variance Component Models for Quantitative Traits Biostatistics 666 Today Analysis of quantitative traits Modeling covariance for pairs of individuals estimating heritability Extending the model beyond
More informationR/qtl workshop. (part 2) Karl Broman. Biostatistics and Medical Informatics University of Wisconsin Madison. kbroman.org
R/qtl workshop (part 2) Karl Broman Biostatistics and Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman Example Sugiyama et al. Genomics 71:70-77, 2001 250 male
More informationMapping multiple QTL in experimental crosses
Human vs mouse Mapping multiple QTL in experimental crosses Karl W Broman Department of Biostatistics & Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman www.daviddeen.com
More informationConfounding, mediation and colliding
Confounding, mediation and colliding What types of shared covariates does the sibling comparison design control for? Arvid Sjölander and Johan Zetterqvist Causal effects and confounding A common aim of
More informationUnit 6 Reading Guide: PART I Biology Part I Due: Monday/Tuesday, February 5 th /6 th
Name: Date: Block: Chapter 6 Meiosis and Mendel Section 6.1 Chromosomes and Meiosis 1. How do gametes differ from somatic cells? Unit 6 Reading Guide: PART I Biology Part I Due: Monday/Tuesday, February
More informationMapping QTL to a phylogenetic tree
Mapping QTL to a phylogenetic tree Karl W Broman Department of Biostatistics & Medical Informatics University of Wisconsin Madison www.biostat.wisc.edu/~kbroman Human vs mouse www.daviddeen.com 3 Intercross
More informationBinomial Mixture Model-based Association Tests under Genetic Heterogeneity
Binomial Mixture Model-based Association Tests under Genetic Heterogeneity Hui Zhou, Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 April 30,
More informationBackward Genotype-Trait Association. in Case-Control Designs
Backward Genotype-Trait Association (BGTA)-Based Dissection of Complex Traits in Case-Control Designs Tian Zheng, Hui Wang and Shaw-Hwa Lo Department of Statistics, Columbia University, New York, New York,
More informationBayesian construction of perceptrons to predict phenotypes from 584K SNP data.
Bayesian construction of perceptrons to predict phenotypes from 584K SNP data. Luc Janss, Bert Kappen Radboud University Nijmegen Medical Centre Donders Institute for Neuroscience Introduction Genetic
More informationBTRY 4830/6830: Quantitative Genomics and Genetics
BTRY 4830/6830: Quantitative Genomics and Genetics Lecture 23: Alternative tests in GWAS / (Brief) Introduction to Bayesian Inference Jason Mezey jgm45@cornell.edu Nov. 13, 2014 (Th) 8:40-9:55 Announcements
More informationProteomics and Variable Selection
Proteomics and Variable Selection p. 1/55 Proteomics and Variable Selection Alex Lewin With thanks to Paul Kirk for some graphs Department of Epidemiology and Biostatistics, School of Public Health, Imperial
More informationChapter 2: Extensions to Mendel: Complexities in Relating Genotype to Phenotype.
Chapter 2: Extensions to Mendel: Complexities in Relating Genotype to Phenotype. please read pages 38-47; 49-55;57-63. Slide 1 of Chapter 2 1 Extension sot Mendelian Behavior of Genes Single gene inheritance
More informationIntroduction to QTL mapping in model organisms
Human vs mouse Introduction to QTL mapping in model organisms Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] www.daviddeen.com
More informationIntroduction to QTL mapping in model organisms
Introduction to QTL mapping in model organisms Karl Broman Biostatistics and Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman Backcross P 1 P 2 P 1 F 1 BC 4
More informationBS 50 Genetics and Genomics Week of Oct 3 Additional Practice Problems for Section. A/a ; B/B ; d/d X A/a ; b/b ; D/d
BS 50 Genetics and Genomics Week of Oct 3 Additional Practice Problems for Section 1. In the following cross, all genes are on separate chromosomes. A is dominant to a, B is dominant to b and D is dominant
More informationPowerful multi-locus tests for genetic association in the presence of gene-gene and gene-environment interactions
Powerful multi-locus tests for genetic association in the presence of gene-gene and gene-environment interactions Nilanjan Chatterjee, Zeynep Kalaylioglu 2, Roxana Moslehi, Ulrike Peters 3, Sholom Wacholder
More informationUnit 3 - Molecular Biology & Genetics - Review Packet
Name Date Hour Unit 3 - Molecular Biology & Genetics - Review Packet True / False Questions - Indicate True or False for the following statements. 1. Eye color, hair color and the shape of your ears can
More informationCOMBI - Combining high-dimensional classification and multiple hypotheses testing for the analysis of big data in genetics
COMBI - Combining high-dimensional classification and multiple hypotheses testing for the analysis of big data in genetics Thorsten Dickhaus University of Bremen Institute for Statistics AG DANK Herbsttagung
More informationMultiple regression. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar
Multiple regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Multiple regression 1 / 36 Previous two lectures Linear and logistic
More informationEXERCISES FOR CHAPTER 3. Exercise 3.2. Why is the random mating theorem so important?
Statistical Genetics Agronomy 65 W. E. Nyquist March 004 EXERCISES FOR CHAPTER 3 Exercise 3.. a. Define random mating. b. Discuss what random mating as defined in (a) above means in a single infinite population
More informationExpression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia
Expression QTLs and Mapping of Complex Trait Loci Paul Schliekelman Statistics Department University of Georgia Definitions: Genes, Loci and Alleles A gene codes for a protein. Proteins due everything.
More informationEM algorithm. Rather than jumping into the details of the particular EM algorithm, we ll look at a simpler example to get the idea of how it works
EM algorithm The example in the book for doing the EM algorithm is rather difficult, and was not available in software at the time that the authors wrote the book, but they implemented a SAS macro to implement
More informationComputational Approaches to Statistical Genetics
Computational Approaches to Statistical Genetics GWAS I: Concepts and Probability Theory Christoph Lippert Dr. Oliver Stegle Prof. Dr. Karsten Borgwardt Max-Planck-Institutes Tübingen, Germany Tübingen
More informationF1 Parent Cell R R. Name Period. Concept 15.1 Mendelian inheritance has its physical basis in the behavior of chromosomes
Name Period Concept 15.1 Mendelian inheritance has its physical basis in the behavior of chromosomes 1. What is the chromosome theory of inheritance? 2. Explain the law of segregation. Use two different
More information(Genome-wide) association analysis
(Genome-wide) association analysis 1 Key concepts Mapping QTL by association relies on linkage disequilibrium in the population; LD can be caused by close linkage between a QTL and marker (= good) or by
More informationGWAS. Genotype-Phenotype Association CMSC858P Spring 2012 Hector Corrada Bravo University of Maryland. logistic regression. logistic regression
Genotype-Phenotype Association CMSC858P Spring 202 Hector Corrada Bravo University of Maryland GWAS Genome-wide association studies Scans for SNPs (or other structural variants) that show association with
More informationNature Genetics: doi: /ng Supplementary Figure 1. Number of cases and proxy cases required to detect association at designs.
Supplementary Figure 1 Number of cases and proxy cases required to detect association at designs. = 5 10 8 for case control and proxy case control The ratio of controls to cases (or proxy cases) is 1.
More informationTilburg University. Published in: American Journal of Human Genetics. Document version: Peer reviewed version. Publication date: 2000
Tilburg University Testing for linkage disequilibrium, maternal effects, and imprinting with (in)complete case-parent triads using the computer program LEM van den Oord, E.; Vermunt, Jeroen Published in:
More information