Common Variants near MBNL1 and NKX2-5 are Associated with Infantile Hypertrophic Pyloric Stenosis

Supplementary Information: Common Variants near MBNL1 and NKX2-5 are Associated with Infantile Hypertrophic Pyloric Stenosis

Bjarke Feenstra 1*, Frank Geller 1*, Camilla Krogh 1, Mads V. Hollegaard 2, Sanne Gørtz 1, Heather A. Boyd 1, Jeffrey C. Murray 3, David M. Hougaard 2, Mads Melbye 1

1 Department of Epidemiology Research, Statens Serum Institut, Copenhagen, Denmark
2 Section of Neonatal Screening and Hormones, Statens Serum Institut, Copenhagen, Denmark
3 Department of Pediatrics, University of Iowa, Iowa City, IA, USA
* These authors contributed equally.

Correspondence should be addressed to B.F. (fee@ssi.dk) or M.M. (mme@ssi.dk).

Contents:

Supplementary Note
Subjects; Sampling, Amplification and Genotyping; Population Stratification Analysis; Imputation; Analysis of Long-Range Haplotypes in the Chromosome 3 Region.

Supplementary Tables
Supplementary Table 1. Phenotype definition.
Supplementary Table 2. Basic characteristics of the discovery and replication sample.
Supplementary Table 3. Discovery, replication and combined results for one additional SNP at each of the six top loci associated with IHPS.
Supplementary Table 4. Results of the loci associated with IHPS in the discovery phase, without and with correction for possible population stratification.
Supplementary Table 5. Association results for 294 imputed SNPs with P < 10^-6 across the 3q25.1, 3q25.2, and 5q35.2 loci.
Supplementary Table 6. Long-range haplotypes for chromosome 3.
Supplementary Table 7. Association results for IHPS stratified by sex.

Supplementary Figures
Supplementary Figure 1. Quantile-quantile plot.
Supplementary Figure 2. Regional association plot showing imputed SNP results for the first confirmed locus on chromosome 3q25.1.
Supplementary Figure 3. Regional association plot showing imputed SNP results for the second confirmed locus on chromosome 3q25.2.
Supplementary Figure 4. Regional association plot showing imputed SNP results for the third confirmed locus on chromosome 5q35.2.

References

Supplementary Note

Subjects

IHPS cases were identified based on the Danish National Patient Register, which covers all hospital discharge diagnoses and operations performed since 1977 1. Eligible cases were defined as children who, in their first year of life, had a pyloromyotomy according to the Danish Classification of Surgical Procedures codes up to December 1995 (International Statistical Classification of Diseases, Eighth Revision (ICD-8) codes 41840, 41841, 44100) and the Nordic Classification of Surgical Procedures codes after January 1996 (ICD-10 codes KJDH60, KJDH61). The total number of cases born between 1977 and 2008 and operated on within the first year was 3,366 (more than 90% of these were operated on within the first two months).

We excluded multiple births and major malformations at birth according to the EUROCAT classification. We also excluded the following pregnancy complications: placenta praevia, placental abruption, placental insufficiency, hydramnios, isoimmunization, and preeclampsia. To ensure a high degree of genetic homogeneity in the genotyped sample, we obtained birthplace information from the Danish Civil Registry 2, and only included cases who, like their parents, were born in Scandinavia and whose grandparents were not born outside of Northwestern Europe.

1,048 case samples were selected for genome-wide SNP genotyping, prioritizing the most recent birth years. Of these, 46 were dropped after pre-testing, 1,002 were submitted for genome-wide genotyping, and 1,001 were successfully genotyped. The control group used in this study consisted of 2,401 non-affected children from our ongoing GWAS of preterm delivery 3, with the same exclusion criteria applied. For replication, we had 796 cases and 876 controls using the same case and control definitions as in the discovery stage.
Sampling, Amplification and Genotyping

All samples were retrieved from the Danish Newborn Screening Biobank or from the biobank of the Danish National Birth Cohort, both of which are part of the Danish National Biobank. Sampling and genotyping of the discovery stage subjects was undertaken in two rounds. In the first round, 1,901 children were sampled from buffy coat (1,440) or dried blood spot samples (461) as a part of our preterm delivery GWAS. DNA extraction and genotyping were performed at the Johns Hopkins University Center for Inherited Disease Research (CIDR). For 55 samples, whole-genome amplification was performed due to low yields of genomic DNA. GWAS genotyping was done with the Illumina (Illumina, San Diego, CA, USA) Human 660W-Quadv1_A chip. In the second round, all IHPS cases as well as 500 extra samples from the preterm delivery GWAS were sampled using two 3 mm punches from dried blood spot samples. For these samples, DNA was extracted and whole-genome amplified at Statens Serum Institut using a protocol optimized for dried blood spot samples as previously described 4. GWAS genotyping was again performed at CIDR using the Illumina Human 660W-Quadv1_A chip.

For replication, we sampled 796 cases and 876 controls using punches of dried blood spot samples. DNA was extracted and whole-genome amplified at Statens Serum Institut. Genotyping of two correlated SNPs at each of the 6 most significantly associated loci was performed at deCODE Genetics using the Centaurus platform (Nanogen, Bothell, WA, USA).

Population Stratification Analysis

To investigate effects of possible population substructure, we performed multidimensional scaling analysis (as implemented in PLINK 5) on the discovery data, using independent autosomal SNPs with missing call rates < 1%, minor allele frequency > 5%, and Hardy-Weinberg P value > [...]. We used PLINK's LD pruning function to remove short- and long-range LD. The resulting 22,766 SNPs were analyzed along with founder genotypes from 11 HapMap phase III reference populations. We included the first five dimensions as covariates in a logistic regression model and repeated the association analysis of the discovery data. There was no evidence that population substructure played any significant role: all five dimensions had P values > 0.05 in the model, the genomic inflation factor was unchanged at 1.06, and the association results for the top loci were essentially the same (Supplementary Table 4).

Imputation

To explore the association signals further, we imputed unobserved genotypes in the three confirmed regions using phased haplotypes from 381 unrelated individuals from five populations of European ancestry, obtained from the Interim Phase I release of the 1000 Genomes Project 6. We imputed 1.5 Mb to either side of the three top SNPs, i.e., the entire region between the two chromosome 3 SNPs was included. Imputation was done in a two-step procedure. In a first pre-phasing step, we used MaCH 7 to estimate haplotypes for the IHPS study samples. In a second step, we imputed missing alleles for additional SNPs directly onto these phased haplotypes using Minimac 7.
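The windowing rule in the Imputation paragraph (1.5 Mb to either side of each top SNP, with the two chromosome 3 windows running together) can be illustrated with a small interval-merging sketch. This is only an illustration under stated assumptions: the function name `imputation_windows` and the base-pair positions are hypothetical placeholders, not the study's coordinates or code.

```python
from typing import List, Tuple

FLANK_BP = 1_500_000  # 1.5 Mb on either side of each top SNP, as stated in the text


def imputation_windows(
    top_snps: List[Tuple[str, int]], flank: int = FLANK_BP
) -> List[Tuple[str, int, int]]:
    """Return merged (chromosome, start, end) windows around each top SNP.

    Windows on the same chromosome that overlap or touch are merged,
    so two nearby SNPs yield one contiguous region.
    """
    raw = sorted(
        (chrom, max(0, pos - flank), pos + flank) for chrom, pos in top_snps
    )
    merged: List[Tuple[str, int, int]] = []
    for chrom, start, end in raw:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical positions (NOT the study's SNP coordinates): two SNPs on
# chromosome 3 that lie 1 Mb apart, and one SNP on chromosome 5.
top_snps = [("3", 10_000_000), ("3", 11_000_000), ("5", 50_000_000)]
windows = imputation_windows(top_snps)
for chrom, start, end in windows:
    print(f"chr{chrom}:{start}-{end}")
```

Because the two hypothetical chromosome 3 SNPs are less than 3 Mb apart, their flanking windows overlap and collapse into one region, which mirrors the statement that the entire stretch between the two chromosome 3 SNPs was imputed.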
All imputed SNPs with imputation quality r² > 0.30 were tested for association with IHPS in a logistic regression of disease status on imputed allele dosage (to account for imputation uncertainty) using mach2dat 7. In addition, we carried out conditional analyses including the genotype of the confirmed genotyped SNP as a covariate in the logistic regression model. We estimated the correlation between the top genotyped SNP and the imputed SNPs by the squared Pearson correlation coefficient between the allele count of the genotyped SNP and the allele dosages of the imputed SNPs. Imputation results are shown in Supplementary Table 5 and Supplementary Figures 2-4.

Analysis of Long-Range Haplotypes in the Chromosome 3 Region

To investigate the possibility of long-range haplotypes explaining the seemingly independent association with IHPS observed for the SNPs rs[...] and rs573872, we analyzed the region spanning from the start of the LD block harboring rs[...] to the end of the block containing rs[...] ([...] Mb). In line with a previous study 8, we aimed at long-range haplotypes in cases and therefore selected additional tag SNPs based on an association P value < 0.01 and a ratio of D′ in cases to D′ in controls > 1.2 (either for rs[...] or rs573872). This led to the construction of haplotypes based on 12 SNPs with fastPHASE 9. A total of 47 haplotypes had a frequency estimate > 0.5% in either cases or controls (Supplementary Table 6). Odds ratio estimates ranged from 0.4 to 2.2 and were well in line with what could be expected based on the included alleles of rs[...] and rs[...].
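The correlation measure described in the Imputation section (squared Pearson correlation between the allele count of a genotyped SNP and the allele dosage of an imputed SNP) can be sketched as follows. This is a minimal illustration, not the MaCH or mach2dat implementation: the helper names are hypothetical, the data are made up, and the dosage is taken here as the expected allele count under the posterior genotype probabilities, which is an assumption about how the dosage is defined.

```python
import numpy as np


def allele_dosage(genotype_probs):
    """Expected effect-allele count per individual.

    genotype_probs has one row per individual with posterior probabilities
    for carrying 0, 1, or 2 copies of the effect allele.
    """
    probs = np.asarray(genotype_probs, dtype=float)
    return probs[:, 1] + 2.0 * probs[:, 2]


def squared_pearson(x, y):
    """Squared Pearson correlation coefficient between two vectors."""
    r = np.corrcoef(np.asarray(x, dtype=float), np.asarray(y, dtype=float))[0, 1]
    return float(r ** 2)


# Hypothetical data for six individuals (not taken from the study).
genotyped_count = np.array([0, 1, 2, 1, 0, 2])  # allele count at the genotyped SNP
imputed_probs = np.array([
    [0.9, 0.1, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
    [0.2, 0.7, 0.1],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
])

dosage = allele_dosage(imputed_probs)
r2 = squared_pearson(genotyped_count, dosage)
print(dosage, round(r2, 3))
```

Using the dosage rather than a hard genotype call keeps the imputation uncertainty in the comparison, which is the same motivation the text gives for regressing disease status on dosage.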

Supplementary Table 1. Phenotype definition.

Inclusion criterion for cases:
Confirmed pyloromyotomy within the first year of life

Exclusion criteria for cases and controls:
Multiple births
Major malformations (according to the EUROCAT classification)
Pregnancy complications (placenta praevia, placental abruption, placental insufficiency, hydramnios, isoimmunization, and preeclampsia)
Birthplace outside of Scandinavia, parents' birthplace outside of Scandinavia, grandparents' birthplace outside of North-Western Europe

Supplementary Table 2. Basic characteristics of the discovery and replication sample.

Study group          N     % boys   Mean year of birth (SD) a
Discovery stage      3,    %        1999 (4.9)
  cases              1,    %        1996 (6.5)
  controls           2,    %        2001 (3.0)
Replication stage    1,    %        1995 (7.2)
  cases                    %        1990 (7.5)
  controls                 %        2000 (1.3)

a The replication cases were born an average of 6 years earlier than the discovery cases. This was by design: to address the concern that blood spot samples stored for many years might not give as good a yield in GWAS genotyping as those stored for fewer years, we prioritized the cases born most recently for the discovery stage.

Supplementary Table 3. Discovery, replication and combined results for one additional SNP at each of the six top loci associated with IHPS. These additional SNPs were runners-up in the discovery analysis and show high correlation to the primary SNPs in the combined discovery and replication sample.

Columns: chr; SNP; r² a; Position (bp); Alleles (Eff, Alt); Discovery: Frequency (Cases, Controls), Number (Cases, Controls), Odds Ratio (95% CI), P value; Replication: Frequency (Cases, Controls), Number (Cases, Controls), Odds Ratio (95% CI), P value; Combined: Number (Cases, Controls), Odds Ratio (95% CI), P value, Het P

3 rs G A ( ) 1.8e ( ) 4.7e ( ) 7.2e
rs T C ( ) 4.3e ( ) 5.6e ( ) 2.1e
rs G A ( ) 1.3e ( ) 4.7e ( ) 3.6e
rs b G A ( ) 1.2e ( ) ( ) 5.8e
rs A G ( ) 2.6e ( ) ( ) 2.1e
rs G A ( ) 1.7e ( ) ( ) 1.4e

r², r² to top SNP at locus; Eff, effect allele; Alt, alternative allele; Frequency, effect allele frequency; CI, confidence interval; Het P, P value for test of heterogeneity using the I² statistic.
a Correlation coefficients (r²) to the top SNP at each locus are based on the combined discovery and replication sample, apart from rs , where the coefficient is based on data from the HapMap CEU sample.
b rs was not genotyped in the discovery stage; instead, discovery stage results for the (perfectly correlated) top SNP at the locus, rs , are used in the table.

Supplementary Table 4. Results of the loci associated with IHPS in the discovery phase, without and with correction for possible population stratification. Correction for population stratification was performed by including the first five dimensions in a multidimensional scaling analysis as covariates in logistic regression of IHPS disease status on SNP allele count.

Columns: chr; SNP; Position (bp); Alleles (Eff, Alt); Frequency (Cases, Controls); Number (Cases, Controls); Discovery: Odds Ratio (95% CI), P value; Discovery with five MDS dimensions: Odds Ratio (95% CI), P value

3 rs A G ( ) 5.5e ( ) 1.6e-12
3 rs G T ( ) 3.9e ( ) 5.3e-07
5 rs A G ( ) 5.7e ( ) 1.1e-10
6 rs C T ( ) 1.2e ( ) 1.6e
rs G T ( ) 8.5e ( ) 1.3e
rs T C ( ) 4.1e ( ) 6.3e-07

MDS, multi-dimensional scaling; Eff, effect allele; Alt, alternative allele; Frequency, effect allele frequency; CI, confidence interval

Supplementary Table 5. Association results for 294 imputed SNPs with P < 10^-6 across the 3q25.1, 3q25.2, and 5q35.2 loci. The table is sorted by base-pair position (NCBI build 37) and shows effect (Eff) and alternative (Alt) allele; effect allele frequency; r² value from MaCH indicating imputation quality; odds ratio for association with IHPS; P value (after genomic control, λ = 1.06 from discovery scan); indicator of whether the SNP was genotyped; squared Pearson correlation coefficient (r²) of imputed SNP allele dosage to allele count for the top genotyped SNP at the locus (rs , rs573872, or rs29784); function class and gene name for SNPs in genes. Results for the three loci are separated by bold lines.

Columns: chr; SNP; Position (bp); Alleles (Eff, Alt); Effect Allele Freq; Imputation r²; Odds Ratio; P value; Genotyped; r²; Class; Gene

[Table rows: the SNP identifiers, positions and statistics did not survive extraction. The surviving annotations on chromosome 5 are upstream-variant-2kb and intron-variant in C5orf41, and upstream-variant-2kb and intron-variant in BNIP1.]

Supplementary Table 6. Long-range haplotypes for chromosome 3.

Columns: Haplotype; Freq. in cases; SE; Freq. in controls; SE; OR

Legend
Red haplotypes: both risk alleles (rs a, rs g)
Blue haplotypes: both non-risk alleles (rs g, rs t)
Bold haplotypes have a frequency > 2% in at least one group

SNP position and alleles
Columns: Position (bp); SNP; 0 allele; 1 allele
rs A G
rs A G
rs A G
rs G A
rs T C
rs G T
rs A G
rs G A
rs T C
rs G A
rs C T
rs G T

Supplementary Table 7. Association results for IHPS stratified by sex. Results are given separately for the discovery data, the replication data, and the discovery and replication data combined by meta-analysis.

Columns: chr; SNP; Position (bp); Alleles (Eff, Alt); Boys: Frequency (Cases, Controls), Number (Cases, Controls), Odds Ratio (95% CI), P value; Girls: Frequency (Cases, Controls), Number (Cases, Controls), Odds Ratio (95% CI), P value; Heterogeneity: Het P

Discovery
3 rs A G ( ) 4.4e ( ) 4.5e
rs G T ( ) 2.0e ( ) 8.7e
rs A G ( ) 4.5e ( ) 6.4e
rs C T ( ) 9.7e ( ) 3.8e
rs G T ( ) 2.3e ( ) 2.8e
rs T C ( ) 1.7e ( ) e-03

Replication
3 rs A G ( ) 1.7e ( ) 4.8e
rs G T ( ) 1.6e ( ) 1.1e
rs A G ( ) 1.8e ( ) 9.8e
rs C T ( ) ( )
rs G T ( ) ( )
rs T C ( ) ( )

Combined
3 rs A G ( ) 2.5e ( ) 1.8e
rs G T ( ) 2.2e ( ) 4.5e
rs A G ( ) 2.2e ( ) 2.1e
rs C T ( ) 4.7e ( ) 3.5e
rs G T ( ) 1.6e ( ) 3.5e-04 b
rs T C ( ) 7.6e-08 a ( ) e-03

Eff, effect allele; Alt, alternative allele; Frequency, effect allele frequency; CI, confidence interval; Het P, P value for test of heterogeneity between results for boys and girls using the I² statistic.
a In the combined results for boys there was significant heterogeneity between discovery and replication results for rs (P = 0.004).
b In the combined results for girls there was significant heterogeneity between discovery and replication results for rs (P = 0.002).
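The comparison of sex-specific results in Supplementary Table 7 can be illustrated with a small sketch of a two-stratum fixed-effect combination and heterogeneity test. It is a sketch under assumptions rather than the authors' procedure: the function name and the input odds ratios and standard errors are hypothetical, inverse-variance weighting is assumed, and the heterogeneity P value is taken from Cochran's Q (1 degree of freedom for two strata), from which I² is derived.

```python
import math


def combine_two_strata(beta1, se1, beta2, se2):
    """Fixed-effect (inverse-variance) combination of two log odds ratios,
    with Cochran's Q, I^2 and a heterogeneity P value (chi-square, 1 df)."""
    w1 = 1.0 / se1 ** 2
    w2 = 1.0 / se2 ** 2
    beta = (w1 * beta1 + w2 * beta2) / (w1 + w2)
    se = math.sqrt(1.0 / (w1 + w2))
    q = w1 * (beta1 - beta) ** 2 + w2 * (beta2 - beta) ** 2
    df = 1  # two strata
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(q / 2)).
    het_p = math.erfc(math.sqrt(q / 2.0))
    return {
        "or": math.exp(beta),
        "ci": (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)),
        "q": q,
        "i2": i2,
        "het_p": het_p,
    }


# Hypothetical inputs (not values from the table): log odds ratios and
# standard errors for one SNP in boys and in girls.
boys_beta, boys_se = math.log(1.6), 0.07
girls_beta, girls_se = math.log(1.3), 0.14
result = combine_two_strata(boys_beta, boys_se, girls_beta, girls_se)
print(result)
```

The same function applies to combining discovery and replication estimates for one SNP, since that is also a two-study comparison; more than two studies would need the chi-square distribution with the corresponding degrees of freedom.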

Supplementary Figure 1. Quantile-quantile plot of observed versus expected −log10 P values from the genome-wide scan for IHPS after correction by genomic control (λ = 1.06). The expected distribution of −log10 P values under the null hypothesis is shown by the grey line.
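The genomic control correction named in the caption can be sketched as follows. This is a generic illustration, not the study's code: it assumes 1-degree-of-freedom test statistics, estimates λ as the median observed chi-square statistic divided by the median of the null chi-square distribution, and uses made-up P values; all function names are hypothetical.

```python
import math
from statistics import NormalDist, median

_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 df (about 0.455).
CHI2_1DF_MEDIAN = _NORMAL.inv_cdf(0.75) ** 2


def chi2_from_p(p):
    """1-df chi-square statistic corresponding to a two-sided P value."""
    z = _NORMAL.inv_cdf(1.0 - p / 2.0)
    return z * z


def p_from_chi2(x):
    """P value for a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))


def genomic_inflation(pvals):
    """Genomic inflation factor: median observed statistic / null median."""
    return median(chi2_from_p(p) for p in pvals) / CHI2_1DF_MEDIAN


def gc_correct(pvals, lam):
    """Divide each statistic by lambda and convert back to P values."""
    return [p_from_chi2(chi2_from_p(p) / lam) for p in pvals]


def expected_neglog10(n):
    """Expected -log10 P values for n tests under the null, smallest P first."""
    return [-math.log10((i + 0.5) / n) for i in range(n)]


# Hypothetical P values evenly spread over (0, 1), i.e. no inflation.
null_pvals = [(i + 0.5) / 999 for i in range(999)]
lam_null = genomic_inflation(null_pvals)
corrected = gc_correct([1e-8, 0.01, 0.5], 1.06)  # 1.06 is the lambda in the caption
print(round(lam_null, 3), corrected)
```

Plotting the sorted observed −log10 P values against `expected_neglog10(n)` gives the quantile-quantile plot; after correction with λ > 1 every P value becomes slightly larger.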

Supplementary Figure 2. Regional association plot showing imputed SNP results for the first confirmed locus on chromosome 3q25.1. SNPs were imputed using the Interim Phase I haplotype release (June 2011) of the European samples from the 1000 Genomes Project as reference. The figure shows results of: a) logistic regression of disease status on single SNP allele dosage, and b) conditional analysis including the genotype of rs as a covariate in the logistic regression model. SNPs are plotted by chromosomal position (NCBI build 37) against −log10 P value of IHPS association. The colors reflect LD to the confirmed genotyped SNP at the locus, based on pairwise r² values from the 1000 Genomes Project (red: r² > 0.8; orange: 0.6 < r² < 0.8; green: 0.4 < r² < 0.6; light blue: 0.2 < r² < 0.4; purple: r² < 0.2). Estimated recombination rates (from HapMap) are plotted to reflect the local LD structure. Genes are indicated in the lower panel of the plot. The figure was generated using LocusZoom 10.

Supplementary Figure 3. Regional association plot showing imputed SNP results for the second confirmed locus on chromosome 3q25.2. SNPs were imputed using the Interim Phase I haplotype release (June 2011) of the European samples from the 1000 Genomes Project as reference. The figure shows results of: a) logistic regression of disease status on single SNP allele dosage, and b) conditional analysis including the genotype of rs as a covariate in the logistic regression model. Axes and symbols are defined as in Supplementary Figure 2, and the same color coding is used to reflect LD to the confirmed genotyped SNP at the locus. The figure was generated using LocusZoom 10.

Supplementary Figure 4. Regional association plot showing imputed SNP results for the third confirmed locus on chromosome 5q35.2. SNPs were imputed using the Interim Phase I haplotype release (June 2011) of the European samples from the 1000 Genomes Project as reference. The figure shows results of: a) logistic regression of disease status on single SNP allele dosage, and b) conditional analysis including the genotype of rs29784 as a covariate in the logistic regression model. Axes and symbols are defined as in Supplementary Figure 2, and the same color coding is used to reflect LD to the confirmed genotyped SNP at the locus. The figure was generated using LocusZoom 10.

References

1. Andersen, T.F., Madsen, M., Jorgensen, J., Mellemkjoer, L., & Olsen, J.H. The Danish National Hospital Register. A valuable source of data for modern health sciences. Dan. Med. Bull. 46, (1999).
2. Pedersen, C.B., Gotzsche, H., Moller, J.O., & Mortensen, P.B. The Danish Civil Registration System. A cohort of eight million persons. Dan. Med. Bull. 53, (2006).
3. Cornelis, M.C. et al. The Gene, Environment Association Studies consortium (GENEVA): maximizing the knowledge obtained from GWAS by collaboration across studies of multiple conditions. Genet. Epidemiol. 34, (2010).
4. Hollegaard, M.V. et al. Genome-wide scans using archived neonatal dried blood spot samples. BMC Genomics 10, 297 (2009).
5. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, (2007).
6. A map of human genome variation from population-scale sequencing. Nature 467, (2010).
7. Li, Y., Willer, C.J., Ding, J., Scheet, P., & Abecasis, G.R. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet. Epidemiol. 34, (2010).
8. Wang, K. et al. Interpretation of association signals and identification of causal variants from genome-wide association studies. Am. J. Hum. Genet. 86, (2010).
9. Scheet, P. & Stephens, M. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78, (2006).
10. Pruim, R.J. et al. LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, (2010).

Genotype Imputation. Biostatistics 666

Genotype Imputation. Biostatistics 666 Genotype Imputation Biostatistics 666 Previously Hidden Markov Models for Relative Pairs Linkage analysis using affected sibling pairs Estimation of pairwise relationships Identity-by-Descent Relatives

More information

Nature Genetics: doi: /ng Supplementary Figure 1. Number of cases and proxy cases required to detect association at designs.

Nature Genetics: doi: /ng Supplementary Figure 1. Number of cases and proxy cases required to detect association at designs. Supplementary Figure 1 Number of cases and proxy cases required to detect association at designs. = 5 10 8 for case control and proxy case control The ratio of controls to cases (or proxy cases) is 1.

More information

1 Springer. Nan M. Laird Christoph Lange. The Fundamentals of Modern Statistical Genetics

1 Springer. Nan M. Laird Christoph Lange. The Fundamentals of Modern Statistical Genetics 1 Springer Nan M. Laird Christoph Lange The Fundamentals of Modern Statistical Genetics 1 Introduction to Statistical Genetics and Background in Molecular Genetics 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

More information

Friday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo

Friday Harbor From Genetics to GWAS (Genome-wide Association Study) Sept David Fardo Friday Harbor 2017 From Genetics to GWAS (Genome-wide Association Study) Sept 7 2017 David Fardo Purpose: prepare for tomorrow s tutorial Genetic Variants Quality Control Imputation Association Visualization

More information

SNP Association Studies with Case-Parent Trios

SNP Association Studies with Case-Parent Trios SNP Association Studies with Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health September 3, 2009 Population-based Association Studies Balding (2006). Nature

More information

Binomial Mixture Model-based Association Tests under Genetic Heterogeneity

Binomial Mixture Model-based Association Tests under Genetic Heterogeneity Binomial Mixture Model-based Association Tests under Genetic Heterogeneity Hui Zhou, Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 April 30,

More information

Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies

Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies Ruth Pfeiffer, Ph.D. Mitchell Gail Biostatistics Branch Division of Cancer Epidemiology&Genetics National

More information

Linear Regression (1/1/17)

Linear Regression (1/1/17) STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression

More information

Supplementary Figures

Supplementary Figures Supplementary Figures Supplementary Figure 1 Principal components analysis (PCA) of all samples analyzed in the discovery phase. Colors represent the phenotype of study populations. a) The first sample

More information

Association Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5

Association Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative

More information

Lecture 1: Case-Control Association Testing. Summer Institute in Statistical Genetics 2015

Lecture 1: Case-Control Association Testing. Summer Institute in Statistical Genetics 2015 Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2015 1 / 1 Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits.

More information

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017 Lecture 2: Genetic Association Testing with Quantitative Traits Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 29 Introduction to Quantitative Trait Mapping

More information

Linkage and Linkage Disequilibrium

Linkage and Linkage Disequilibrium Linkage and Linkage Disequilibrium Summer Institute in Statistical Genetics 2014 Module 10 Topic 3 Linkage in a simple genetic cross Linkage In the early 1900 s Bateson and Punnet conducted genetic studies

More information

Proportional Variance Explained by QLT and Statistical Power. Proportional Variance Explained by QTL and Statistical Power

Proportional Variance Explained by QLT and Statistical Power. Proportional Variance Explained by QTL and Statistical Power Proportional Variance Explained by QTL and Statistical Power Partitioning the Genetic Variance We previously focused on obtaining variance components of a quantitative trait to determine the proportion

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

BTRY 7210: Topics in Quantitative Genomics and Genetics

BTRY 7210: Topics in Quantitative Genomics and Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics Jason Mezey Biological Statistics and Computational Biology (BSCB) Department of Genetic Medicine jgm45@cornell.edu February 12, 2015 Lecture 3:

More information

Test for interactions between a genetic marker set and environment in generalized linear models Supplementary Materials

Test for interactions between a genetic marker set and environment in generalized linear models Supplementary Materials Biostatistics (2013), pp. 1 31 doi:10.1093/biostatistics/kxt006 Test for interactions between a genetic marker set and environment in generalized linear models Supplementary Materials XINYI LIN, SEUNGGUEN

More information

1. Understand the methods for analyzing population structure in genomes

1. Understand the methods for analyzing population structure in genomes MSCBIO 2070/02-710: Computational Genomics, Spring 2016 HW3: Population Genetics Due: 24:00 EST, April 4, 2016 by autolab Your goals in this assignment are to 1. Understand the methods for analyzing population

More information

The E-M Algorithm in Genetics. Biostatistics 666 Lecture 8

The E-M Algorithm in Genetics. Biostatistics 666 Lecture 8 The E-M Algorithm in Genetics Biostatistics 666 Lecture 8 Maximum Likelihood Estimation of Allele Frequencies Find parameter estimates which make observed data most likely General approach, as long as

More information

(Genome-wide) association analysis

(Genome-wide) association analysis (Genome-wide) association analysis 1 Key concepts Mapping QTL by association relies on linkage disequilibrium in the population; LD can be caused by close linkage between a QTL and marker (= good) or by

More information

Figure E1 Manhattan and QQ plots for FEV 1 meta-analyses across all Hispanic ancestry groups

Figure E1 Manhattan and QQ plots for FEV 1 meta-analyses across all Hispanic ancestry groups Figure E1 Manhattan and QQ plots for FEV 1 meta-analyses across all Hispanic ancestry groups Figure E1a: All Participants λgc = 1. Figure E1b: Ever Smokers λgc = 1. Figure E1c: Never Smokers λgc = 1.1

More information

Introduction to Linkage Disequilibrium
