MSCBIO 2070/02-710: Computational Genomics, Spring 2016
HW3: Population Genetics
Due: 24:00 EST, April 4, 2016, by Autolab

Your goals in this assignment are to
1. Understand the methods for analyzing population structure in genomes
2. Understand the methods for identifying disease loci in genomes
3. Explore the approach for identifying structural variants in genomes

What to hand in.
One report (in PDF format) addressing each of the following questions, including the figures generated by R when appropriate.
All source code for the R exercises. We should be able to run the source code and produce the figures requested.

Submit a zip file containing the completed code (if any) and the PDF file (if any) to Autolab. The zip file should have the following structure:
./s2016hw3.pdf
./q3/ (put all code related to Q3 here, if any)

1. [15 points] Hardy-Weinberg Equilibrium

(a) (5 points) Show that the Hardy-Weinberg equilibrium holds for three alleles. [Hint: Assume allele frequencies $p$, $q$, and $r$ ($p + q + r = 1$) for the three alleles $A_1$, $A_2$, and $A_3$.]

Under random mating, the genotype frequencies in the offspring follow from the allele frequencies. For genotype $A_1A_2$,

$$P(A_1A_2) = pq + qp = 2pq.$$

The frequencies of all possible genotypes are listed in the following table.

Genotype    Frequency
$A_1A_1$    $p^2$
$A_1A_2$    $2pq$
$A_1A_3$    $2pr$
$A_2A_2$    $q^2$
$A_2A_3$    $2qr$
$A_3A_3$    $r^2$

From these genotype frequencies, we calculate the allele frequencies $p'$, $q'$, and $r'$ in the offspring:

$$p' = \frac{2p^2 + 2pq + 2pr}{2} = p(p + q + r) = p$$
$$q' = \frac{2q^2 + 2pq + 2qr}{2} = q(p + q + r) = q$$
$$r' = \frac{2r^2 + 2pr + 2qr}{2} = r(p + q + r) = r$$

The allele frequencies are unchanged, so the Hardy-Weinberg equilibrium holds for three alleles.

(b) (5 points) The numbers of individuals with genotypes AA, Aa, and aa at a locus are given as 232, 36, and 6, respectively. Perform a chi-square test to see if the Hardy-Weinberg Equilibrium holds for this locus at significance level $\alpha = 0.05$. Use 1 degree of freedom.

We use $p$ to represent the allele frequency of A, and $q$ that of a. The total number of observations is $232 + 36 + 6 = 274$.

$$p = \frac{2 \times 232 + 36}{2 \times 274} = 0.912, \qquad q = \frac{2 \times 6 + 36}{2 \times 274} = 0.088$$

Then we calculate the expected genotype counts under Hardy-Weinberg equilibrium.

$$E(AA) = 274 \times 0.912^2 = 227.9$$
$$E(aa) = 274 \times 0.088^2 = 2.1$$
$$E(Aa) = 274 \times 2 \times 0.912 \times 0.088 = 44.0$$

Next we calculate the $\chi^2$ statistic.

$$\chi^2 = \frac{(232 - 227.9)^2}{227.9} + \frac{(6 - 2.1)^2}{2.1} + \frac{(36 - 44)^2}{44} = 8.77$$

From the $\chi^2$ distribution table, $\chi^2_{0.05,\,df=1} = 3.841$. Since $8.77 > 3.841$, we reject the null hypothesis: the Hardy-Weinberg Equilibrium does not hold at this locus.
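The chi-square calculation in (b) can be cross-checked in R. The following is only a minimal sketch: the variable names are illustrative, and because it keeps unrounded allele frequencies its statistic can differ slightly from a hand calculation that rounds $p$ and $q$ to three decimals.

```r
# Observed genotype counts from part (b)
obs <- c(AA = 232, Aa = 36, aa = 6)
n <- sum(obs)

# Allele frequencies estimated from the genotype counts
p <- unname((2 * obs["AA"] + obs["Aa"]) / (2 * n))
q <- 1 - p

# Expected genotype counts under Hardy-Weinberg equilibrium
expected <- n * c(AA = p^2, Aa = 2 * p * q, aa = q^2)

# Pearson chi-square statistic, critical value, and p-value with 1 degree of freedom
chi2 <- sum((obs - expected)^2 / expected)
critical <- qchisq(0.95, df = 1)
p_value <- pchisq(chi2, df = 1, lower.tail = FALSE)

print(round(expected, 1))
print(c(chi2 = chi2, critical = critical, p_value = p_value))
```

The statistic exceeds the critical value either way, so the conclusion does not depend on the rounding.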

(c) (5 points) The Wright-Fisher model as illustrated in the lecture note can be considered as a Markov chain. If we denote the number of copies of allele A in the population in generation $n$ by $X_n$, then the sequence $X_0, X_1, \ldots$ is a Markov chain whose set of possible outcomes is $\{0, 1, 2, \ldots, 2N\}$. The transition matrix of the chain is given by a binomial distribution $B(2N, i/2N)$:

$$p_{ij} = P(X_{n+1} = j \mid X_n = i) = \binom{2N}{j} \left(\frac{i}{2N}\right)^{j} \left(1 - \frac{i}{2N}\right)^{2N - j}$$

Show that $E(X_{n+1} \mid X_n = i) = i$. How does it relate to Hardy-Weinberg equilibrium?

Since $X_{n+1} \mid X_n = i \sim B(2N, \frac{i}{2N})$, the mean of the binomial distribution gives

$$E(X_{n+1} \mid X_n = i) = 2N \cdot \frac{i}{2N} = i.$$

Since $E(X_{n+1} \mid X_n = i) = X_n$, the expected allele count is unchanged from one generation to the next, so the Hardy-Weinberg equilibrium holds for the expected frequencies.

2. [5 points] HMM in PHASE and STRUCTURE

Assuming $K$ ancestral chromosomes, the transition probabilities in the hidden Markov models in PHASE, as well as those embedded in the linkage model extension of STRUCTURE, model the presence/absence of recombination events between locus $l$ and locus $l+1$ separated by distance $d_l$. The transition probabilities from ancestral chromosome state label $z_l = k$ to $z_{l+1} = k'$ for $k, k' \in \{1, \ldots, K\}$ are given as

$$P(z_{l+1} = k' \mid z_l = k) = \begin{cases} \exp(-d_l r) + (1 - \exp(-d_l r))\, q_{k'} & \text{if } k' = k \\ (1 - \exp(-d_l r))\, q_{k'} & \text{otherwise} \end{cases}$$

where $r$ is the per-basepair recombination rate and the $q_i$ for $i = 1, \ldots, K$ are prior probabilities for each of the $K$ states that sum to 1. Consider the case where the underlying genome block structure has $z_l = z_{l+1} = k$ but has a small segment from ancestral chromosome $m \neq k$ inserted between loci $l$ and $l+1$. How is this scenario modeled by the transition probabilities above?

Let $R$ denote the number of recombination events between loci $l$ and $l+1$, which follows a Poisson distribution. Since $r$ is the per-basepair recombination rate and the distance between locus $l$ and locus $l+1$ is $d_l$, the mean of the Poisson distribution is $d_l r$. Thus the probability mass function is

$$p(R) = \frac{(d_l r)^R e^{-d_l r}}{R!}$$

Suppose $z_l = z_{l+1} = k$. Writing $P(R = n)$ for the probability that $n$ recombination events occur and the chain ends in state $k$, and $p(R = n)$ for the Poisson probability of $n$ events, $P(z_{l+1} = k \mid z_l = k)$ can be calculated as follows:

$$
\begin{aligned}
P(z_{l+1} = k \mid z_l = k) &= P(R = 0) + P(R > 0) && (1) \\
&= p(R = 0) + P(R = 1) + P(R = 2) + P(R = 3) + \cdots && (2) \\
&= p(R = 0) + p(R = 1)\, q_k + p(R = 2) \sum_{i=1}^{K} q_i q_k + p(R = 3) \sum_{i=1}^{K} \sum_{j=1}^{K} q_i q_j q_k + \cdots && (3) \\
&= p(R = 0) + p(R = 1)\, q_k + p(R = 2)\, q_k \sum_{i=1}^{K} q_i + p(R = 3)\, q_k \sum_{i=1}^{K} \sum_{j=1}^{K} q_i q_j + \cdots && (4) \\
&= p(R = 0) + p(R = 1)\, q_k + p(R = 2)\, q_k + p(R = 3)\, q_k + \cdots \quad \Big(\text{because } \sum_{i=1}^{K} q_i = 1\Big) && (5) \\
&= p(R = 0) + q_k \,(1 - p(R = 0)) && (6) \\
&= e^{-d_l r} + (1 - e^{-d_l r})\, q_k && (7)
\end{aligned}
$$
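The collapse of the series in equations (2) to (7) can be checked numerically. The sketch below is illustrative only: the values of `K`, `q`, and `lambda` (standing in for $d_l r$) are hypothetical, and the Poisson series is truncated at 50 events. Each recombination event redraws the state from $q$, so one event corresponds to multiplying by a matrix whose rows all equal $q$.

```r
K <- 3
q <- c(0.5, 0.3, 0.2)   # hypothetical prior probabilities, summing to 1
lambda <- 0.05          # hypothetical value of d_l * r

# One recombination event: the new state is drawn from q regardless of the old state
Q <- matrix(q, nrow = K, ncol = K, byrow = TRUE)

# Sum over the number of recombination events R = 0, 1, 2, ...
T_series <- matrix(0, K, K)
Q_power <- diag(K)      # Q^0: no recombination keeps the state
for (R in 0:50) {
  T_series <- T_series + dpois(R, lambda) * Q_power
  Q_power <- Q_power %*% Q
}

# Closed form: stay put with probability exp(-lambda), otherwise redraw from q
T_closed <- exp(-lambda) * diag(K) + (1 - exp(-lambda)) * Q

print(max(abs(T_series - T_closed)))

# Probability of exactly two events passing through state m and returning to k
k <- 1
m <- 2
p_insert <- dpois(2, lambda) * q[m] * q[k]
print(p_insert)
```

The last quantity is one term of the $R = 2$ contribution, which is why a short inserted segment is already absorbed into the diagonal transition probability.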

If $z_l = z_{l+1} = k$ and there is one small segment from ancestral chromosome $m \neq k$ inserted between loci $l$ and $l+1$, the probability of this scenario can be calculated more explicitly:

$$P(\text{one small insertion}) = p(R = 2)\, q_m q_k, \quad m \neq k$$

This probability is a fraction of the term $P(R = 2)$ in equation (2).

3. [10 points] PCA and Population Structure

Consider the SNP genotype data for 5912 loci on chromosome 2 from 423 individuals provided with this homework in file snp.txt. Each individual is from one of the following six populations:

CEU: Utah residents with Northern and Western European ancestry from the CEPH collection
CHB: Han Chinese in Beijing, China
JPT: Japanese in Tokyo, Japan
LWK: Luhya in Webuye, Kenya
MEX: Mexican ancestry in Los Angeles, California
YRI: Yoruba in Ibadan, Nigeria

The ancestry labels for each individual are provided in file sample_names_with_population_labels.txt.

(a) (5 points) Perform PCA and plot the ancestry of the individuals in 2 dimensions using the first two principal components, as was discussed in class. Use different colors for different true ancestries. Include your plot and code.

If you perform PCA on the snp matrix without scaling, or perform PCA on the covariance matrix constructed by $\frac{1}{n} X^\top X$ without scaling,

[Figure: PC1 versus PC2 scatter plot, colored by population (CEU, CHB, JPT, LWK, MEX, YRI)]

If you perform PCA on the covariance matrix constructed by the cov() function,

[Figure: PC1 versus PC2 scatter plot, colored by population (CEU, CHB, JPT, LWK, MEX, YRI)]

If you scale the original snp matrix and perform PCA on it,

[Figure: PC1 versus PC2 scatter plot, colored by population (CEU, CHB, JPT, LWK, MEX, YRI)]

    library(ggplot2)

    # Population label for each sample: sample ID and population code
    popdata <- read.table("sample_names_with_population_labels.txt", header = FALSE)
    colnames(popdata) <- c("sample.id", "pop_code")

    # prcomp expects individuals in rows, hence the transpose of the SNP matrix
    snpdata <- read.table("snp.txt", header = FALSE)
    pcadata <- prcomp(t(snpdata), center = TRUE, scale. = TRUE)

    # Keep the first two principal components and attach the population labels
    tmppcadata <- cbind(as.data.frame(pcadata$x[, 1:2]), popdata$pop_code)
    colnames(tmppcadata) <- c("PC1", "PC2", "Populations")
    tmppcadata$Populations <- factor(tmppcadata$Populations)

    p <- ggplot(tmppcadata, aes(x = PC1, y = PC2, colour = Populations))
    p + geom_point(size = 2)

If you scale the original snp matrix and perform PCA on the covariance matrix constructed by $\frac{1}{n} X^\top X$,

[Figure: PC1 versus PC2 scatter plot, colored by population (CEU, CHB, JPT, LWK, MEX, YRI)]

(b) (5 points) Which ethnic groups are similar in terms of their genomes? Which ethnic groups are different in terms of their genomes?

The plot shows three clusters, each formed by two ethnic groups. Ethnic groups that fall in different clusters are quite different.
(1) The CHB (China) and JPT (Japan) groups overlap with each other closely.
(2) The majority of the LWK (Kenya) and YRI (Nigeria) groups overlap.
(3) The CEU (Utah, European ancestry) and MEX (California, Mexican ancestry) groups share only a small intersection.

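Any other pairs of ethnic groups are very different in terms of their genomes.

4. [10 points] Linkage Analysis

Compute the probabilities of the following pedigrees assuming the penetrance model is p(affected | dd) = 0.1, p(affected | Dd) = 0.2, p(affected | DD) = 0.7. The allele frequency of D is 0.02. Shaded means affected, blank means unaffected.

(a) (5 points)

Since the allele frequency of D is 0.02, the allele frequency of d is $1 - 0.02 = 0.98$. From these we calculate the genotype frequencies.

$$P(DD) = 0.02^2 = 0.0004$$
$$P(dd) = 0.98^2 = 0.9604$$
$$P(Dd) = 2 \times 0.02 \times 0.98 = 0.0392$$

From the genotypes of the offspring, we can infer that the genotype of M1 is either dd or Dd.

$$
\begin{aligned}
P(\text{pedigree}, \text{M1 is } Dd) &= P(Dd)\,P(Dd)\,P(dd \mid Dd, Dd)\,P(Dd \mid Dd, Dd) \\
&\quad \times P(\text{unaffected} \mid Dd)\,P(\text{affected} \mid Dd)\,P(\text{unaffected} \mid dd)\,P(\text{affected} \mid Dd) \\
&= 0.0392 \times 0.0392 \times 0.25 \times 0.5 \times (1 - 0.2) \times 0.2 \times (1 - 0.1) \times 0.2 \\
&\approx 5.53 \times 10^{-6}
\end{aligned}
$$

$$
\begin{aligned}
P(\text{pedigree}, \text{M1 is } dd) &= P(dd)\,P(Dd)\,P(dd \mid dd, Dd)\,P(Dd \mid dd, Dd) \\
&\quad \times P(\text{unaffected} \mid dd)\,P(\text{affected} \mid Dd)\,P(\text{unaffected} \mid dd)\,P(\text{affected} \mid Dd) \\
&= 0.9604 \times 0.0392 \times 0.5 \times 0.5 \times (1 - 0.1) \times 0.2 \times (1 - 0.1) \times 0.2 \\
&\approx 3.05 \times 10^{-4}
\end{aligned}
$$

Summing these two probabilities gives the probability of the pedigree.

$$P_{\text{pedigree}} = P(\text{pedigree}, \text{M1 is } Dd) + P(\text{pedigree}, \text{M1 is } dd) \approx 3.10 \times 10^{-4}$$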
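The sum over the unknown genotype of M1 in (a) can be sketched in R. This is a hedged illustration, not the graded solution: the pedigree structure (unaffected parent M1, affected Dd parent, one unaffected dd child, one affected Dd child) is read off the factors in the formula above, and the helper names `transmit` and `pedigree_term` are hypothetical.

```r
freq_D <- 0.02
# Genotype frequencies in the population
geno_prior <- c(DD = freq_D^2,
                Dd = 2 * freq_D * (1 - freq_D),
                dd = (1 - freq_D)^2)
# Penetrance model: P(affected | genotype)
p_affected <- c(DD = 0.7, Dd = 0.2, dd = 0.1)

# Number of D alleles carried by each genotype
n_D <- c(DD = 2, Dd = 1, dd = 0)

# Mendelian transmission: P(child genotype | two parental genotypes)
transmit <- function(child, parent1, parent2) {
  t1 <- n_D[[parent1]] / 2   # chance parent 1 passes on D
  t2 <- n_D[[parent2]] / 2   # chance parent 2 passes on D
  switch(child,
         DD = t1 * t2,
         Dd = t1 * (1 - t2) + (1 - t1) * t2,
         dd = (1 - t1) * (1 - t2))
}

# Joint probability of the pedigree and a given genotype for M1 (assumed structure)
pedigree_term <- function(m1) {
  geno_prior[[m1]] * geno_prior[["Dd"]] *
    transmit("dd", m1, "Dd") * transmit("Dd", m1, "Dd") *
    (1 - p_affected[[m1]]) * p_affected[["Dd"]] *
    (1 - p_affected[["dd"]]) * p_affected[["Dd"]]
}

terms <- sapply(c("Dd", "dd"), pedigree_term)
total <- sum(terms)
print(terms)
print(total)
```

Nearly all of the total comes from the dd term, because dd is by far the most common genotype when D has frequency 0.02.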

(b) (5 points)

From the genotypes of the offspring, we can infer that the only possible genotype of M1 is Dd.

$$
\begin{aligned}
P_{\text{pedigree}} &= P(\text{pedigree}, \text{M1 is } Dd) \\
&= P(Dd)\,P(Dd)\,P(dd \mid Dd, Dd)\,P(Dd \mid Dd, Dd)\,P(DD \mid Dd, Dd) \\
&\quad \times P(\text{unaffected} \mid Dd)\,P(\text{affected} \mid Dd)\,P(\text{unaffected} \mid dd)\,P(\text{unaffected} \mid Dd)\,P(\text{affected} \mid DD) \\
&= 0.0392 \times 0.0392 \times 0.25 \times 0.5 \times 0.25 \times (1 - 0.2) \times 0.2 \times (1 - 0.1) \times (1 - 0.2) \times 0.7 \\
&\approx 3.87 \times 10^{-6}
\end{aligned}
$$

5. [23 points] Genome-wide Association Studies

(a) (5 points) Given the following data, perform chi-square tests to test the association between a given locus and case/control status.

                           Control   Case
Major allele homozygous       65      20
Heterozygous                  41      35
Minor allele homozygous       20      70

The null hypothesis $H_0$ is that there is no association between the given locus and case/control status. Suppose the two alleles are A (major) and a (minor). The total number of control samples is 126 and the total number of case samples is 125. We first calculate the allele frequencies under the null hypothesis. The total numbers of major allele homozygous, heterozygous, and minor allele homozygous individuals are 85, 76, and 90, respectively.

$$P(A) = \frac{2 \times 85 + 76}{2 \times (126 + 125)} = 0.49$$
$$P(a) = \frac{2 \times 90 + 76}{2 \times (126 + 125)} = 0.51$$

Allele based

The observed allele count table is as follows.

                   Control                   Case
Major allele (A)   2 × 65 + 41 = 171         2 × 20 + 35 = 75
Minor allele (a)   2 × 20 + 41 = 81          2 × 70 + 35 = 175

The expected allele count table is as follows,

                   Control                   Case
Major allele (A)   252 × 0.49 = 123.48       250 × 0.49 = 122.5
Minor allele (a)   252 × 0.51 = 128.52       250 × 0.51 = 127.5

Then we calculate the $\chi^2$ test statistic.

$$\chi^2 = \frac{(171 - 123.48)^2}{123.48} + \frac{(75 - 122.5)^2}{122.5} + \frac{(81 - 128.52)^2}{128.52} + \frac{(175 - 127.5)^2}{127.5} = 71.97$$

From the $\chi^2$ distribution table, $\chi^2_{0.05,\,df=1} = 3.84$. Since $71.97 > 3.84$, we reject the null hypothesis: there is an association between the given locus and case/control status.

Genotype based

The expected genotype count table is as follows.

                           Control                     Case
Major allele homozygous    126 × 85 / 251 = 42.67      125 × 85 / 251 = 42.33
Heterozygous               126 × 76 / 251 = 38.15      125 × 76 / 251 = 37.85
Minor allele homozygous    126 × 90 / 251 = 45.18      125 × 90 / 251 = 44.82

Then we calculate the $\chi^2$ test statistic.

$$\chi^2 = \frac{(65 - 42.67)^2}{42.67} + \frac{(20 - 42.33)^2}{42.33} + \frac{(41 - 38.15)^2}{38.15} + \frac{(35 - 37.85)^2}{37.85} + \frac{(20 - 45.18)^2}{45.18} + \frac{(70 - 44.82)^2}{44.82} = 52.07$$

From the $\chi^2$ distribution table, $\chi^2_{0.05,\,df=2} = 5.99$. Since $52.07 > 5.99$, we reject the null hypothesis: there is an association between the given locus and case/control status.

Allele+Genotype based

This approach derives the expected genotype counts from the allele frequencies. Although you could get the same answer, this is not the right way to do it, because we don't know whether the Hardy-Weinberg Equilibrium holds for the current generation.

(b) (3 points) Assume the chi-square test in (a) above is one of 100,000 loci that were tested for associations. What is the adjusted p-value after Bonferroni correction?

Allele based

The p-value for the $\chi^2$ test statistic is $p(71.97, df = 1) = p \approx 0$. The adjusted p-value after Bonferroni correction is $10^5 \times p \approx 0$.

Genotype based

The p-value for the $\chi^2$ test statistic is $p(52.07, df = 2) \approx 4.9 \times 10^{-12}$. The adjusted p-value after Bonferroni correction is $10^5 \times 4.9 \times 10^{-12} = 4.9 \times 10^{-7}$.

(c) (5 points) Bonferroni correction is effective when all the statistical tests are independent of each other. Consider performing case/control genome-wide association studies for type II diabetes based on African individuals. Consider performing the same type of study on a European population. In general, the African population is more ancient and African genomes have weaker linkage disequilibrium than European genomes. Would Bonferroni correction be more effective in the African or in the European population? Why?

Since African genomes have weaker linkage disequilibrium than European genomes, loci in African genomes are more likely to be independent of each other. Thus the Bonferroni correction would be more effective in the African population.
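The allele-based and genotype-based tests from (a) and the Bonferroni correction from (b) can be reproduced with base R. This is a sketch under an assumption: the genotype counts below are the ones consistent with the totals and test statistics reported in (a), and the object names are illustrative. `chisq.test` uses unrounded expected counts, and `correct = FALSE` turns off the continuity correction it would otherwise apply to the 2 × 2 allele table.

```r
# Genotype counts by status (assumed values, consistent with the reported totals)
geno <- matrix(c(65, 41, 20,    # Control: AA, Aa, aa
                 20, 35, 70),   # Case:    AA, Aa, aa
               nrow = 3,
               dimnames = list(genotype = c("AA", "Aa", "aa"),
                               status = c("Control", "Case")))

# Allele counts: each homozygote contributes two copies, each heterozygote one
allele <- rbind(A = 2 * geno["AA", ] + geno["Aa", ],
                a = 2 * geno["aa", ] + geno["Aa", ])

allele_test <- chisq.test(allele, correct = FALSE)   # df = 1
geno_test <- chisq.test(geno, correct = FALSE)       # df = 2

# Bonferroni correction for 100,000 tested loci
n_loci <- 1e5
p_raw <- c(allele = allele_test$p.value, genotype = geno_test$p.value)
p_adj <- p.adjust(p_raw, method = "bonferroni", n = n_loci)

print(allele)
print(c(allele = unname(allele_test$statistic),
        genotype = unname(geno_test$statistic)))
print(p_adj)
```

Passing `n = n_loci` tells `p.adjust` that these two p-values are part of a larger family of tests, so each is multiplied by 100,000 and capped at 1.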

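(d) (5 points) Given the following data, perform chi-square tests to test the association between a given locus and case/control status.

                           Control   Case
Major allele homozygous      105      100
Heterozygous                   1        2
Minor allele homozygous        1        2

The null hypothesis $H_0$ is that there is no association between the given locus and case/control status. Suppose the two alleles are A (major) and a (minor). The total number of control samples is 107 and the total number of case samples is 104. We first calculate the allele frequencies under the null hypothesis. The total numbers of major allele homozygous, heterozygous, and minor allele homozygous individuals are 205, 3, and 3, respectively.

$$P(A) = \frac{2 \times 205 + 3}{2 \times (107 + 104)} = 0.98$$
$$P(a) = \frac{2 \times 3 + 3}{2 \times (107 + 104)} = 0.02$$

Allele based

The observed allele count table is as follows.

                   Control                   Case
Major allele (A)   2 × 105 + 1 = 211         2 × 100 + 2 = 202
Minor allele (a)   2 × 1 + 1 = 3             2 × 2 + 2 = 6

The expected allele count table is as follows.

                   Control                   Case
Major allele (A)   214 × 0.98 = 209.72       208 × 0.98 = 203.84
Minor allele (a)   214 × 0.02 = 4.28         208 × 0.02 = 4.16

Then we calculate the $\chi^2$ test statistic.

$$\chi^2 = \frac{(211 - 209.72)^2}{209.72} + \frac{(202 - 203.84)^2}{203.84} + \frac{(3 - 4.28)^2}{4.28} + \frac{(6 - 4.16)^2}{4.16} = 1.22$$

From the $\chi^2$ distribution table, $\chi^2_{0.05,\,df=1} = 3.84$. Since $1.22 < 3.84$, we fail to reject the null hypothesis: there is no evidence of an association between the given locus and case/control status.

Genotype based

The expected genotype count table is as follows.

                           Control                      Case
Major allele homozygous    107 × 205 / 211 = 103.96     104 × 205 / 211 = 101.04
Heterozygous               107 × 3 / 211 = 1.52         104 × 3 / 211 = 1.48
Minor allele homozygous    107 × 3 / 211 = 1.52         104 × 3 / 211 = 1.48

Then we calculate the $\chi^2$ test statistic.

$$\chi^2 = \frac{(105 - 103.96)^2}{103.96} + \frac{(100 - 101.04)^2}{101.04} + \frac{(1 - 1.52)^2}{1.52} + \frac{(2 - 1.48)^2}{1.48} + \frac{(1 - 1.52)^2}{1.52} + \frac{(2 - 1.48)^2}{1.48} = 0.74$$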

From the $\chi^2$ distribution table, $\chi^2_{0.05,\,df=2} = 5.99$. Since $0.74 < 5.99$, we fail to reject the null hypothesis: there is no evidence of an association between the given locus and case/control status.

Allele+Genotype based

As before, we don't know whether the Hardy-Weinberg Equilibrium holds for the current generation. If you do the calculation this way, you will draw a wrong conclusion.

(e) (5 points) In (b), what is the minor allele frequency in the whole population including all samples? Can you reliably conclude on the significance of the association? Why?

The minor allele frequency in (b), which uses the data from (a), is $\frac{2 \times 90 + 76}{2 \times (126 + 125)} = 0.51$. The allele frequency is fairly large, which makes the significance of the association reliable. In (d), however, the minor allele frequency is $\frac{2 \times 3 + 3}{2 \times (107 + 104)} = 0.02$. The number of samples carrying the minor allele is too small for the association study in (d), so the significance of the association is not as reliable.

6. [7 points] Haplotypes and Genome-wide Association Studies

Consider the genome data below collected from case (patient) and control (normal healthy) individuals. Our goal is to see if the haplotypes formed by the three SNPs influence the disease susceptibility.

Case:
Individual 1    ...C...T..G.    ...C...T..G.
Individual 2    ...T...G..A.    ...C...T..G.
Individual 3    ...C...T..A.    ...C...T..G.

Control:
Individual 4    ...T...G..A.    ...T...G..A.
Individual 5    ...C...T..A.    ...C...T..A.

(a) (2 points) List haplotype alleles.

The haplotype alleles are CTG, TGA, and CTA.

(b) (5 points) Create a contingency table that you can use for a chi-square test.

The contingency table is as follows.

         Case   Control   Total
CTG        4       0        4
TGA        1       2        3
CTA        1       2        3
Total      6       4       10

7. [10 points] Structural Variants

Assume you are performing paired-end sequencing of a region of your own genome to see if it contains an insertion or deletion compared to the reference genome. Assume the distribution of bp distances between the two sequenced fragments (or insert sizes) in each mate pair (collected genome-wide) is given as in the lecture note.

(a) (5 points) If there was a homozygous insertion of length 100 bp in your genome, what would be the distribution of the distances between the two sequenced fragments in each mate pair from your own genome?

Suppose the mean value of the real distribution is 400.

[Figure: density versus distance, comparing the Measured Distribution with the Real Distribution]

(b) (5 points) If there was a heterozygous insertion of length 100 bp in your genome, what would be the distribution of the distances between the two sequenced fragments in each mate pair from your own genome?

Suppose the mean value of the real distribution is 400.

[Figure: density versus distance, comparing the Measured Distribution with the Real Distribution]
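The two figures can be approximated with a small simulation. This is a sketch under stated assumptions rather than the original plotting code: insert sizes are taken to be normal with mean 400 and a hypothetical standard deviation of 20, every simulated mate pair is assumed to span the insertion site, and a pair spanning a 100 bp insertion is assumed to map 100 bp closer together on the reference, which lacks the inserted sequence.

```r
set.seed(1)                    # reproducible draws
n_pairs <- 10000               # hypothetical number of spanning mate pairs
real <- rnorm(n_pairs, mean = 400, sd = 20)   # sd = 20 is an assumption

insertion <- 100               # insertion length in bp

# Homozygous: both chromosome copies carry the insertion, so every pair shifts
measured_hom <- real - insertion

# Heterozygous: about half of the pairs come from the copy with the insertion
carries_insertion <- rbinom(n_pairs, size = 1, prob = 0.5)
measured_het <- real - insertion * carries_insertion

print(c(real = mean(real),
        homozygous = mean(measured_hom),
        heterozygous = mean(measured_het)))

# Overlay the three densities on a common distance axis
plot(density(real), xlim = c(200, 500), xlab = "distance",
     main = "Insert-size distributions")
lines(density(measured_hom), lty = 2)
lines(density(measured_het), lty = 3)
legend("topleft",
       legend = c("Real", "Measured (homozygous)", "Measured (heterozygous)"),
       lty = 1:3)
```

Under these assumptions the homozygous case gives a single peak shifted to about 300, while the heterozygous case gives a bimodal mixture with peaks near 300 and 400.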


Quantitative Genomics and Genetics BTRY 4830/6830; PBSB Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 18: Introduction to covariates, the QQ plot, and population structure II + minimal GWAS steps Jason Mezey jgm45@cornell.edu April

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination)

Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination) 12/5/14 Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination) Linkage Disequilibrium Genealogical Interpretation of LD Association Mapping 1 Linkage and Recombination v linkage equilibrium ²

More information

Analysis of DNA variations in GSTA and GSTM gene clusters based on the results of genome-wide data from three Russian populations taken as an example

Analysis of DNA variations in GSTA and GSTM gene clusters based on the results of genome-wide data from three Russian populations taken as an example Filippova et al. BMC Genetics 2012, 13:89 RESEARCH ARTICLE Open Access Analysis of DNA variations in GSTA and GSTM gene clusters based on the results of genome-wide data from three Russian populations

More information

How to analyze many contingency tables simultaneously?

How to analyze many contingency tables simultaneously? How to analyze many contingency tables simultaneously? Thorsten Dickhaus Humboldt-Universität zu Berlin Beuth Hochschule für Technik Berlin, 31.10.2012 Outline Motivation: Genetic association studies Statistical

More information

MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES

MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES MODEL-FREE LINKAGE AND ASSOCIATION MAPPING OF COMPLEX TRAITS USING QUANTITATIVE ENDOPHENOTYPES Saurabh Ghosh Human Genetics Unit Indian Statistical Institute, Kolkata Most common diseases are caused by

More information

BTRY 7210: Topics in Quantitative Genomics and Genetics

BTRY 7210: Topics in Quantitative Genomics and Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics Jason Mezey Biological Statistics and Computational Biology (BSCB) Department of Genetic Medicine jgm45@cornell.edu February 12, 2015 Lecture 3:

More information

SNP Association Studies with Case-Parent Trios

SNP Association Studies with Case-Parent Trios SNP Association Studies with Case-Parent Trios Department of Biostatistics Johns Hopkins Bloomberg School of Public Health September 3, 2009 Population-based Association Studies Balding (2006). Nature

More information

COMBI - Combining high-dimensional classification and multiple hypotheses testing for the analysis of big data in genetics

COMBI - Combining high-dimensional classification and multiple hypotheses testing for the analysis of big data in genetics COMBI - Combining high-dimensional classification and multiple hypotheses testing for the analysis of big data in genetics Thorsten Dickhaus University of Bremen Institute for Statistics AG DANK Herbsttagung

More information

Populations in statistical genetics

Populations in statistical genetics Populations in statistical genetics What are they, and how can we infer them from whole genome data? Daniel Lawson Heilbronn Institute, University of Bristol www.paintmychromosomes.com Work with: January

More information

Humans have two copies of each chromosome. Inherited from mother and father. Genotyping technologies do not maintain the phase

Humans have two copies of each chromosome. Inherited from mother and father. Genotyping technologies do not maintain the phase Humans have two copies of each chromosome Inherited from mother and father. Genotyping technologies do not maintain the phase Genotyping technologies do not maintain the phase Recall that proximal SNPs

More information

Notes for MCTP Week 2, 2014

Notes for MCTP Week 2, 2014 Notes for MCTP Week 2, 2014 Lecture 1: Biological background Evolutionary biology and population genetics are highly interdisciplinary areas of research, with many contributions being made from mathematics,

More information

Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017

Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017 Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017 I. χ 2 or chi-square test Objectives: Compare how close an experimentally derived value agrees with an expected value. One method to

More information

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017 Lecture 2: Genetic Association Testing with Quantitative Traits Instructors: Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 29 Introduction to Quantitative Trait Mapping

More information

The Wright-Fisher Model and Genetic Drift

The Wright-Fisher Model and Genetic Drift The Wright-Fisher Model and Genetic Drift January 22, 2015 1 1 Hardy-Weinberg Equilibrium Our goal is to understand the dynamics of allele and genotype frequencies in an infinite, randomlymating population

More information

Weierstraß-Institut. für Angewandte Analysis und Stochastik. Leibniz-Institut im Forschungsverbund Berlin e. V. Preprint ISSN

Weierstraß-Institut. für Angewandte Analysis und Stochastik. Leibniz-Institut im Forschungsverbund Berlin e. V. Preprint ISSN Weierstraß-Institut für Angewandte Analysis und Stochastik Leibniz-Institut im Forschungsverbund Berlin e. V. Preprint ISSN 2198-5855 On an extended interpretation of linkage disequilibrium in genetic

More information

Association Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5

Association Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative

More information

Introduction to Natural Selection. Ryan Hernandez Tim O Connor

Introduction to Natural Selection. Ryan Hernandez Tim O Connor Introduction to Natural Selection Ryan Hernandez Tim O Connor 1 Goals Learn about the population genetics of natural selection How to write a simple simulation with natural selection 2 Basic Biology genome

More information

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions. A common problem of this type is concerned with determining

More information

Figure S1: The model underlying our inference of the age of ancient genomes

Figure S1: The model underlying our inference of the age of ancient genomes A genetic method for dating ancient genomes provides a direct estimate of human generation interval in the last 45,000 years Priya Moorjani, Sriram Sankararaman, Qiaomei Fu, Molly Przeworski, Nick Patterson,

More information

Classical Selection, Balancing Selection, and Neutral Mutations

Classical Selection, Balancing Selection, and Neutral Mutations Classical Selection, Balancing Selection, and Neutral Mutations Classical Selection Perspective of the Fate of Mutations All mutations are EITHER beneficial or deleterious o Beneficial mutations are selected

More information

Affected Sibling Pairs. Biostatistics 666

Affected Sibling Pairs. Biostatistics 666 Affected Sibling airs Biostatistics 666 Today Discussion of linkage analysis using affected sibling pairs Our exploration will include several components we have seen before: A simple disease model IBD

More information

Bayesian Inference of Interactions and Associations

Bayesian Inference of Interactions and Associations Bayesian Inference of Interactions and Associations Jun Liu Department of Statistics Harvard University http://www.fas.harvard.edu/~junliu Based on collaborations with Yu Zhang, Jing Zhang, Yuan Yuan,

More information

Problems for 3505 (2011)

Problems for 3505 (2011) Problems for 505 (2011) 1. In the simplex of genotype distributions x + y + z = 1, for two alleles, the Hardy- Weinberg distributions x = p 2, y = 2pq, z = q 2 (p + q = 1) are characterized by y 2 = 4xz.

More information

Major Genes, Polygenes, and

Major Genes, Polygenes, and Major Genes, Polygenes, and QTLs Major genes --- genes that have a significant effect on the phenotype Polygenes --- a general term of the genes of small effect that influence a trait QTL, quantitative

More information

Solutions to Problem Set 4

Solutions to Problem Set 4 Question 1 Solutions to 7.014 Problem Set 4 Because you have not read much scientific literature, you decide to study the genetics of garden peas. You have two pure breeding pea strains. One that is tall

More information

An Efficient and Accurate Graph-Based Approach to Detect Population Substructure

An Efficient and Accurate Graph-Based Approach to Detect Population Substructure An Efficient and Accurate Graph-Based Approach to Detect Population Substructure Srinath Sridhar, Satish Rao and Eran Halperin Abstract. Currently, large-scale projects are underway to perform whole genome

More information

Introduction to Statistical Genetics (BST227) Lecture 6: Population Substructure in Association Studies

Introduction to Statistical Genetics (BST227) Lecture 6: Population Substructure in Association Studies Introduction to Statistical Genetics (BST227) Lecture 6: Population Substructure in Association Studies Confounding in gene+c associa+on studies q What is it? q What is the effect? q How to detect it?

More information

Parts 2. Modeling chromosome segregation

Parts 2. Modeling chromosome segregation Genome 371, Autumn 2017 Quiz Section 2 Meiosis Goals: To increase your familiarity with the molecular control of meiosis, outcomes of meiosis, and the important role of crossing over in generating genetic

More information

(Write your name on every page. One point will be deducted for every page without your name!)

(Write your name on every page. One point will be deducted for every page without your name!) POPULATION GENETICS AND MICROEVOLUTIONARY THEORY FINAL EXAMINATION (Write your name on every page. One point will be deducted for every page without your name!) 1. Briefly define (5 points each): a) Average

More information

Lecture 13: Population Structure. October 8, 2012

Lecture 13: Population Structure. October 8, 2012 Lecture 13: Population Structure October 8, 2012 Last Time Effective population size calculations Historical importance of drift: shifting balance or noise? Population structure Today Course feedback The

More information

A consideration of the chi-square test of Hardy-Weinberg equilibrium in a non-multinomial situation

A consideration of the chi-square test of Hardy-Weinberg equilibrium in a non-multinomial situation Ann. Hum. Genet., Lond. (1975), 39, 141 Printed in Great Britain 141 A consideration of the chi-square test of Hardy-Weinberg equilibrium in a non-multinomial situation BY CHARLES F. SING AND EDWARD D.

More information

8. Genetic Diversity

8. Genetic Diversity 8. Genetic Diversity Many ways to measure the diversity of a population: For any measure of diversity, we expect an estimate to be: when only one kind of object is present; low when >1 kind of objects

More information

Lab 12. Linkage Disequilibrium. November 28, 2012

Lab 12. Linkage Disequilibrium. November 28, 2012 Lab 12. Linkage Disequilibrium November 28, 2012 Goals 1. Es

More information

Backward Genotype-Trait Association. in Case-Control Designs

Backward Genotype-Trait Association. in Case-Control Designs Backward Genotype-Trait Association (BGTA)-Based Dissection of Complex Traits in Case-Control Designs Tian Zheng, Hui Wang and Shaw-Hwa Lo Department of Statistics, Columbia University, New York, New York,

More information

Introduction to Analysis of Genomic Data Using R Lecture 6: Review Statistics (Part II)

Introduction to Analysis of Genomic Data Using R Lecture 6: Review Statistics (Part II) 1/45 Introduction to Analysis of Genomic Data Using R Lecture 6: Review Statistics (Part II) Dr. Yen-Yi Ho (hoyen@stat.sc.edu) Feb 9, 2018 2/45 Objectives of Lecture 6 Association between Variables Goodness

More information

Asymptotic distribution of the largest eigenvalue with application to genetic data

Asymptotic distribution of the largest eigenvalue with application to genetic data Asymptotic distribution of the largest eigenvalue with application to genetic data Chong Wu University of Minnesota September 30, 2016 T32 Journal Club Chong Wu 1 / 25 Table of Contents 1 Background Gene-gene

More information

EM algorithm. Rather than jumping into the details of the particular EM algorithm, we ll look at a simpler example to get the idea of how it works

EM algorithm. Rather than jumping into the details of the particular EM algorithm, we ll look at a simpler example to get the idea of how it works EM algorithm The example in the book for doing the EM algorithm is rather difficult, and was not available in software at the time that the authors wrote the book, but they implemented a SAS macro to implement

More information

Mathematical models in population genetics II

Mathematical models in population genetics II Mathematical models in population genetics II Anand Bhaskar Evolutionary Biology and Theory of Computing Bootcamp January 1, 014 Quick recap Large discrete-time randomly mating Wright-Fisher population

More information

STT 843 Key to Homework 1 Spring 2018

STT 843 Key to Homework 1 Spring 2018 STT 843 Key to Homework Spring 208 Due date: Feb 4, 208 42 (a Because σ = 2, σ 22 = and ρ 2 = 05, we have σ 2 = ρ 2 σ σ22 = 2/2 Then, the mean and covariance of the bivariate normal is µ = ( 0 2 and Σ

More information

Population genetics snippets for genepop

Population genetics snippets for genepop Population genetics snippets for genepop Peter Beerli August 0, 205 Contents 0.Basics 0.2Exact test 2 0.Fixation indices 4 0.4Isolation by Distance 5 0.5Further Reading 8 0.6References 8 0.7Disclaimer

More information

Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency

Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency Bruce Walsh lecture notes Introduction to Quantitative Genetics SISG, Seattle 16 18 July 2018 1 Outline Genetics of complex

More information

Outline of lectures 3-6

Outline of lectures 3-6 GENOME 453 J. Felsenstein Evolutionary Genetics Autumn, 007 Population genetics Outline of lectures 3-6 1. We want to know what theory says about the reproduction of genotypes in a population. This results

More information

F SR = (H R H S)/H R. Frequency of A Frequency of a Population Population

F SR = (H R H S)/H R. Frequency of A Frequency of a Population Population Hierarchical structure, F-statistics, Wahlund effect, Inbreeding, Inbreeding coefficient Genetic difference: the difference of allele frequencies among the subpopulations Hierarchical population structure

More information

EXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series:

EXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series: Statistical Genetics Agronomy 65 W. E. Nyquist March 004 EXERCISES FOR CHAPTER 7 Exercise 7.. Derive the two scales of relation for each of the two following recurrent series: u: 0, 8, 6, 48, 46,L 36 7

More information

Outline of lectures 3-6

Outline of lectures 3-6 GENOME 453 J. Felsenstein Evolutionary Genetics Autumn, 009 Population genetics Outline of lectures 3-6 1. We want to know what theory says about the reproduction of genotypes in a population. This results

More information

Supporting Information

Supporting Information Supporting Information Hammer et al. 10.1073/pnas.1109300108 SI Materials and Methods Two-Population Model. Estimating demographic parameters. For each pair of sub-saharan African populations we consider

More information

Parts 2. Modeling chromosome segregation

Parts 2. Modeling chromosome segregation Genome 371, Autumn 2018 Quiz Section 2 Meiosis Goals: To increase your familiarity with the molecular control of meiosis, outcomes of meiosis, and the important role of crossing over in generating genetic

More information

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB

Quantitative Genomics and Genetics BTRY 4830/6830; PBSB Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 20: Epistasis and Alternative Tests in GWAS Jason Mezey jgm45@cornell.edu April 16, 2016 (Th) 8:40-9:55 None Announcements Summary

More information

Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies

Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies Probability of Detecting Disease-Associated SNPs in Case-Control Genome-Wide Association Studies Ruth Pfeiffer, Ph.D. Mitchell Gail Biostatistics Branch Division of Cancer Epidemiology&Genetics National

More information

Case Studies in Ecology and Evolution

Case Studies in Ecology and Evolution 3 Non-random mating, Inbreeding and Population Structure. Jewelweed, Impatiens capensis, is a common woodland flower in the Eastern US. You may have seen the swollen seed pods that explosively pop when

More information