For 5% confidence χ 2 with 1 degree of freedom should exceed 3.841, so there is clear evidence for disequilibrium between S and M.

Similar documents
1.5.1 ESTIMATION OF HAPLOTYPE FREQUENCIES:

Population Genetics. with implications for Linkage Disequilibrium. Chiara Sabatti, Human Genetics 6357a Gonda

Statistical Genetics I: STAT/BIOST 550 Spring Quarter, 2014

EXERCISES FOR CHAPTER 3. Exercise 3.2. Why is the random mating theorem so important?

Linkage and Linkage Disequilibrium

2. Map genetic distance between markers

theta H H H H H H H H H H H K K K K K K K K K K centimorgans

1 Springer. Nan M. Laird Christoph Lange. The Fundamentals of Modern Statistical Genetics

Lecture 9. QTL Mapping 2: Outbred Populations

Affected Sibling Pairs. Biostatistics 666

Solutions to Problem Set 4

6.6 Meiosis and Genetic Variation. KEY CONCEPT Independent assortment and crossing over during meiosis result in genetic diversity.

Chapter 6 Linkage Disequilibrium & Gene Mapping (Recombination)

Calculation of IBD probabilities

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin

Linkage Mapping. Reading: Mather K (1951) The measurement of linkage in heredity. 2nd Ed. John Wiley and Sons, New York. Chapters 5 and 6.

Tutorial Session 2. MCMC for the analysis of genetic data on pedigrees:

Linkage and Chromosome Mapping

Modeling IBD for Pairs of Relatives. Biostatistics 666 Lecture 17

The E-M Algorithm in Genetics. Biostatistics 666 Lecture 8

1. Understand the methods for analyzing population structure in genomes

Unit 6 Reading Guide: PART I Biology Part I Due: Monday/Tuesday, February 5 th /6 th

Objectives. Announcements. Comparison of mitosis and meiosis

BS 50 Genetics and Genomics Week of Oct 3 Additional Practice Problems for Section. A/a ; B/B ; d/d X A/a ; b/b ; D/d

Calculation of IBD probabilities

Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency

Linkage and Chromosome Mapping

Mapping QTL to a phylogenetic tree

Prediction of the Confidence Interval of Quantitative Trait Loci Location

Case-Control Association Testing. Case-Control Association Testing

UNIT 8 BIOLOGY: Meiosis and Heredity Page 148

Linkage analysis and QTL mapping in autotetraploid species. Christine Hackett Biomathematics and Statistics Scotland Dundee DD2 5DA

Use of hidden Markov models for QTL mapping

Testing for Homogeneity in Genetic Linkage Analysis

Stationary Distribution of the Linkage Disequilibrium Coefficient r 2

Homework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:

Lecture 28 Chi-Square Analysis

Genetics (patterns of inheritance)

The Quantitative TDT

Outline. P o purple % x white & white % x purple& F 1 all purple all purple. F purple, 224 white 781 purple, 263 white

Legend: S spotted Genotypes: P1 SS & ss F1 Ss ss plain F2 (with ratio) 1SS :2 WSs: 1ss. Legend W white White bull 1 Ww red cows ww ww red

STAT 536: Genetic Statistics

Parts 2. Modeling chromosome segregation

Lecture 6. QTL Mapping

Statistical issues in QTL mapping in mice

The Lander-Green Algorithm. Biostatistics 666 Lecture 22

Introduction to Linkage Disequilibrium

Introduction to QTL mapping in model organisms

Variance Component Models for Quantitative Traits. Biostatistics 666

Overview. Background

Introduction to Genetics

Parts 2. Modeling chromosome segregation

Unit 3 Test 2 Study Guide

Outline for today s lecture (Ch. 14, Part I)

Heredity and Genetics WKSH

The universal validity of the possible triangle constraint for Affected-Sib-Pairs

Evolutionary Genetics Midterm 2008

EXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series:

Learning gene regulatory networks Statistical methods for haplotype inference Part I

MCMC IN THE ANALYSIS OF GENETIC DATA ON PEDIGREES

1 Errors in mitosis and meiosis can result in chromosomal abnormalities.

Observing Patterns in Inherited Traits

Lecture 2: Genetic Association Testing with Quantitative Traits. Summer Institute in Statistical Genetics 2017

Chapter 10 Sexual Reproduction and Genetics

p(d g A,g B )p(g B ), g B

Problems for 3505 (2011)

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms

When one gene is wild type and the other mutant:

Advance Organizer. Topic: Mendelian Genetics and Meiosis

Mathematical models in population genetics II

On Computation of P-values in Parametric Linkage Analysis

Introduction to Genetics

SNP-SNP Interactions in Case-Parent Trios

Lesson 4: Understanding Genetics

Meiosis -> Inheritance. How do the events of Meiosis predict patterns of heritable variation?

Evolution of phenotypic traits

Runaway. demogenetic model for sexual selection. Louise Chevalier. Jacques Labonne

STAT 536: Genetic Statistics

Unit 7 Genetics. Meiosis

Expression QTLs and Mapping of Complex Trait Loci. Paul Schliekelman Statistics Department University of Georgia

Family resemblance can be striking!

BIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS

Name Class Date. KEY CONCEPT Gametes have half the number of chromosomes that body cells have.

Section 11 1 The Work of Gregor Mendel

Lecture 1: Case-Control Association Testing. Summer Institute in Statistical Genetics 2015

Population Genetics II (Selection + Haplotype analyses)

Keys to Success on the Quarter 3 EXAM

Ch 11.Introduction to Genetics.Biology.Landis

Discrete Multivariate Statistics

Labs 7 and 8: Mitosis, Meiosis, Gametes and Genetics

Advanced Algorithms and Models for Computational Biology -- a machine learning approach

Linear Regression (1/1/17)

Phasing via the Expectation Maximization (EM) Algorithm

Introduction to Genetics

Ch 11.4, 11.5, and 14.1 Review. Game

Chapter 13 Meiosis and Sexual Reproduction

Life Cycles, Meiosis and Genetic Variability24/02/2015 2:26 PM

Rebops. Your Rebop Traits Alternative forms. Procedure (work in pairs):

A. Correct! Genetically a female is XX, and has 22 pairs of autosomes.

Transcription:

STAT 550 Howework 6 Anton Amirov 1. This question relates to the same study you saw in Homework-4, by Dr. Arno Motulsky and coworkers, and published in Thompson et al. (1988; Am.J.Hum.Genet, 42, 113-124). There were three population samples (all from around Seattle), (Caucasian, African American, and Japanese American), and three tightly linked diallelic loci, designated M, P and S. In the Japanese subsample, the 68 individuals (136 chromosomes) were scored from all three loci. As in the Homework-4 question, there were few double-heterozygotes for any pair of the loci. So you can treat the estimated sample haplotype frequencies as though they were observed. (a) For loci S and M, the estimated haplotype frequencies are 0.551, 0.082, 0.008, 0.360. Is there evidence for disequilibrium between loci S and M? In 2x2 contingency table (q χ 2 11 q 22 q 12 q 21 ) 2 = n (q 11 + q 12 )(q 11 + q 21 )(q 12 + q 22 )(q 21 + q 22 ) = 92.36 For 5% confidence χ 2 with 1 degree of freedom should exceed 3.841, so there is clear evidence for disequilibrium between S and M. (b) For loci P and M, the estimated haplotype frequencies are 0.514, 0.398, 0.045, 0.043. Is there evidence for disequilibrium between loci P and M? By the same formula χ 2 = 0.12 It is too small to infer disequilibrium. (c) For loci S and P, the estimated haplotype frequencies are 0.592, 0.041, 0.320 0.047. Is there evidence for disequilibrium between loci S and P? χ 2 = 1.57 Again, it is not enough to infer disequlibrium.

2. In a simple backcross experiment between two inbred lines, hybrid AB/ab individuals are crossed with ab/ab individuals, and the numbers of recombinant offspring are counted. Among a total of 120 offspring in which the hybrid individual was female, 50 were of the recombinant types. Among 90 offspring in which the hybrid individual was male, 38 were of the recombinant types. (a) Is there evidence for linkage, using only the data on the offspring of hybrid males? For males ρ has MLE of 38/90=0.4222, and its base-10 lod score is lod ρ = t log ρ + n t log 1 ρ + n log 2 = 0.4748 This score is too low to infer linkage (b) Is there evidence for linkage, using only the data on the offspring of hybrid females? For females ρ has MLE of 50/120=0.4166, and its base-10 lod score is lod ρ = t log ρ + n t log 1 ρ + n log 2 = 0.7272 This score is too low to infer linkage (c) Assuming male and female recombination frequencies are the same, is there evidence for linkage? For both ρ has MLE of 88/210=0.4190, and its base-10 lod score is lod ρ = t log ρ + n t log 1 ρ + n log 2 = 1.2006 This score is still too low to infer linkage (d) Is there evidence from this experiment that male and female recombination frequencies differ? There is almost no difference in MLE for ρ between males and females, so nothing suggests that recombination frequencies differ. Difference in lod scores is explained by the fact that there is more data on females than on males.

3. A couple, each of whom is unaffected, have two kids affected by a recessive disease. At a linked marker, the parents have four distinct alleles. One parent is ab, and the other is cd. Both the affected kids are bd at the marker, and the recombination probability between trait and marker is r. A fetus is typed as bc. Show that the probability the baby will be affected is r(1-r)((1-r) 4 + r(1-r) R + r 4 )/R 2 where R = + (1-r) 2. Both children should be either recombinant or non recombinant is meioses from both parents. So the probability that they are both recombinant in meiosis To be affected, for meioses from parent ab fetus need also to be recombinant if children are recombinant and non-recombinant otherwise, probability of that event is r + 1 + 1 1 r = r3 + 1 r 3 For meioses from parent cd fetus need also to be non-recombinant if children are recombinant and recombinant otherwise, probability is + 1 (1 r) + 1 + 1 r = r(1 r) So the probability that fetus is affected is the product of those probabilities (as events are independent) and is r 3 + 1 r 3 We can see that r(1 r) = r 1 r r3 + 1 r 3 + 1 2 = r 1 r r3 + 1 r 3 R 2 (1 r) 4 + r 1 r R + r 4 = (1 r) 4 + r 3 1 r + r 1 r 3 + r 4 = r 3 1 r + r + 1 r 3 1 r + r = r 3 + 1 r 3 Therefore probability that fetus is affected can also be written as r 1 r (1 r)4 + r 1 r R + r 4 R 2

4. Suppose we sample families with two offspring, in which one parent is heterozygous for a dominant genetic disease allele, and marker type ab, while the unaffected spouse has marker type aa. The familes are divided into two groups: Group 1: the two offspring share the same disease status, or the same marker type, but not both Group II: all other combinations. Suppose the recombination frequency between the disease locus and the marker locus is r. (a) Show that the probability a given family is of Group-I type is s = 2 r (1-r). If both children have the same marker (probability ½) to have different disease status one child needs to recombinant in meiosis from the affected parent, and another one non-recombinant (probability 2r(1-r)). If they have different markers (probability ½) to have the same disease status again one child has to be recombinant and another one -non-recombinant. So the total probability is 1 2 2r 1 r + 1 2r 1 r = 2r(1 r) 2 (b) Suppose n families are sampled, and k are of Group-I type. Show that the log-likelihood is, up to an additive constant, k log s + (n-k) log (1-s) or k log(2 r (1-r )) + (n-k) log( 1-2 r (1-r )) Probabilty of such event is proportional to 2r 1 r k Therefore logarithm of probability, up to a constant is log 2r 1 r k 1 2r 1 r n k 1 2r 1 r n k = k log 2r 1 r + n k log 1 2r 1 r = k log s + n k log(1 s)

(c) Show that, provided k is not bigger than n/2 the maximum likelihood estimate of r is (1 - sqrt(1-2k/n))/2. It easy to see that s=k/n maximizes k log s + n k log(1 s): at that point d ds k log s + n k log(1 s) = k s n k 1 s = n n = 0 d 2 dx 2 k log s + n k log(1 s) = k s 2 n k (1 s) 2 < 0 So the log likelihood is optimized if s = 2r 1 r = k n or r + k 2n = 0 r = 1 2 1 ± 1 2k n Since recombination probability is always <1/2, the only possible value is r = 1 2 1 1 2k n (d) It can be shown that the Fisher information for estimating r is ( 2 n (1-2 r) 2 )/ (r (1-r) (1-2 r (1-r)) ) You do not need to show this. What does this tell you about estimating r, when in fact r is close to 1/2? As r approaches ½ Fisher information approaches 0, therefore it becomes more difficult to estimate r, i.e. more experimental data (larger n) is required to make estimation of the same accuracy (or to infer linkage).