Linkage Mapping. Reading: Mather K (1951) The measurement of linkage in heredity. 2nd Ed. John Wiley and Sons, New York. Chapters 5 and 6.


Genetic maps

The relative positions of genes on a chromosome can be determined by conducting linkage analysis. As an example, Hedges et al. (1990) crossed the soybean line Peking, which is homozygous for dominant alleles at genes Rj1 (restricts nodulation with certain strains of Bradyrhizobium) and F (confers the fasciated stem and leaf phenotype), to BARC-1, which is homozygous recessive for both genes. The two lines also differed for alleles at the codominant isozyme locus Idh1 (isocitrate dehydrogenase). The triply heterozygous F1 plant was selfed to form an F2 population of 250 plants, each of which was scored for the three phenotypes. Development of a genetic map from this type of experiment follows three phases: first, linkage is detected or loci are declared unlinked; second, recombination frequencies between each pair of loci are estimated; and third, the loci are ordered into a linear map.

Detection of linkage

Are two genes linked? Deviations from the segregation ratios expected under no linkage are evidence for linkage. But in any sampled population, we know that there will be some deviation from expectation simply due to random chance. How do we tell the difference between random deviations from expected segregation ratios and deviations due to linkage? The chi-square (χ²) test provides a statistical basis for making this determination. We know the expected segregation ratio under the null hypothesis of no linkage between two genes, and for an observed segregation ratio, we can compute the probability of observing a deviation at least that large if the null hypothesis were really true. Generally, we consider data that would be observed with 5% or lower probability under the null hypothesis as evidence that the null hypothesis is false. The data collected by Hedges et al. (1990) on the segregation of the F and Rj1 genes are given in Table 1. For now, we ignore the Idh1 gene.

Table 1. Observed phenotypic segregation in an F2 population developed from the soybean cross Peking (FFRj1Rj1) × BARC-1 (ffrj1rj1).

Phenotypic class    Number observed
F_Rj1_              144
F_rj1rj1             44
ffRj1_               39
ffrj1rj1             23
Total               250

First, we check that the expected single-gene ratios were observed. For a single dominant gene in an F2 population, what is the expected phenotypic segregation ratio? Pooling data for F over the two Rj1 classes, we find 188 F_ and 62 ff phenotypes. The expected numbers based on the null hypothesis are 187.5 and 62.5; since the observed counts are as close to expectation as whole numbers allow, we won't bother performing a statistical test. Pooling data for Rj1 over the two F classes, we find 183 Rj1_ and 67 rj1rj1 phenotypes. To perform the χ² test, we compute the difference between expected and observed numbers in each class, square the difference and divide by the expected number, then sum over the two classes to obtain the χ² statistic (Table 2).

Table 2. Chi-square test for single-gene segregation of Rj1 phenotype.

Phenotypic class    Number observed    Number expected    (E-O)²/E
Rj1_                183                187.5              0.108
rj1rj1               67                 62.5              0.324
Total               250                250                0.432

In this example, the χ² statistic is 0.432. This statistic has one degree of freedom because there are two classes, so a plant that is not in one class can only be in the other. Looking up the threshold χ² value for one d.f. (e.g., Appendix A.5 in Steel and Torrie), we find that a value of 3.84 or greater will occur 5% of the time or less if the null hypothesis is true. If the null hypothesis is true, then we expect to observe a value of 0.432 or greater about 50% of the time. Thus, there is no evidence that the null hypothesis is wrong; the Rj1 phenotype segregates as a single gene.

Now that we are satisfied that the Rj1 and F phenotypes each segregate as single genes, we can ask if they are linked. Under the null hypothesis of no linkage, what is the expected segregation ratio for two dominant genes in an F2 population? We compute the χ² statistic based on this expectation (Table 3).

Table 3. Chi-square test for linkage of F and Rj1 genes.

Phenotypic class    Number observed    Number expected    (E-O)²/E
F_Rj1_              144                140.625            0.081
F_rj1rj1             44                 46.875            0.176
ffRj1_               39                 46.875            1.323
ffrj1rj1             23                 15.625            3.481
Total               250                250                5.061

What is the threshold χ² value for the test of no linkage? It depends on the degrees of freedom associated with the test. This is tricky: the overall deviation from the 9:3:3:1 segregation ratio has d.f. = 3 because there are four classes. But notice that deviations from this expectation could occur for three different reasons: deviation from 3:1 segregation at the F locus, deviation from 3:1 segregation at the Rj1 locus, or deviation from independent segregation between the two genes. The contribution of each source of variation to the total χ² value is additive. Thus, we can obtain the single degree of freedom χ² value for deviation due to linkage alone by subtracting the two χ² values for single-locus deviations from the overall χ² value: 5.061 - 0.432 - 0.005 = 4.624 (0.005 being the χ² value for deviations at the F locus). Notice that a value with 3 d.f. minus two values with one d.f. each leaves a value with one d.f. So we use the threshold χ² value for one d.f., which is 3.84. Since the value for linkage is greater than the threshold value, we interpret this to mean that if the two genes were really unlinked, we would observe such a result 5% of the time or less. So, our conclusion is that the genes are not unlinked; thus they are linked. Similar χ² tests reveal that the Idh1 locus is linked to both the F and the Rj1 loci. Having determined that the genes are linked, we now must estimate the recombination frequency.

Estimating recombination frequency

The recombination frequency is the number of recombinant gametes divided by the total number of gametes. Consider the data in Tables 1 and 3 on the frequency of F Rj1 phenotypes: which classes represent recombinant gametes, and which represent parental gamete types? Obviously, the ffrj1rj1 class is generated by the union of two nonrecombinant gametes. But what about the F_rj1rj1 class? Obviously, at least one recombinant F-rj1 gamete went into each plant in this class; but notice that these plants could be created from the union of two recombinant gametes (F-rj1 + F-rj1 = FFrj1rj1) or from the union of one recombinant gamete and one nonrecombinant gamete (F-rj1 + f-rj1 = Ffrj1rj1). Without performing progeny tests, we can't distinguish these possibilities, so we cannot simply count the number of recombinant gametes in an F2 population, as we can in a backcross population.
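The chi-square partition used above to detect linkage can be checked with a few lines of Python. This is only a sketch: the counts are those of Table 1, while the dictionary keys and the helper name `chi_square` are illustrative choices, not anything taken from Hedges et al. (1990).

```python
# Observed F2 counts from Table 1 (class labels are illustrative keys).
observed = {"F_Rj1_": 144, "F_rj1rj1": 44, "ffRj1_": 39, "ffrj1rj1": 23}
n = sum(observed.values())  # 250 plants


def chi_square(obs, exp):
    """Sum of (O - E)^2 / E over phenotypic classes."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


# Single-locus tests against the 3:1 expectation.
f_dom = observed["F_Rj1_"] + observed["F_rj1rj1"]   # F_ plants
rj_dom = observed["F_Rj1_"] + observed["ffRj1_"]    # Rj1_ plants
expected_3_1 = [0.75 * n, 0.25 * n]
chi_f = chi_square([f_dom, n - f_dom], expected_3_1)
chi_rj1 = chi_square([rj_dom, n - rj_dom], expected_3_1)

# Overall test against 9:3:3:1 (3 d.f.), then the 1 d.f. remainder for linkage.
expected_9331 = [n * k / 16 for k in (9, 3, 3, 1)]
chi_total = chi_square(list(observed.values()), expected_9331)
chi_linkage = chi_total - chi_f - chi_rj1

print(f"F locus: {chi_f:.3f}, Rj1 locus: {chi_rj1:.3f}")
print(f"total: {chi_total:.3f}, linkage: {chi_linkage:.3f}")
```

The linkage component is then compared with the one d.f. threshold of 3.84 quoted in the text.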
Being unable to count the number of recombinant gametes, how can we possibly estimate the recombination frequency? The most common approach in this case is the method of maximum likelihood. This is a general estimation procedure that is used in many areas of statistics, not only in linkage mapping. The basic idea is that, if we have a model that relates underlying factors to the observed data, then we can compute the probability of observing the data given a specific model. The probability of observing the data given the model is proportional to the likelihood of the model given the observed data. Thus, we can compute likelihoods for different models given the data and choose the model that is more likely as the better model.

The idea is quite simple, but there is one major difficulty that is often encountered: there may be an infinite number of possible models, and without being able to compute likelihoods of all possible models, how can you be sure that you choose the most likely model (the maximum likelihood model)? For any particular application of the maximum likelihood method, therefore, batteries of statistical and mathematical tricks have been developed to make it possible to obtain the maximum likelihood estimate with some degree of assurance. For simple models, such as two-point recombination frequency estimation, it is quite certain that one will obtain the maximum likelihood result using some simple tricks. For more complex procedures, including multi-point linkage mapping with many loci simultaneously, it is often not guaranteed that one will obtain the maximum likelihood estimate.

To illustrate the procedure, we will work through maximum likelihood estimation of the two-point recombination frequency between the F and Rj1 loci. First, we need a general model for how the data could be affected by recombination between the two loci. To do this, we need to compute the probabilities of obtaining the observed phenotypes given some degree of linkage. Starting with the doubly heterozygous F1 plant, and knowing that the two genes are in coupling phase (the dominant alleles at both loci came from the same parent), we compute the probability of each possible two-locus gamete produced by the F1 parent. There are two possibilities for recombination events: (1) recombination occurs between the two genes in the F1, with a probability of r, the recombination frequency that we want to estimate; or (2) recombination does not occur, with probability 1 - r. If recombination does occur, then two types of gametes are produced, F-rj1 and f-Rj1. These occur with equal frequency, so the probability of an F-rj1 gamete is (½)r, and the probability of an f-Rj1 gamete is the same, (½)r. In the same way, the probabilities of the nonrecombinant gametes, F-Rj1 and f-rj1, are both (½)(1 - r). Knowing the probabilities of the gametes for any arbitrary recombination frequency, r, allows us to compute the probabilities of the 9 possible F2 genotypic classes, using a Punnett square (Table 4).

Table 4. Punnett square used to determine probability of F2 genotypes based on gamete probabilities from F1.

Each cell gives the F2 genotype, its phenotypic class in brackets, and its probability.

                    Female gametes
Male gametes        F-Rj1 (½)(1-r)      F-rj1 (½)r          f-Rj1 (½)r          f-rj1 (½)(1-r)
F-Rj1 (½)(1-r)      FFRj1Rj1 [A]        FFRj1rj1 [A]        FfRj1Rj1 [A]        FfRj1rj1 [A]
                    (1/4)(1-r)²         (1/4)r(1-r)         (1/4)r(1-r)         (1/4)(1-r)²
F-rj1 (½)r          FFRj1rj1 [A]        FFrj1rj1 [B]        FfRj1rj1 [A]        Ffrj1rj1 [B]
                    (1/4)r(1-r)         (1/4)r²             (1/4)r²             (1/4)r(1-r)
f-Rj1 (½)r          FfRj1Rj1 [A]        FfRj1rj1 [A]        ffRj1Rj1 [C]        ffRj1rj1 [C]
                    (1/4)r(1-r)         (1/4)r²             (1/4)r²             (1/4)r(1-r)
f-rj1 (½)(1-r)      FfRj1rj1 [A]        Ffrj1rj1 [B]        ffRj1rj1 [C]        ffrj1rj1 [D]
                    (1/4)(1-r)²         (1/4)r(1-r)         (1/4)r(1-r)         (1/4)(1-r)²

Summing over all 16 boxes gives a total probability of one. Since we can only observe four phenotypic classes (A - D), we use the data from Table 4 to determine the probability of each of the four phenotypic classes. For example, the F_rj1rj1 phenotypic class is composed of genotypes FFrj1rj1 (which occurs with probability (1/4)r²) and Ffrj1rj1 (which occurs with probability (1/4)r(1-r) + (1/4)r(1-r) = (½)r(1-r)). Therefore, the probability of the F_rj1rj1 class is (1/4)r² + (½)r(1-r) = (1/4)r² + (½)r - (½)r² = (½)r - (1/4)r². The probabilities of the other three classes are worked out similarly (Table 5).
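The Punnett-square bookkeeping can also be done by brute force. The following sketch enumerates all 16 gamete unions for a coupling-phase F1 and sums the probabilities by phenotypic class; the function names and the tuple representation of gametes are assumptions made for illustration.

```python
from itertools import product


def gamete_probs(r):
    """Coupling-phase F1 gametes as (F allele, Rj1 allele) -> probability."""
    return {
        ("F", "Rj1"): 0.5 * (1 - r),  # nonrecombinant
        ("F", "rj1"): 0.5 * r,        # recombinant
        ("f", "Rj1"): 0.5 * r,        # recombinant
        ("f", "rj1"): 0.5 * (1 - r),  # nonrecombinant
    }


def phenotype_class(g1, g2):
    """Class A-D of the plant formed by gametes g1 and g2 (both genes dominant)."""
    has_f = "F" in (g1[0], g2[0])
    has_rj1 = "Rj1" in (g1[1], g2[1])
    if has_f and has_rj1:
        return "A"
    if has_f:
        return "B"
    if has_rj1:
        return "C"
    return "D"


def class_probs(r):
    """Sum the 16 Punnett-square cells into the four phenotypic classes."""
    totals = {"A": 0.0, "B": 0.0, "C": 0.0, "D": 0.0}
    gametes = gamete_probs(r)
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        totals[phenotype_class(g1, g2)] += p1 * p2
    return totals


probs = class_probs(0.2)  # r = 0.2 is an arbitrary illustrative value
print(probs)
```

For any r the four totals should agree with the closed forms derived in the text, for example (½)r - (1/4)r² for class B.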

Table 5. Probabilities of four F2 phenotypic classes based on two dominant genes.

Phenotypic class    Genotypes                                 Probability
A (F_Rj1_)          FFRj1Rj1, FfRj1Rj1, FFRj1rj1, FfRj1rj1    3/4 - (½)r + (1/4)r²
B (F_rj1rj1)        FFrj1rj1, Ffrj1rj1                        (½)r - (1/4)r²
C (ffRj1_)          ffRj1Rj1, ffRj1rj1                        (½)r - (1/4)r²
D (ffrj1rj1)        ffrj1rj1                                  (1/4)(1 - r)²
Total                                                         1

If we observe one F2 progeny from this cross, the probability of its being one of the four phenotypic classes is given in Table 5. What if we observe two F2 progeny? What is the joint probability that both are phenotypic class A? That is the product of the probabilities of each one being phenotypic class A, because the progeny are independent. Similarly, the probability that one progeny is class A and the other is class B is the product of the probabilities that the first is A and the second is B plus the product of the probabilities that the first is B and the second is A. We can compute the joint probability for any sample of n total F2 progeny with a, b, c, and d numbers in each progeny class using the multinomial probability expression:

$$L = \frac{n!}{a!\,b!\,c!\,d!}\left(\tfrac{3}{4} - \tfrac{1}{2}r + \tfrac{1}{4}r^2\right)^{a}\left(\tfrac{1}{2}r - \tfrac{1}{4}r^2\right)^{b}\left(\tfrac{1}{2}r - \tfrac{1}{4}r^2\right)^{c}\left(\tfrac{1}{4}(1-r)^2\right)^{d}$$

which is simply a special case of the general multinomial likelihood formula:

$$L = \frac{n!}{a_1!\,a_2!\cdots a_t!}\,(m_1)^{a_1}(m_2)^{a_2}\cdots(m_t)^{a_t}$$

where n = total number of progeny observed, t = total number of progeny classes possible, a_i = number of progeny in class i observed, and m_i = expected proportion in class i. This formula can be used for any generation or mating scheme used for linkage analysis. You can see from the formula that n, a, b, c, and d are all observed quantities, and only r is unknown. By plugging in different values for r, one can compute the likelihood of observing the data for different values of r.
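As a minimal sketch of "plugging in different values for r", the code below evaluates the log of the multinomial likelihood (without the factorial term, which does not depend on r) on a grid of r values and keeps the best one. The grid spacing of 0.001 and the names `log_likelihood` and `grid` are arbitrary choices for illustration.

```python
import math

counts = (144, 44, 39, 23)  # a, b, c, d from Table 1


def log_likelihood(r, counts):
    """Log-likelihood of r for F2 data with two dominant genes in coupling.

    The constant ln(n!/(a! b! c! d!)) is omitted because it does not
    change which r gives the maximum.
    """
    a, b, c, d = counts
    p_a = 0.75 - 0.5 * r + 0.25 * r ** 2
    p_bc = 0.5 * r - 0.25 * r ** 2      # same probability for classes B and C
    p_d = 0.25 * (1 - r) ** 2
    return a * math.log(p_a) + (b + c) * math.log(p_bc) + d * math.log(p_d)


# Scan r = 0.001, 0.002, ..., 0.500 and keep the value with the highest likelihood.
grid = [i / 1000 for i in range(1, 501)]
r_hat = max(grid, key=lambda r: log_likelihood(r, counts))
print(f"grid-search estimate of r: {r_hat:.3f}")
```

A finer grid, or a one-dimensional optimizer, would refine the estimate; the point of the sketch is only that the likelihood can be evaluated directly for any trial r.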
So, one way to obtain a maximum likelihood estimate of recombination frequency is to compute the likelihood for a range of values of r between 0 and 0.5 and choose the value with the highest likelihood as the maximum likelihood estimate of recombination frequency. This is a numerical solution.

Another way to obtain the maximum likelihood estimate is to consider the problem in terms of calculus. One can imagine plotting likelihood values on the Y-axis for recombination values on the X-axis, and obtaining a curve representing likelihood as a function of recombination frequency. Recall from calculus that the derivative of this function with respect to recombination frequency (r) represents the slope of the curve. The slope of the curve is zero where the curve peaks (or where the curve hits a nadir).

Thus, points at which the derivative is zero represent either maximum or minimum points of the likelihood curve. So, another way to find the maximum likelihood point without actually graphing the likelihood curve is to take the derivative of the likelihood function given above with respect to r, set this derivative equal to zero, and solve for r. Remember that if multiple solutions are obtained, they may be maxima or minima in theory, but in practice there is often only a single solution, or if there are multiple solutions, only one lies within the parameter space between 0 and 0.5. From this point on, it is simply a matter of mathematics to obtain the correct answer.

In this example, an additional math trick is used to simplify things. Differentiation of the likelihood function itself is quite difficult because of the powers involved, so usually we take the natural log of the likelihood function and differentiate that instead. The resulting equation is easier to differentiate, and because the log is a monotonically increasing function, the maximum point of a function is also the maximum point of the log of the function. A further trick is to substitute P for (1 - r)², take the derivative with respect to P, solve for the maximum likelihood estimate of P, and then convert that estimate into an estimate of r. The log of the likelihood function is:

$$\ln L = \ln\frac{n!}{a!\,b!\,c!\,d!} + a\ln\frac{2+P}{4} + b\ln\frac{1-P}{4} + c\ln\frac{1-P}{4} + d\ln\frac{P}{4}$$

The derivative of the log function with respect to P is set equal to zero:

$$\frac{\partial \ln L}{\partial P} = \frac{a}{2+P} - \frac{b}{1-P} - \frac{c}{1-P} + \frac{d}{P} = 0$$

Multiplying both sides of the equation by the common denominator (2+P)(1-P)P gives:

$$a(1-P)P - (2+P)P(b+c) + (2+P)(1-P)d = 0$$
$$a(P - P^2) - (2P + P^2)(b+c) + (2 - P - P^2)d = 0$$
$$2d + (a - 2b - 2c - d)P - (a+b+c+d)P^2 = 0$$

Now, we fill in the numbers for a, b, c, and d from Hedges et al. (1990):

$$46 + P(-45) + P^2(-250) = 0$$

This has the form of a quadratic equation, ax² + bx + c = 0, which has roots:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

Using the quadratic formula, we obtain P = -0.53 or P = 0.35. Because P = (1 - r)², (1 - r) is the square root of P, which means that only the positive root is an admissible solution for P. Then (1 - r) = 0.59 and r = 0.41.
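The closed-form solution can be reproduced numerically. This sketch builds the quadratic in P from the four class counts and keeps the root that lies between 0 and 1; the variable names `qa`, `qb`, `qc` are illustrative, chosen to avoid clashing with the class counts a, b, c, d.

```python
import math

a, b, c, d = 144, 44, 39, 23        # class counts from Table 1
n = a + b + c + d

# Coefficients of  qa*P^2 + qb*P + qc = 0, i.e. -n*P^2 + (a - 2b - 2c - d)*P + 2d = 0
qa = -n
qb = a - 2 * b - 2 * c - d
qc = 2 * d

discriminant = qb ** 2 - 4 * qa * qc
roots = [(-qb + sign * math.sqrt(discriminant)) / (2 * qa) for sign in (1, -1)]

# P = (1 - r)^2 must lie in [0, 1], so the negative root is discarded.
p_hat = next(p for p in roots if 0 <= p <= 1)
r_hat = 1 - math.sqrt(p_hat)
print(f"roots: {roots}, P = {p_hat:.3f}, r = {r_hat:.3f}")
```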

In some cases, one does not end up with a derivative of the log of the likelihood function that can be solved analytically, as was possible in this case. In such a case, one must use numerical methods to obtain the solution. Nevertheless, it is still useful to get the derivative of the likelihood function because that leads to an estimate of the variance of the linkage estimate. The variance of the estimate requires taking the second derivative of the function, which we won't go into here (see Mather (1951) for details). Like any estimate, the recombination frequency estimate in this example has a standard error. The important message is that these recombination frequency estimates are just that, estimates, and they are always measured with some error.

Standard Errors of Recombination Frequencies and Information

Factors that influence the precision of recombination frequency estimates include the sample size (bigger is always better), the gene action of the marker genes (more information is obtained with codominant marker genes than with dominant genes), and the type of population used (F2 populations with codominance are better than backcrosses, but backcrosses are better than F2 populations with complete dominance). A way to compare the informativeness of different types of populations is to compare their information value per data point (one individual). This discussion follows Mather (1951). The total amount of information in a set of data is the inverse of the variance of the recombination frequency estimate from those data: I(r) = 1/V(r). The variance of the estimate is the variance of a single data point divided by the number of data points. Turning this around, the variance of a single data point is equal to the variance of the estimate multiplied by the number of data points.
So, we can obtain the information per data point as a function of the variance of the estimator: I(r) = 1/V(r) = n i(r), where n = number of individuals sampled and i(r) = the information per individual:

$$i(r) = \sum_{i=1}^{c} \frac{1}{m_i}\left(\frac{\partial m_i}{\partial r}\right)^2$$

where c = total number of classes of progeny in the population. The information for DH populations was derived as follows:

          Coupling                              Repulsion
Class     m_i         dm_i/dr    i              m_i         dm_i/dr    i
AABB      (½)(1-r)    -1/2       1/[2(1-r)]     (½)r        1/2        1/(2r)
AAbb      (½)r        1/2        1/(2r)         (½)(1-r)    -1/2       1/[2(1-r)]
aaBB      (½)r        1/2        1/(2r)         (½)(1-r)    -1/2       1/[2(1-r)]
aabb      (½)(1-r)    -1/2       1/[2(1-r)]     (½)r        1/2        1/(2r)
Total     1           0          1/[r(1-r)]     1           0          1/[r(1-r)]

This is the same as for backcross populations (Mather 1951). (But note that in BC populations with dominant markers, only those loci for which the recurrent parent is recessive will be informative.) The information values can be computed similarly for other types of populations, permitting a comparison of the relative informativeness (relative efficiency) of different population types per individual sampled (Figure 1 from Mather, 1951). I added relative efficiencies to this figure for recombinant inbred line (RIL) populations (equation from Liu et al. (1998)) and for doubled haploid (DH) populations.
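The information formula can be applied to any set of class proportions m_i(r). The sketch below uses a central-difference derivative instead of the analytic derivatives tabulated in the text, which is a convenience assumed here, and the function names are hypothetical. It is shown for DH (coupling) proportions and for the F2 two-dominant-gene coupling proportions used earlier.

```python
import math


def information_per_individual(class_probs, r, h=1e-6):
    """i(r) = sum_i (dm_i/dr)^2 / m_i, with a central-difference derivative."""
    m = class_probs(r)
    upper, lower = class_probs(r + h), class_probs(r - h)
    return sum(((u - l) / (2 * h)) ** 2 / mi for mi, u, l in zip(m, upper, lower))


def dh_coupling(r):
    """Expected proportions of the four DH classes, coupling phase."""
    return [0.5 * (1 - r), 0.5 * r, 0.5 * r, 0.5 * (1 - r)]


def f2_dominant_coupling(r):
    """Expected proportions of classes A-D for two dominant genes in an F2."""
    return [0.75 - 0.5 * r + 0.25 * r ** 2,
            0.5 * r - 0.25 * r ** 2,
            0.5 * r - 0.25 * r ** 2,
            0.25 * (1 - r) ** 2]


def standard_error(class_probs, r, n):
    """SE of the estimate of r from n individuals: sqrt(1 / (n * i(r)))."""
    return 1 / math.sqrt(n * information_per_individual(class_probs, r))


print(information_per_individual(dh_coupling, 0.2))    # compare with 1/[r(1-r)]
print(standard_error(dh_coupling, 0.2, 100))
print(standard_error(f2_dominant_coupling, 0.41, 250))
```

Dividing the information per individual of one population type by that of another gives the kind of relative efficiency comparison described above.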

Thus, F2 populations with codominant markers are overall the best choice among these population types for mapping. However, if one only has dominant markers, F2 populations have VERY poor precision for repulsion-phase linked markers. With dominant markers only, RILs give the best precision for loci linked at r < 0.15, and DH and BC populations give better precision at larger recombination frequencies. If you can develop them, DH populations are superior to BC populations in the sense that you can map all polymorphic markers in DH populations, but you can map only those polymorphic markers at which the recurrent parent is homozygous recessive in BC populations. Table 6 provides an example of how population type and population size influence the precision of recombination frequency estimates (Allard 1956).

Table 6. Standard errors of recombination frequency estimates for different linkage intensities, population types, and population sizes (n).

                        n = 100                             n = 200
Population type         r = 0.05    r = 0.10    r = 0.20    r = 0.05    r = 0.10    r = 0.20
F2 dom. (coupling)
F2 dom. (repulsion)
F2 codominant
Backcross = DH
RIL

LOD Scores

An alternative statistic for linkage analysis that is used in combination with maximum likelihood methods is the LOD score. The LOD score is the base-10 logarithm of the odds: the likelihood of a particular linkage arrangement and recombination frequency divided by the likelihood of the null hypothesis of no linkage (Lander and Botstein 1986):

$$\mathrm{LOD} = \log_{10}\frac{L(\hat{r}_{MLE})}{L(r = 0.5)}$$

Using the data from the example of the F and Rj1 genes, the likelihood of the maximum likelihood estimate model, with a recombination frequency of 0.41, is obtained by simply plugging r = 0.41 into the likelihood equation for the data set:

$$L(\hat{r} = 0.41) = \frac{250!}{144!\,44!\,39!\,23!}\,(0.587)^{144}(0.163)^{44}(0.163)^{39}(0.087)^{23}$$

We won't bother to work this out because the first part with the factorials will cancel out in the LOD score equation. The likelihood of the model with no linkage is computed in the same way, but substituting 0.5 for r in the same equation.
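As a numerical sketch of this comparison, the code below evaluates the two likelihoods on the log10 scale, leaving out the factorial term since it is identical in both and cancels in the ratio. The helper names are illustrative, and r = 0.41 is the rounded estimate used in the text.

```python
import math

counts = (144, 44, 39, 23)  # classes A, B, C, D from Table 1


def class_probs(r):
    """Expected proportions of classes A-D for two dominant genes in coupling."""
    return (0.75 - 0.5 * r + 0.25 * r ** 2,
            0.5 * r - 0.25 * r ** 2,
            0.5 * r - 0.25 * r ** 2,
            0.25 * (1 - r) ** 2)


def log10_likelihood_kernel(r, counts):
    """log10 of the likelihood without the multinomial coefficient."""
    return sum(k * math.log10(p) for k, p in zip(counts, class_probs(r)))


lod = log10_likelihood_kernel(0.41, counts) - log10_likelihood_kernel(0.5, counts)
odds = 10 ** lod
print(f"LOD = {lod:.2f}, odds ratio = {odds:.1f}")
```

At r = 0.5 the class proportions reduce to 9/16, 3/16, 3/16 and 1/16, so the second term is the likelihood of the no-linkage model.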
The LOD score is:

$$\mathrm{LOD} = \log_{10}\frac{L(\hat{r} = 0.41)}{L(r = 0.5)} = \log_{10}\frac{(0.587)^{144}(0.163)^{44}(0.163)^{39}(0.087)^{23}}{(9/16)^{144}(3/16)^{44}(3/16)^{39}(1/16)^{23}} \approx \log_{10}(8.4) \approx 0.92$$

The odds ratio shows that the model of linkage with r = 0.41 is more than eight times more likely than the model of no linkage. In most applications in which many genes are mapped simultaneously, these two loci would probably be considered unlinked, because very stringent thresholds are normally used (e.g., linkage is often declared only when genes are linked with a LOD of 3.0 or greater, meaning linkage is 1,000 times more likely than non-linkage). This is because when many genes are being mapped, the chance of declaring linkages falsely at least once in the entire analysis becomes quite large, so a high LOD threshold is used. At the same time, when many genes are being tested for linkage, it is likely that a gene between F and Rj1 would be mapped, and being closer to both of them than they are to each other, it would probably exhibit higher LOD scores with both F and Rj1, and the F and Rj1 genes would end up in the same linkage group anyway.

The real reason the LOD score was developed is that human geneticists work with numerous pedigrees, each of which has only a few individuals. From any single family, there is not enough information to determine linkage, but by combining results across pedigrees, it is possible to detect linkage. A LOD score can be computed for each pedigree separately; the combined LOD score for the whole data set is then obtained by adding together the individual LOD scores (equivalent to multiplying the odds ratios). Plant breeders still need to understand LOD scores because the most commonly used software for linkage mapping is MAPMAKER, which was developed by human geneticists and uses LOD scores. LOD scores also provide a convenient way to compare how much more likely a model is than different alternative models.

Ordering Loci

Using the methods described above, Hedges et al. (1990) estimated recombination frequency between the three pairs of genes in soybean as follows:

Gene pair       Recombination frequency
F - Rj1         0.41 ± 0.05
F - Idh1             ± 0.03
Rj1 - Idh1           ± 0.03

What is the order of the three loci? The most common-sense ordering is F - Idh1 - Rj1. The linkage map for these three genes is drawn to summarize both the order and recombination distance information:

[Linkage map: F - Idh1 - Rj1]

Notice that there is something weird here: if this were really like a road map, the distance from F to Rj1 should be the sum of the distances from F to Idh1 and Idh1 to Rj1. But the recombination frequency between F and Rj1 is 0.41, which is less than the sum of the recombination frequencies for the two intervals between them (one of which is 0.22). Consider the effect of a recombination between F and Idh1 and a recombination between Idh1 and Rj1 in the same meiosis. The alleles at F and Idh1 will be recombined and the alleles at Idh1 and Rj1 will be recombined, but in those gametes, the parental combinations at the F and Rj1 genes will exist. So, when genes get very far apart, double recombinations are more likely to occur between them. If the recombination frequency between F and Idh1 were 30% and that between Idh1 and Rj1 were 30%, the recombination frequency between F and Rj1 would still be at most 50%. For this reason, mapping functions were developed to linearize maps, meaning that they make the distance between any two loci equal to the sum of the distances of the intervals between them.

Mapping Functions

Mapping functions are mathematical tools that are used to linearize genetic maps and to put genetic distances on a basis that relates to the number of crossovers that occur between genes, rather than on the basis of recombination. In order to develop a mapping function, one has to make assumptions about whether crossovers occur independently along a chromosome, or whether one crossover in a region tends to prevent a second crossover from occurring near it. This latter possibility is called chromosome interference, as we discussed in the previous lecture. We now return to the equation relating recombination frequency between loci A and C to the recombination frequencies in the intervals A - B and B - C between these loci:

$$r_{AC} = r_{AB} + r_{BC} - 2(1 - i)\,r_{AB}\,r_{BC}$$

where i = interference, so that (1 - i) is the coefficient of coincidence.
As we discussed in the previous lecture, if one assumes no interference, then i = 0 and we have:

$$r_{AC} = r_{AB} + r_{BC} - 2\,r_{AB}\,r_{BC}$$

This is the basis for Haldane's mapping function, which we will derive. First, we define the average number of crossovers in an interval to be λ. We next define a map unit to be m = (1/2)λ. Map units are reported in Morgans, or more typically, centimorgans (cM). The idea is that an interval in which an average of one crossover occurs every meiosis is defined to have a genetic map distance of 50 cM (= 0.5 M, thus m = 0.5λ).
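The three-point relation above is easy to explore numerically. In this sketch the function name `recomb_outer` and the keyword `interference` are illustrative; the 30% interval values are the hypothetical ones used in the text, not estimates from the soybean data.

```python
def recomb_outer(r_ab, r_bc, interference=0.0):
    """r_AC = r_AB + r_BC - 2 (1 - i) r_AB r_BC for adjacent intervals A-B and B-C."""
    coincidence = 1.0 - interference
    return r_ab + r_bc - 2.0 * coincidence * r_ab * r_bc


# Two 30% intervals with no interference: double recombinants restore the
# parental combination at the outer loci, so r_AC is less than 0.60.
print(recomb_outer(0.30, 0.30))

# With complete interference in short intervals there are no double
# crossovers, and recombination frequencies simply add.
print(recomb_outer(0.05, 0.05, interference=1.0))
```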

Haldane derived his mapping function by assuming that crossovers occur independently across the chromosome, so that the number of crossovers occurring between any two points on a chromosome follows a Poisson distribution. Based on the Poisson distribution, the probability of no crossover between the loci defining the interval is:

$$P[\text{no crossover}] = e^{-\lambda} = e^{-2m}$$

The probability of a crossover is:

$$P[\text{one or more crossovers}] = 1 - e^{-2m}$$

Each time one or more crossovers occurs in the interval, 50% recombinant gametes are produced (see last lecture), so the probability of a recombination in the interval is:

$$r = \frac{1 - e^{-2m}}{2}$$

The inverse of this allows one to convert recombination frequencies into map distances:

$$m = -\frac{\ln(1 - 2r)}{2}$$

Another commonly used mapping function is Kosambi's, which assumes that a moderate amount of interference occurs. Specifically, the Kosambi mapping function is derived under the assumption that the coefficient of coincidence is related to the size of the interval being mapped, C = 1 - i = 2r. This assumption implies that if the interval is large and r is near 0.5, then C approaches 1, i approaches 0, and interference is absent. But if the interval is small and r is near 0.0, then C approaches 0, i approaches 1, and interference is nearly complete. With this assumption, the Kosambi mapping function is (see Liu, 1998 for derivation):

$$m = \frac{1}{4}\ln\frac{1 + 2r}{1 - 2r}$$

The inverse of the Kosambi function is:

$$r = \frac{1}{2}\,\frac{e^{4m} - 1}{e^{4m} + 1}$$

The relationship between these two mapping functions and recombination frequency is shown in the figure below, from Ott (1999):

[Figure: Haldane and Kosambi map distance as a function of recombination frequency, from Ott (1999)]
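The two mapping functions and their inverses can be tabulated directly. This sketch works in Morgans (multiply by 100 for cM); the function names are illustrative, and the interval values 0.2 and 0.3 in the additivity check are hypothetical.

```python
import math


def haldane_m(r):
    """Map distance (Morgans) from recombination frequency, no interference."""
    return -0.5 * math.log(1 - 2 * r)


def haldane_r(m):
    """Recombination frequency from map distance (Morgans), no interference."""
    return 0.5 * (1 - math.exp(-2 * m))


def kosambi_m(r):
    """Map distance (Morgans) under Kosambi's interference assumption C = 2r."""
    return 0.25 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(m):
    """Recombination frequency from Kosambi map distance (Morgans)."""
    e4m = math.exp(4 * m)
    return 0.5 * (e4m - 1) / (e4m + 1)


for r in (0.05, 0.10, 0.20, 0.30, 0.40):
    print(f"r = {r:.2f}  Haldane = {100 * haldane_m(r):5.1f} cM  "
          f"Kosambi = {100 * kosambi_m(r):5.1f} cM")

# Haldane distances are additive when there is no interference.
r_ab, r_bc = 0.2, 0.3
r_ac = r_ab + r_bc - 2 * r_ab * r_bc
print(haldane_m(r_ac), haldane_m(r_ab) + haldane_m(r_bc))
```

The table printed by the loop shows the pattern discussed in the text: the functions stay close to r itself for small r and diverge as r approaches 0.5.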

Notice that at small recombination frequencies (r < 0.10), there is little difference between recombination frequency itself and any of the mapping functions. This is because if recombination occurs at low frequency in an interval, then multiple crossovers must occur at very low frequency in such intervals. If the only reasonably likely events are zero or one crossover at any one meiosis, then the relationship between crossing over and recombination is linear! Thus, if one has a dense genetic map (with markers spaced at intervals no greater than, say, 10% recombination), there will be very little difference among maps created with different mapping functions. In reality, nobody knows what the precise relationship between recombination frequency and crossing over is, and it probably varies among species, populations, and chromosomal regions, as we saw in the previous lecture (Sherman and Stack 1995). Furthermore, Sherman and Stack's (1995) results suggest that interference is not linearly related to interval length, so neither Kosambi's nor Haldane's function can be correct in all situations. Mapping functions are just a way to simplify the presentation of a linkage map.

Multi-point ordering and mapping

How does one order large numbers of loci in a common linkage group? As the number of loci increases, the number of possible orders increases factorially (n loci have n!/2 distinct orders). Worse, the two-point recombination frequency estimates cannot be used as the sole guide to locus ordering, because two-point estimates can lead to conflicting locus orders, particularly for sets of genes that are closely linked. Remember that recombination frequencies are estimated with error!

Multi-point maximum likelihood linkage mapping is most often used to order large numbers of loci and develop complete linkage maps. This method requires calculating the likelihood of different multiple-locus linkage orders and distances, using the same method described above, but augmented by extending the multinomial probability function to include all possible multi-locus genotypes. Clearly, the computations quickly become unwieldy. Furthermore, if one has to attempt different locus orders and compute the maximum likelihood estimates of recombination frequencies and their likelihoods for many possible orders, the approach is intractable without computer help. The advantage of multi-point linkage mapping is that it uses the available data most efficiently, which is particularly important when data are missing for some genes in some individuals. Mapmaker/EXP is the most commonly used linkage mapping program because it can perform multi-point maximum likelihood linkage mapping conveniently. The details of multi-point linkage mapping can be found in Lander and Green (1987) and in the Mapmaker manual, and will be discussed only briefly here. The basic steps involved in making a linkage map are:

1. Assigning loci to linkage groups. The first step is to test each pair of loci for linkage, and if there is evidence for linkage (usually a fairly stringent statistical threshold is used), then the pair of loci is considered to be on a common linkage group. Establishment of linkage groups simplifies the following steps because different locus orders and map distances can be estimated and compared for one linkage group at a time, reducing the number of loci involved in the calculations for any one linkage group.

2. Choosing the order of loci. Various methods have been developed to choose the most likely order of loci in a linkage group (Liu 1998). Mapmaker/EXP chooses orders (when there are many loci) by first selecting a subset of five loci that are well spaced along the linkage group, and then by adding additional loci one at a time to this initial order.

3. Each time a locus is added to the group, it is placed in each possible interval, the likelihood for its position within each interval is computed, and the most likely position is selected. Once the locus is tried in an interval, the locus order is taken to be known and the multipoint likelihood is computed for some starting values of recombination frequencies. An iterative procedure is used to find the most likely recombination frequencies given the chosen order, and this yields the likelihood of the maximum likelihood estimate of all of the recombination frequencies on the linkage group simultaneously. Actually computing the likelihood for a given locus order and set of recombination frequencies is quite complicated; see Liu (1998) and Lander and Green (1987) for details.

4. The best likelihoods of each order of markers are then compared and the most likely order is chosen.

5. Another locus is added to the linkage group and the program returns to step 3.

As an example, if ten loci (A through J) are assigned to a linkage group, the program starts by choosing an informative set of five loci and puts them in their most likely order:

A B C D E

Next, locus F is placed in each possible interval on the map:

F A B C D E
A F B C D E
A B F C D E
A B C F D E
A B C D F E
A B C D E F

For each possibility, the maximum likelihood positions of the loci are computed, given the locus order. Thus, for order FABCDE, we estimate the maximum likelihood values for recombination frequencies r_FA, r_AB, r_BC, r_CD, and r_DE. The likelihood of the maximum likelihood estimates of this multipoint map is remembered. Then, for order AFBCDE, we estimate the maximum likelihood values for recombination frequencies r_AF, r_FB, r_BC, r_CD, and r_DE. The likelihood of the maximum likelihood estimates of this multipoint map is remembered, and so on for each order. Finally, the likelihoods of the best maps for each of the six locus orders are compared and the most likely one is chosen.

Let's say that order ABCFDE is chosen as most likely. The program then adds the next locus (say, G) to the map, and the procedure repeats, this time comparing likelihoods of maps for orders GABCFDE, AGBCFDE, ABGCFDE, etc. If the best order differs from the next-best order by a LOD score of 2 or less, the locus is usually assigned to the most likely interval without specifying its position (to minimize the chance of establishing an incorrect order during this process). This happens regularly with very tightly linked markers.
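The insertion procedure in this example can be sketched in a few lines under strong simplifying assumptions. The sketch uses a backcross with complete, fully informative genotypes and no crossover interference, in which case the multipoint likelihood for a given order factors over adjacent intervals and each recombination frequency is estimated directly as the proportion of recombinants, so no iteration is needed. Mapmaker handles F2 populations and missing data with an iterative procedure instead. The data, function names, and the four-locus setup are hypothetical.

```python
import math


def interval_loglik(x, y):
    """Maximized log10 likelihood for one interval in a backcross."""
    n = len(x)
    rec = sum(a != b for a, b in zip(x, y))  # recombinant individuals
    r = min(rec / n, 0.5)                     # ML estimate, capped at 0.5
    ll = 0.0
    if rec > 0:
        ll += rec * math.log10(r)
    if rec < n:
        ll += (n - rec) * math.log10(1.0 - r)
    return ll


def order_loglik(order, geno):
    """Sum the interval terms over adjacent loci in the given order."""
    return sum(interval_loglik(geno[a], geno[b])
               for a, b in zip(order, order[1:]))


def place_locus(order, new_locus, geno, lod_threshold=2.0):
    """Try the new locus in every interval and compare the resulting orders."""
    candidates = []
    for i in range(len(order) + 1):
        trial = order[:i] + [new_locus] + order[i:]
        candidates.append((order_loglik(trial, geno), trial))
    candidates.sort(key=lambda c: c[0], reverse=True)
    support = candidates[0][0] - candidates[1][0]  # LOD, best vs next best
    return candidates[0][1], support, support > lod_threshold


# Hypothetical backcross of 40 individuals; true order is A B F C, with
# four recombinants in each of the intervals A-B, B-F, and F-C.
n = 40
geno = {
    "A": [0] * n,
    "B": [1 if i < 4 else 0 for i in range(n)],
    "F": [1 if i < 8 else 0 for i in range(n)],
    "C": [1 if i < 12 else 0 for i in range(n)],
}
best_order, support, unique = place_locus(["A", "B", "C"], "F", geno)
print(best_order, round(support, 2), unique)
```

The third return value mirrors the LOD 2 rule described above: when it is false, the placement would be treated as ambiguous. The payoff of sequential insertion is the reduced search. Adding loci F through J one at a time to a five-locus framework means evaluating 6 + 7 + 8 + 9 + 10 = 40 orders, whereas ten loci have 10!/2 = 1,814,400 distinct orders in total.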
The critical point is that when one attempts to map numerous loci simultaneously, it is simply not possible to compute likelihoods for all possible locus orders, or even a fraction of them. Users must therefore begin by quickly eliminating the least likely orders. This is accomplished by establishing linkage groups and by having the program choose the initial set of five informative loci with which to start building the map of a linkage group. It is quite possible that the same person starting with the same data set in Mapmaker will end up with at least slightly different linkage maps. It is best to re-analyze the data set several times and choose the most likely order that occurs most frequently. Furthermore, if the genetic data were collected in a different sample of the same population, the linkage map developed would probably differ, because of the sampling variance inherent in the recombination frequency estimation procedure. In any case, it is possible to assign loci to linkage groups and, using some cytogenetic tools we will discuss in the next lecture, to assign the loci to actual chromosomes.

References

Allard RW (1956) Formulas and tables to facilitate the calculation of recombination values in heredity. Hilgardia 24:

Hedges BR, Sellner JM, Devine TE, Palmer RG (1990) Assigning isocitrate dehydrogenase to linkage group 11 in soybean. Crop Sci 30:

Lander ES, Botstein D (1986) Mapping complex genetic traits in humans: New methods using a complete RFLP linkage map. Cold Spring Harbor Symposia on Quantitative Biology. Vol. LI. Cold Spring Harbor Laboratory, pp

Lander ES, Green P (1987) Construction of multilocus genetic linkage maps in humans. Proceedings of the National Academy of Sciences USA 84:

Liu BH (1998) Statistical genomics: Linkage, mapping, and QTL analysis. CRC Press, Boca Raton

Mather K (1951) The measurement of linkage in heredity. 2nd Ed. John Wiley and Sons, New York

Ott J (1999) Analysis of human genetic linkage. 3rd Ed. Johns Hopkins University Press, Baltimore

Sherman JD, Stack SM (1995) Two-dimensional spreads of synaptonemal complexes from solanaceous plants. VI. High-resolution recombination nodule map for tomato (Lycopersicon esculentum). Genetics 141:


More information

UNIT 3: GENETICS 1. Inheritance and Reproduction Genetics inheritance Heredity parent to offspring chemical code genes specific order traits allele

UNIT 3: GENETICS 1. Inheritance and Reproduction Genetics inheritance Heredity parent to offspring chemical code genes specific order traits allele UNIT 3: GENETICS 1. Inheritance and Reproduction Genetics the study of the inheritance of biological traits Heredity- the passing of traits from parent to offspring = Inheritance - heredity is controlled

More information

BIO 682 Nonparametric Statistics Spring 2010

BIO 682 Nonparametric Statistics Spring 2010 BIO 682 Nonparametric Statistics Spring 2010 Steve Shuster http://www4.nau.edu/shustercourses/bio682/index.htm Lecture 5 Williams Correction a. Divide G value by q (see S&R p. 699) q = 1 + (a 2-1)/6nv

More information

-Genetics- Guided Notes

-Genetics- Guided Notes -Genetics- Guided Notes Chromosome Number The Chromosomal Theory of Inheritance genes are located in specific on chromosomes. Homologous Chromosomes chromosomes come in, one from the male parent and one

More information

Biol. 303 EXAM I 9/22/08 Name

Biol. 303 EXAM I 9/22/08 Name Biol. 303 EXAM I 9/22/08 Name -------------------------------------------------------------------------------------------------------------- This exam consists of 40 multiple choice questions worth 2.5

More information

CALCULATING LINKAGE INTENSITIES FROM Fa DATA* Received April 10, 1933

CALCULATING LINKAGE INTENSITIES FROM Fa DATA* Received April 10, 1933 CALCULATING LINKAGE INTENSITIES FROM Fa DATA* F. R. IMMER' Ofice of Sugar Plant Investigations, Bureau of Plant Industry, U. S. De#artment of Agriculture, University Farm, St. Paul, Minnesota Received

More information

Class Copy! Return to teacher at the end of class! Mendel's Genetics

Class Copy! Return to teacher at the end of class! Mendel's Genetics Class Copy! Return to teacher at the end of class! Mendel's Genetics For thousands of years farmers and herders have been selectively breeding their plants and animals to produce more useful hybrids. It

More information

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin CHAPTER 1 1.2 The expected homozygosity, given allele

More information

BIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS

BIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS 016064 BIOLOGY LTF DIAGNOSTIC TEST MEIOSIS & MENDELIAN GENETICS TEST CODE: 016064 Directions: Each of the questions or incomplete statements below is followed by five suggested answers or completions.

More information

Genetics Review Sheet Learning Target 11: Explain where and how an organism inherits its genetic information and this influences their

Genetics Review Sheet Learning Target 11: Explain where and how an organism inherits its genetic information and this influences their Genetics Review Sheet Learning Target 11: Explain where and how an organism inherits its genetic information and this influences their characteristics. 1. Define the following terms: Name Block a. Heredity

More information

Mendelian Genetics. Introduction to the principles of Mendelian Genetics

Mendelian Genetics. Introduction to the principles of Mendelian Genetics + Mendelian Genetics Introduction to the principles of Mendelian Genetics + What is Genetics? n It is the study of patterns of inheritance and variations in organisms. n Genes control each trait of a living

More information

Chapter 2: Extensions to Mendel: Complexities in Relating Genotype to Phenotype.

Chapter 2: Extensions to Mendel: Complexities in Relating Genotype to Phenotype. Chapter 2: Extensions to Mendel: Complexities in Relating Genotype to Phenotype. please read pages 38-47; 49-55;57-63. Slide 1 of Chapter 2 1 Extension sot Mendelian Behavior of Genes Single gene inheritance

More information