Estimating viability

Similar documents
The Genetics of Natural Selection

Analyzing the genetic structure of populations: a Bayesian approach

Evolution of quantitative traits

The neutral theory of molecular evolution

Selection on multiple characters

Resemblance among relatives

Population genetics snippets for genepop

Selection Page 1 sur 11. Atlas of Genetics and Cytogenetics in Oncology and Haematology SELECTION

The Hardy-Weinberg Principle and estimating allele frequencies

Solving with Absolute Value

The Derivative of a Function

Introduction to quantitative genetics

A Primer on Statistical Inference using Maximum Likelihood

Genetics and Natural Selection

(Write your name on every page. One point will be deducted for every page without your name!)

N H 2 2 NH 3 and 2 NH 3 N H 2

BTRY 4830/6830: Quantitative Genomics and Genetics Fall 2014

Introduction to population genetics & evolution

Problems for 3505 (2011)

Lesson 6-1: Relations and Functions

Introduction to Survey Analysis!

URN MODELS: the Ewens Sampling Lemma

Lecture Notes in Population Genetics

Outline of lectures 3-6

Chapter 10. Prof. Tesler. Math 186 Winter χ 2 tests for goodness of fit and independence

Voltage, Current, and Resistance

Outline of lectures 3-6

Evolutionary Genetics Midterm 2008

One-sample categorical data: approximate inference

The Wright-Fisher Model and Genetic Drift

Intermediate Algebra. Gregg Waterman Oregon Institute of Technology

Quadratic Equations Part I

Natural Selection results in increase in one (or more) genotypes relative to other genotypes.

Question: If mating occurs at random in the population, what will the frequencies of A 1 and A 2 be in the next generation?

Chapter 5 Simplifying Formulas and Solving Equations

MODULE NO.22: Probability

Choosing among models

Getting Started with Communications Engineering

Linear Regression (1/1/17)

Reading Selection: How do species change over time?

Active loads in amplifier circuits

Algebra. Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Simultaneous equations for circuit analysis

Knots, Coloring and Applications

Lecture Notes in Population Genetics

Chapter 1 Review of Equations and Inequalities

Fitting a Straight Line to Data

STA 431s17 Assignment Eight 1

Stat 101: Lecture 12. Summer 2006

Atomic structure. Resources and methods for learning about these subjects (list a few here, in preparation for your research):

Dropping Your Genes. A Simulation of Meiosis and Fertilization and An Introduction to Probability

Direct Proofs. the product of two consecutive integers plus the larger of the two integers

Chapter 18. Sampling Distribution Models /51

Evolution of Populations. Chapter 17

Classification and Regression Trees

Outline of lectures 3-6

Basic algebra and graphing for electric circuits

Genetical theory of natural selection

These are my slides and notes introducing the Red Queen Game to the National Association of Biology Teachers meeting in Denver in 2016.

Math101, Sections 2 and 3, Spring 2008 Review Sheet for Exam #2:

Solutions to Even-Numbered Exercises to accompany An Introduction to Population Genetics: Theory and Applications Rasmus Nielsen Montgomery Slatkin

NP-Completeness I. Lecture Overview Introduction: Reduction and Expressiveness

Darwinian Selection. Chapter 7 Selection I 12/5/14. v evolution vs. natural selection? v evolution. v natural selection

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation

Lecture 11: Extrema. Nathan Pflueger. 2 October 2013

Pair Hidden Markov Models

Specific resistance of conductors

Functional divergence 1: FFTNS and Shifting balance theory

MIT BLOSSOMS INITIATIVE

Family Trees for all grades. Learning Objectives. Materials, Resources, and Preparation

AEC 550 Conservation Genetics Lecture #2 Probability, Random mating, HW Expectations, & Genetic Diversity,

Evolution Module. 6.2 Selection (Revised) Bob Gardner and Lev Yampolski

Chapter 5 Simplifying Formulas and Solving Equations

MA103 STATEMENTS, PROOF, LOGIC

MA 1125 Lecture 15 - The Standard Normal Distribution. Friday, October 6, Objectives: Introduce the standard normal distribution and table.

Two-sample inference: Continuous Data

Foundations of (Theoretical) Computer Science

( )( b + c) = ab + ac, but it can also be ( )( a) = ba + ca. Let s use the distributive property on a couple of

Choosing priors Class 15, Jeremy Orloff and Jonathan Bloom

Population Genetics & Evolution

DIFFERENTIAL EQUATIONS

Algebraic substitution for electric circuits

Algebraic substitution for electric circuits

These next few slides correspond with 23.4 in your book. Specifically follow along on page Use your book and it will help you!

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q)

Algebraic equation manipulation for electric circuits

Neutral Theory of Molecular Evolution

Section 4.6 Negative Exponents

2 Analogies between addition and multiplication

Probability and the Second Law of Thermodynamics

Chapter 13 Meiosis and Sexual Reproduction

Evolution. Part 1: Historical Perspective on the Theory of Natural Selection

Selection on Correlated Characters (notes only)

ML Testing (Likelihood Ratio Testing) for non-gaussian models

STAT 201 Chapter 5. Probability

One-Way Tables and Goodness of Fit

Math Fundamentals for Statistics I (Math 52) Unit 7: Connections (Graphs, Equations and Inequalities)

Park School Mathematics Curriculum Book 9, Lesson 2: Introduction to Logarithms

Iterated Functions. Tom Davis November 5, 2009

Quantitative Trait Variation

Transcription:

Estimating viability Introduction Being able to make predictions with known (or estimated) viabilities, doesn t do us a heck of a lot of good unless we can figure out what those viabilities are. Fortunately, figuring them out isn t too hard. 1 If we know the number of individuals of each genotype before selection, it s really easy as a matter of fact. 2 Consider that our data looks like this: Number in zygotes n (z) 11 n (z) n (z) Viability w w Number in adults n (a) 11 n (z) 11 n (a) w n (z) n (a) w n (z) In other words, estimating the absolute viability simply consists of estimating the probability that an individuals of each genotype that survive from zygote to adult. The maximumlikelihood estimate is, of course, just what you would probably guess: w ij n(a) ij n (z) ij, Since w ij is a probability and the outcome is binary (survive or die), you should be able to guess what kind of likelihood relates the observed data to the unseen parameter, namely, a binomial likelihood. In JAGS notation: 3 n.11.adult ~ dbin(w.11, n.11.zygote) n..adult ~ dbin(w., n..zygote) n..adult ~ dbin(w., n..zygote) 1 I almost said that it was easy, but that would be going a bit too far. 2 And in the very next sentence I contradicted the last footnote. But it really is easy to estimate viabilities if we can genotype individuals before and after selection. 3 You knew you were going to see this again, didn t you? c 2001-2019 Kent E. Holsinger

Estimating relative viability To estimate absolute viabilities, we have to be able to identify genotypes non-destructively, because we have to know what their genotype was both before the selection event and after the selection event.that s fine if we happen to be dealing with an experimental situation where we can do controlled crosses to establish known genotypes or if we happen to be studying an organism and a trait where we can identify the genotype from the phenotype of a zygote (or at least a very young individual) and from surviving adults. 4 What do we do when we can t follow the survival of individuals with known genotype? Give up? 5 Remember that to make inferences about how selection will act, we only need to know relative viabilities, not absolute viabilities. 6 We still need to know something about the genotypic composition of the population before selection, but it turns out that if we re only interested in relative viabilities, we don t need to follow individuals. All we need to be able to do is to score genotypes and estimate genotype frequencies before and after selection. Our data looks like this: We also know that Frequency in zygotes x (z) 11 x (z) x (z) Frequency in adults 11 11 x (z) 11 / w w x (z) / w w x (z) / w. Suppose we now divide all three equations by the middle one: 11 / x (z) 11 /w x (z) 1 1 / w x (z) /w x (z), 4 How many organisms and traits can you think of that satisfy this criterion? Any? There is one other possibility: If we can identify an individual s genotype after it s dead and if we can construct a random sample that includes both living and dead individuals and if we the probability of including an individual in the sample doesn t depend on whether that individual is dead or alive, then we can sample a population after the selection event and score genotypes both before and after the event from one set of observations. 5 Would I be asking the question if the answer were Yes? 6 At least that s true until we start worrying about how selection and drift interact. 2

or, rearranging a bit w w w x(a) x(z) x (z) x(a) 11 x(z) (1) x (z) 11. (2) This gives us a complete set of relative viabilities. Relative viability w 1 w w If we use the maximum-likelihood estimates for genotype frequencies before and after selection, we obtain maximum likelihood estimates for the relative viabilities. 7 If we use Bayesian methods to estimate genotype frequencies before and after selection (including the uncertainty around those estimates), we can use these formulas to get Bayesian estimates of the relative viabilities (and the uncertainty around them). An example Let s see how this works with some real data from Dobzhansky s work on chromosome inversion polymorphisms in Drosophila pseudoobscura. 8 Genotype ST/ST ST/CH CH/CH Total Number in larvae 41 82 27 150 Number in adults 57 169 29 255 You may be wondering how the sample of adults can be larger than the sample of larvae. That s because to score an individual s inversion type, Dobzhansky had to kill it. The numbers in larvae are based on a sample of the population, and the adults that survived 7 If anyone cares, it s because of the invariance property of maximum-likelihod estimates. If you don t understand what that is, don t worry about it, just trust me. 8 Taken from [1]. 3

were not genotyped as larvae. As a result, all we can do is to estimate the relative viabilities. w w w ( ) ( ) x(a) 11 x(z) 57/255 82/150 0.67 x (z) 169/255 41/150 11 ) x(a) x(z) x (z) ( 29/255 169/255 ) ( 82/150 27/150 0.52. So it looks as if we have balancing selection, i.e., the fitness of the heterozygote exceeds that of either homozygote. We can check to see whether this conclusion is statistically justified by comparing the observed number of individuals in each genotype category in adults with what we d expect if all genotypes were equally likely to survive. Genotype ST/ST ST/CH CH/CH ( ) ( ) ( ) Expected 41 150 255 82 150 255 27 150 255 69.7 139.4 45.9 Observed 57 169 29 χ 2 2 14.82, P < 0.001 So we have strong evidence that genotypes differ in their probability of survival. We can also use our knowledge of how selection works to predict the genotype frequencies at equilibrium: w 1 s 1 w w 1 s 2. So s 1 0.33, s 2 0.48, and the predicted equilibrium frequency of the ST chromosome is s 2 /(s 1 + s 2 ) 0.59. Now all of those estimates are maximum-likelihood estimates. Doing these estimates in a Bayesian context is relatively straightforward and the details will be left as an excerise. 9 In outline we simply 1. Estimate the gentoype frequencies before and after selection as samples from a multinomial. 9 In past years Project #3 has consisted of making Bayesian estimates of viabilities from data like these and predicting the outcome of viability selection. 4

2. Apply the formulas from equations (1) and (2) to calculate relative viabilities and selection coefficients. 3. Determine whether the 95% credible intervals for s 1 or s 2 overlap 0. 10 4. Calculate the equilibrium frequency from s 2 /(s 1 +s 2 ), if s 1 > 0 and s 2 > 0. 11 Otherwise, determine which fixation state will be approached. In the end you then have not only viability estimates and their associated uncertainties, but a prediction about the ultimate composition of the population, associated with an accompanying level of uncertainty. References [1] Th. Dobzhansky. Genetics of natural populations. XIV. A response of certain gene arrangements in the third chromosome of Drosophila pseudoobscura to natural selection. Genetics, 32:142 160, 1947. Creative Commons License These notes are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA. 10 Meaning that we don t have good evidence for selection either for or against the associated homozygotes, relative to the heterozygote. 11 In practice, this gets a little complicated. What typically happens is that in some samples from the posterior the heterozygote is intermediate in fitness, meaning that one of the two homozygotes is unconditionally favored. That makes calculating the posterior distribution for the equilibrium frequency a bit complicated. We ll avoid those complications in this year s projecct. 5