APM 541: Stochastic Modelling in Biology
Probability Notes
Jay Taylor (ASU), Fall 2013

Outline
1. Motivation
2. Probability and Uncertainty
3. Conditional Probability
4. Discrete Distributions
5. Continuous Distributions
6. Multivariate Distributions
7. Limit Theorems for Sums of Random Variables

Motivation

Chance and contingency influence biological processes at all scales:
- quantum biology: stochastic tunneling in photosynthetic pathways
- stochastic molecular dynamics: molecular motors as Brownian ratchets
- noise in biochemical and gene regulatory networks
- developmental biology: stochastic X-inactivation leads to mosaic phenotypes in female mammals (tortoiseshell cats)
- behavioral ecology: optimal foraging in uncertain environments
- population ecology: extinction risk of small populations depends on demographic and environmental stochasticity
- population genetics: allele frequencies fluctuate because of random genetic drift and random mutation
- phylogenetics: random DNA substitutions are used for phylogenetic inference

[Figures: stochastic X-inactivation produces the mottled coat pattern of tortoiseshell cats; mass extinctions have had profound and often unpredictable consequences for life on earth.]

Probability and Uncertainty

Multiple Interpretations of Probability
- Frequentist interpretation (J. Venn): The probability of an event is equal to its limiting frequency in an infinite series of independent, identical trials.
- Propensity interpretation (C. Peirce, K. Popper): The probability of an event is equal to the physical propensity of the event. (Confession: I have no idea what this means.)
- Logical interpretation (J. M. Keynes, R. Cox, E. Jaynes): Probabilities quantify the degree of support that some evidence E gives to a hypothesis H.
- Subjective interpretation (F. Ramsey, J. von Neumann, B. de Finetti): The probability of a proposition is equal to the degree of an individual's belief that the proposition is true.

A Behavioristic Derivation of the Laws of Probability

We consider a two-player game, with players P_1 and P_2, who make wagers on an event with a then-unknown outcome. Here we will assume that the event can result in one of n mutually exclusive and exhaustive outcomes e_1, ..., e_n. The game takes place in three steps.
1. First, player P_1 assigns a unit price Pr(A) to each possible event A ⊆ S = {e_1, ..., e_n}. This is the amount that P_1 is willing to pay for a $1 wager on the event that A happens, as well as the amount that P_1 is willing to accept from P_2 in exchange for such a wager. The player selling such a wager agrees to pay the purchaser $1 if A occurs and $0 if it does not occur.
2. Player P_2 then decides the amount S(A) that they wish to wager on each event A. If S(A) is positive, then P_2 purchases a wager worth $S(A) from P_1 at the price S(A)Pr(A), while if S(A) is negative, then P_2 sells such a wager to P_1 for the same amount.
3. The outcome of the event is then determined and the two players settle their wagers.

Although player P_1 may assign any set of prices to the various possible wagers, many assignments can be considered irrational insofar as they lead to certain loss for P_1. Indeed, a set of prices {Pr(A) : A ⊆ S} and a set of wagers {S(A) : A ⊆ S} is said to be a Dutch book if these guarantee that P_2 will earn a profit from P_1 no matter which outcome occurs. The following are examples of Dutch books.

Case 1: Suppose that Pr(A) < 0 for some event A. In this case, P_2 can agree to purchase a $1 wager from P_1 at a cost of Pr(A), i.e., P_1 will pay P_2 the amount |Pr(A)| to take such a wager. Then P_2 will have earned 1 + |Pr(A)| if the event A occurs and will have earned |Pr(A)| if the event A does not occur. Thus P_2 makes a profit no matter what occurs, so this is a Dutch book.

Case 2: Suppose that Pr(S) < 1. In this case, P_2 can earn a profit of 1 - Pr(S) > 0 by purchasing a $1 wager on S. Since one of the events e_1, ..., e_n is certain to happen, S will occur no matter which outcome occurs, so P_2 is certain to make a profit of 1 - Pr(S) > 0. Similarly, if Pr(S) > 1, then P_2 should sell a $1 stake in S to P_1, and this will guarantee that P_2 earns a profit of Pr(S) - 1 > 0.

Case 3: Now suppose that A and B are disjoint subsets of S, i.e., A ∩ B = ∅, and that Pr(A ∪ B) < Pr(A) + Pr(B). Assume that P_2 purchases from P_1 a $1 wager on the event A ∪ B and sells to P_1 both a $1 wager on A and a $1 wager on B. Then the profits/losses incurred by P_2 are summarized in the following table:

  outcome   profit on A    profit on B    profit on A ∪ B    total
  A ∩ B̄     Pr(A) - 1      Pr(B)          1 - Pr(A ∪ B)      Pr(A) + Pr(B) - Pr(A ∪ B)
  Ā ∩ B     Pr(A)          Pr(B) - 1      1 - Pr(A ∪ B)      Pr(A) + Pr(B) - Pr(A ∪ B)
  Ā ∩ B̄     Pr(A)          Pr(B)          -Pr(A ∪ B)         Pr(A) + Pr(B) - Pr(A ∪ B)

Notice that we do not consider the outcome A ∩ B since we are assuming that A and B represent mutually exclusive events. It follows that P_2 earns a profit no matter which outcome occurs, and so P_2 has made a Dutch book against P_1. Similarly, if Pr(A ∪ B) > Pr(A) + Pr(B), then P_2 can make a Dutch book by selling a $1 wager on A ∪ B and purchasing a $1 wager on each of the events A and B.
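
To see the arithmetic concretely, here is a small check, not from the original notes, that evaluates P_2's total profit under the Case 3 strategy for one set of incoherent prices; the numbers 0.3, 0.4, 0.6 are illustrative assumptions.

```python
def profit(pr_a, pr_b, pr_union, a_occurs, b_occurs):
    """Profit to P2, who sells $1 wagers on A and B and buys a $1 wager on A u B."""
    total = pr_a + pr_b - pr_union   # premiums received minus premium paid
    if a_occurs:
        total -= 1                   # P2 pays out on the A wager it sold
    if b_occurs:
        total -= 1                   # P2 pays out on the B wager it sold
    if a_occurs or b_occurs:
        total += 1                   # P2 collects on the A u B wager it bought
    return total

# Incoherent prices for disjoint A and B: Pr(A u B) < Pr(A) + Pr(B).
for outcome in [(True, False), (False, True), (False, False)]:
    print(outcome, round(profit(0.3, 0.4, 0.6, *outcome), 3))   # always 0.1 > 0
```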

The above examples show that a Dutch book can be made against P_1 whenever the prices assigned by P_1 to events in S satisfy certain inequalities. This motivates the following definition. The set of prices (Pr(A) : A ⊆ S) is said to be coherent if it satisfies the following conditions:
1. Pr(A) ≥ 0 for each A ⊆ S;
2. Pr(S) = 1;
3. Pr(A ∪ B) = Pr(A) + Pr(B) whenever A and B are disjoint subsets of S.

Thus our previous examples show that any set of prices that is not coherent is vulnerable to a Dutch book. In fact, the converse is also true: under the rules of the above game, P_2 cannot make a Dutch book against P_1 if the prices assigned by P_1 are coherent.

Condition (3) implies that a coherent price assignment is finitely additive: if A_1, ..., A_m are disjoint subsets of S, then

  Pr(∪_{i=1}^m A_i) = Σ_{i=1}^m Pr(A_i).

Indeed, this follows by recursion on m, since we can use (3) to peel off the sets A_m, A_{m-1}, ... one by one from the union inside the price function:

  Pr(∪_{i=1}^m A_i) = Pr((∪_{i=1}^{m-1} A_i) ∪ A_m)
                    = Pr(∪_{i=1}^{m-1} A_i) + Pr(A_m)
                    = Pr(∪_{i=1}^{m-2} A_i) + Pr(A_{m-1}) + Pr(A_m)
                    = ...
                    = Σ_{i=1}^m Pr(A_i).

If we identify subjective probabilities with wagers and we accept that rational actors should assign probabilities in such a way that prevents a Dutch book from being made against them, then it follows that subjective probabilities should obey the usual axioms of frequentist probability. However, this argument has several weaknesses/limitations:
- Our arguments in favor of coherence presume that rational individuals should act in such a way as to avoid being made a sure loser. However, there may be situations in which this is not true, e.g., when individuals act altruistically and make sacrifices to benefit others.
- Our arguments also assume that it is possible for the players to eventually know what the true outcome is. In many instances where we wish to apply probabilistic reasoning, such knowledge might never be available.
- We have restricted attention to settings with only finitely many possible outcomes. However, it is often useful to consider probabilities on infinite sets.

Most modern mathematical treatments of probability begin with Kolmogorov's notion of a probability space.

Definition. A probability space is a triple (Ω, F, P) consisting of the following objects:
1. A set Ω called the sample space.
2. A collection F of subsets of Ω, called events, that satisfies the following conditions:
   - Ω ∈ F;
   - if A ∈ F, then Ā ∈ F;
   - if A_1, A_2, ... ∈ F, then ∪_{i=1}^∞ A_i ∈ F.
3. A function P : F → R satisfying the following conditions:
   - P(A) ≥ 0 for every A ∈ F;
   - P(Ω) = 1;
   - P(∪_{i≥1} A_i) = Σ_{i≥1} P(A_i) for any countable collection of disjoint sets A_i ∈ F.

Some terminology:
- A collection F of sets satisfying the conditions listed under (2) is said to be a σ-algebra on Ω. A set in F is said to be measurable, and the pair (Ω, F) is called a measurable space.
- The function P defined in (3) is said to be a probability measure or probability distribution on F. Since P is defined on F, the probability P(A) is only defined if A is a measurable set.
- Probability measures are countably additive. This is a stronger condition than the finite additivity required for coherence.

The axioms of probability imply several useful properties:
- P(Ā) = 1 - P(A);
- P(∅) = 0;
- if A ⊆ B, then P(A) ≤ P(B);
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B);
- the general inclusion-exclusion formula:

  P(∪_{i=1}^n A_i) = Σ_{i=1}^n P(A_i) - Σ_{i<j} P(A_i ∩ A_j) + Σ_{i<j<k} P(A_i ∩ A_j ∩ A_k) - ... + (-1)^{n+1} P(A_1 ∩ A_2 ∩ ... ∩ A_n).

Exercise: Show that these properties follow from the definition.
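
As a quick sanity check, not from the original notes, the two-event case of inclusion-exclusion can be verified by exact enumeration on a fair die; the events A and B below are illustrative choices.

```python
from fractions import Fraction

# A = {even rolls}, B = {rolls >= 4} on a fair six-sided die.
omega = range(1, 7)

def pr(event):
    """Exact probability of an event under the uniform measure on omega."""
    return Fraction(sum(1 for w in omega if event(w)), 6)

in_A = lambda w: w % 2 == 0
in_B = lambda w: w >= 4
lhs = pr(lambda w: in_A(w) or in_B(w))                        # P(A u B)
rhs = pr(in_A) + pr(in_B) - pr(lambda w: in_A(w) and in_B(w)) # P(A)+P(B)-P(A n B)
print(lhs, rhs)   # both 2/3
```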

Conditional Probability

Conditional Probabilities

Suppose that we have specified a probability distribution P on a space Ω and that we learn that an event B ⊆ Ω has occurred. How should we update our beliefs concerning the probabilities of other events A ⊆ Ω? We will answer this question with the help of the game-theoretic approach used above. To do so, we allow the players to place wagers on conditional events. For example, suppose that player P_2 pays Pr(A|B) to place a wager of $1 on the conditional event that A occurs given that B has occurred, denoted A|B. Depending on the outcome, P_2 will incur the following costs and profits:
- If B occurs and A occurs, then P_1 keeps the fee Pr(A|B) but pays $1 to P_2.
- If B occurs but A does not occur, then P_1 keeps the fee Pr(A|B) and pays nothing to P_2.
- If B does not occur, then P_1 returns the fee Pr(A|B) to P_2 and neither player makes a profit.

Suppose that the prices assigned by P_1 to the wagers AB, A|B, and B satisfy the inequality Pr(AB) < Pr(A|B)Pr(B). I claim that P_2 can make a Dutch book against P_1 by purchasing a wager of $1 on AB and selling wagers of $1 on A|B and of $Pr(A|B) on B. The following table summarizes the profits made by P_2:

  outcome   profit on AB    profit on A|B    profit on B               total
  A ∩ B     1 - Pr(AB)      Pr(A|B) - 1      Pr(A|B)Pr(B) - Pr(A|B)    Pr(A|B)Pr(B) - Pr(AB)
  Ā ∩ B     -Pr(AB)         Pr(A|B)          Pr(A|B)Pr(B) - Pr(A|B)    Pr(A|B)Pr(B) - Pr(AB)
  B̄         -Pr(AB)         0                Pr(A|B)Pr(B)              Pr(A|B)Pr(B) - Pr(AB)

Provided that the prices set by P_1 satisfy the inequality given above, it is clear that P_2 will earn a profit no matter which outcome occurs, and so P_2 has created a Dutch book. Similarly, if Pr(AB) > Pr(A|B)Pr(B), then P_2 can create a Dutch book by purchasing wagers on A|B and B and selling a wager on AB.

Thus, to avoid falling victim to a Dutch book, the prices assigned by P_1 to these three events must satisfy the identity Pr(AB) = Pr(A|B)Pr(B). This result motivates the following definition.

Definition. Let A and B be events in a probability space (Ω, F, P) and suppose that P(B) > 0. Then the conditional probability P(A|B) of A given B is defined to be

  P(A|B) = P(AB) / P(B).
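
A tiny worked example, added here rather than taken from the notes, computes a conditional probability exactly from the definition; the die events are illustrative.

```python
from fractions import Fraction

# A = {roll a 6}, B = {roll an even number} on a fair die.
p_B = Fraction(3, 6)    # B = {2, 4, 6}
p_AB = Fraction(1, 6)   # A n B = {6}
print(p_AB / p_B)       # 1/3 = P(A | B), versus the unconditional P(A) = 1/6
```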

Independence

In general, there is no simple relationship between the probabilities P(A), P(B), and P(AB). However, there is an important special case in which the three probabilities are related by a simple identity.

Definition.
1. Events A and B are independent if P(AB) = P(A)P(B).
2. Events A_1, A_2, A_3, ... are independent if every finite collection {A_{i_1}, A_{i_2}, ..., A_{i_n}} of distinct events satisfies the identity

  P(∩_{j=1}^n A_{i_j}) = Π_{j=1}^n P(A_{i_j}).

Colloquially, we say that two events are independent if there is no causal or logical relationship between them, i.e., knowing that B is true does not change the likelihood that A is true. Indeed, if A and B are independent and P(A) > 0 and P(B) > 0, then

  P(A|B) = P(AB)/P(B) = P(A)P(B)/P(B) = P(A)
  P(B|A) = P(AB)/P(A) = P(A)P(B)/P(A) = P(B),

which shows that our formal definition of independence is consistent with this heuristic interpretation. However, the formal definition is slightly broader in that it applies even when one or more of the events has probability 0.

The Law of Total Probability

Suppose that F_1, ..., F_n are mutually exclusive events and that B is an event contained in the union F_1 ∪ ... ∪ F_n. Then

  P(B) = Σ_{i=1}^n P(B ∩ F_i) = Σ_{i=1}^n P(B|F_i)P(F_i).

The latter identity is known as the law of total probability and can often be used to calculate the probabilities of complex events.

Bayes' Formula

Suppose that E and F are two events with P(E) > 0 and P(F) > 0. Since EF = FE, we can express the probability of the intersection EF in two ways:

  P(E|F)P(F) = P(EF) = P(F|E)P(E).

The equality of the first and third expressions in this identity leads to Bayes' formula:

  P(F|E) = P(F) · P(E|F) / P(E).

This is arguably one of the most important identities in all of probability theory and, as the name suggests, it plays a fundamental role in Bayesian statistics.
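
The following sketch, an addition to the notes, applies Bayes' formula and the law of total probability to a stock diagnostic-test example; the prevalence, sensitivity, and false-positive rate are made-up numbers.

```python
# F = {infected}, E = {test positive}; all rates below are illustrative.
p_F = 0.01           # prevalence P(F)
p_E_given_F = 0.99   # sensitivity P(E | F)
p_E_given_Fc = 0.05  # false-positive rate P(E | not F)

# Law of total probability gives P(E); Bayes' formula then gives P(F | E).
p_E = p_E_given_F * p_F + p_E_given_Fc * (1 - p_F)
p_F_given_E = p_F * p_E_given_F / p_E
print(round(p_F_given_E, 3))   # ~0.167: most positives are false positives
```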

Discrete Distributions

Random Variables

Suppose that (Ω, F, P) is a probability space that represents our beliefs concerning the state of some system. In many cases, it will not be possible to directly observe which state ω ∈ Ω the system occupies, but it will be possible to conduct experiments that provide some information about this state. Mathematically, we can model such an experiment by a function

  X : Ω → E

defined on the sample space and taking values in a set E. Since the value of the variable X(ω) depends on the unknown state ω of the system, we say that X is a random variable. If we wish to emphasize the range of X, then we say that X is an E-valued random variable.

Example: Ω might represent the unknown positions and types of all of the molecules in a population of cells at some point in time. We have little chance of determining all of this information, but we could conduct a microarray experiment to determine the population-averaged levels of gene expression across the genome. In this case, we could take E = [0, ∞)^G, where G is the number of genes included in our study, and let X(ω) = (X_1(ω), ..., X_G(ω)) denote the estimated mRNA levels. Of course, we can also think of each of the quantities X_1, X_2, ... as a random variable in its own right, taking values in the space [0, ∞). In other words, we can define many random variables on the same space.

Definition. Suppose that X : Ω → E is a random variable defined on a probability space (Ω, F, P). The distribution of X is the probability distribution P_X on E defined by the following identity:

  P_X(A) ≡ P(X ∈ A) = P(X^{-1}(A)).

Notice that the two expressions on the right-hand side are determined by the probability measure P on Ω, i.e., the distribution of a random variable depends on the probability distribution on the underlying space on which the variable is defined.

Technical aside: To make this definition rigorous, we need to introduce a σ-algebra E on E and require that X^{-1}(A) ∈ F whenever A ∈ E. X is said to be measurable with respect to the two σ-algebras F and E if this condition is satisfied. The distribution of X is then a probability measure on the measurable space (E, E).

Discrete Random Variables

Definition. A random variable X is said to be discrete if X takes values in a set E that is either finite or countably infinite. The probability mass function of a discrete random variable X is the function p_X : E → [0, 1] defined by the formula p_X(e) = P(X = e).

Example: A random variable X with values in the set E = {0, 1} and probability mass function p_X(1) = p, p_X(0) = 1 - p is said to be a Bernoulli random variable with parameter p.

The probability mass function of a discrete random variable completely determines its distribution via the following identity:

  P(X ∈ A) = Σ_{x∈A} p_X(x), A ⊆ E.

In other words, to calculate the probability that X belongs to A, we simply sum the probability mass function over all of the values in A. In particular, if the probability mass function is summed over the entire space, the sum must equal 1:

  Σ_{x∈E} p_X(x) = P(X ∈ E) = 1.

Expectations of Discrete Random Variables

Definition. If X is a discrete random variable with values in a subset of the real numbers, then the expected value or mean of X is the weighted average of these values:

  E[X] = Σ_{x∈E} x p_X(x).

Example: If X is Bernoulli with parameter p, then the expected value of X is

  E[X] = 1 · p + 0 · (1 - p) = p.

Notice that the expected value of a random variable need not be a value that the variable can assume.

Properties of Expectations

The following two properties are often used to compute expected values:
- Linearity:

  E[c_1X_1 + ... + c_nX_n] = Σ_{i=1}^n c_i E[X_i].

- Transformations: if f : E → R is a real-valued function, then

  E[f(X)] = Σ_{x∈E} f(x) p(x).

Caveat: If f is a non-linear function, then in general E[f(X)] ≠ f(E[X]), as the sketch below illustrates.
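
Here is an exact, added example (not from the notes) of the caveat, using the illustrative non-linear function f(x) = x² and a variable uniform on {0, 1, 2}.

```python
from fractions import Fraction

# X uniform on {0, 1, 2}; f(x) = x**2 is non-linear.
pmf = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
e_x = sum(x * p for x, p in pmf.items())       # E[X] = 1
e_fx = sum(x**2 * p for x, p in pmf.items())   # E[X^2] = 5/3
print(e_fx, e_x**2)                            # 5/3 versus 1: E[f(X)] != f(E[X])
```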

Variance

Definition. If X is a discrete random variable with values in a subset of the real numbers, then the variance of X is the weighted average of the squared difference between X and its mean:

  Var(X) ≡ E[(X - E[X])²] = Σ_{x∈E} p_X(x)(x - E[X])².

In practice, it is often more convenient to calculate the variance using the following formula:

  Var(X) = E[X²] - E[X]².

Example: If X ~ Bernoulli(p), then, since X² = X,

  Var(X) = p - p² = p(1 - p).

Definition. X is said to have the binomial distribution with parameters n ≥ 1 and p ∈ [0, 1], written X ~ Binomial(n, p), if X takes values in the set E = {0, 1, ..., n} with probability mass function

  P(X = k) = (n choose k) p^k (1 - p)^{n-k}, k = 0, 1, ..., n.

Furthermore, the mean and the variance of X are:

  E[X] = np, Var(X) = np(1 - p).

Application: Suppose that we perform a series of n independent, identically distributed (i.i.d.) trials, each of which results in a success with probability p or a failure with probability 1 - p. Then the total number of successes in the n trials is a binomial random variable with parameters n and p. In particular, a Bernoulli random variable with parameter p is also a binomial random variable with parameters n = 1 and p.

Definition. X is said to have the geometric distribution with parameter p ∈ [0, 1], written X ~ Geometric(p), if X takes values in the non-negative integers E = {0, 1, ...} with probability mass function

  P(X = k) = (1 - p)^k p, k ≥ 0.

Furthermore, the mean and the variance of X are:

  E[X] = (1 - p)/p, Var(X) = (1 - p)/p².

Application: Suppose that we perform a series of i.i.d. trials, each of which results in a success with probability p. Then the number of failures that occur before we observe the first success is geometrically distributed with parameter p.

Alternate definitions: Some authors define the geometric distribution to be the number of trials required to obtain the first success, in which case X takes values in the set E = {1, 2, ...} and P(X = k) = (1 - p)^{k-1} p.

Definition. X is said to have the negative binomial distribution with parameters r > 0 and p ∈ [0, 1], written X ~ NB(r, p), if X takes values in the non-negative integers E = {0, 1, ...} with probability mass function

  P(X = k) = (r + k - 1 choose k) p^r (1 - p)^k, k ≥ 0.

Furthermore, the mean and the variance of X are:

  E[X] = r(1 - p)/p, Var(X) = r(1 - p)/p².

Application: Suppose that we perform a series of i.i.d. trials, each of which results in a success with probability p. Then the number of failures that occur before the r-th success is a negative binomial random variable with parameters r and p.

Caveat: As with the geometric distribution, some authors define the negative binomial distribution in terms of the number of trials until the r-th success, in which case X takes values in the set E = {r, r + 1, ...}.

The following plot shows the probability mass function for a series of negative binomial distributions, with fixed success probability p = 0.5 and increasing r = 0.2, 1, 5, 10. When r = 1, this distribution reduces to a geometric distribution. However, as r increases, the distribution becomes more symmetric (less skewed) around its mean.

[Figure: negative binomial probability mass functions p(k) for r = 0.2, 1, 5, 10 with p = 0.5; the r = 1 curve coincides with Geometric(0.5).]

Definition. X is said to have the Poisson distribution with parameter λ ≥ 0, written X ~ Poisson(λ), if X takes values in the non-negative integers E = {0, 1, ...} with probability mass function

  P(X = k) = e^{-λ} λ^k / k!, k ≥ 0.

Furthermore, the mean and the variance of X are:

  E[X] = Var(X) = λ.

Application: Suppose that we perform a large number n of i.i.d. trials, each with success probability p = λ/n. Then the total number of successes is approximately Poisson distributed with parameter λ, and this approximation becomes exact in the limit as n → ∞. This is a special case of the Law of Rare Events.
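
A numerical check of the Law of Rare Events, added here as an illustration (the values of λ and k are arbitrary choices), compares Binomial(n, λ/n) probabilities with the Poisson limit:

```python
from math import comb, exp, factorial

# P(X = k) under Binomial(n, lam/n) approaches the Poisson(lam) value as n grows.
lam, k = 3.0, 2
poisson = exp(-lam) * lam**k / factorial(k)
for n in (10, 100, 10_000):
    p = lam / n
    binom = comb(n, k) * p**k * (1 - p) ** (n - k)
    print(n, round(binom, 6), round(poisson, 6))
```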

The Poisson distribution provides a useful approximation for the distribution of many kinds of count data. Some examples include:
- the number of misspelled words on a page of a book;
- the number of misdialed phone numbers in a city on a particular day;
- the number of beta particles emitted by a ^14C source in an hour;
- the global number of major earthquakes that occur in a year;
- the number of sites in a non-coding region that differ between two closely related species. The mean of this distribution is 2LµT, where L is the number of base pairs in the region, µ is the substitution rate per site per year, and T is the divergence time in years.

The Poisson distribution is less suited to modeling count data corresponding to events or individuals that are clumped. In such cases, the data are usually overdispersed, i.e., the variance is greater than the mean, and the negative binomial distribution may provide a better fit.

[Figure: negative binomial probability mass functions p(k) with a common mean for r = 1, 2, 5, ..., compared with a Poisson distribution of the same mean.]

- NB: σ² = µ + µ²/r, so r is an inverse measure of aggregation or dispersion: smaller values of r give a more skewed distribution.
- If we let r → ∞ with p chosen so that the mean is µ, then NB(r, p) → Poisson(µ).
- Examples: numbers of macroparasites per individual and numbers of infections per outbreak are usually modeled with the negative binomial distribution.

Continuous Distributions

Continuous Random Variables

Definition. A real-valued random variable X is said to be continuous if there is a non-negative function p : R → [0, ∞], called the probability density function of X, such that

  P(X ∈ A) = ∫_A p(x) dx

for any (nice) subset A ⊆ R.

Technical aside: A set is nice if it belongs to the Borel σ-algebra on R. This is the smallest σ-algebra on R that contains all open intervals. While we will largely ignore measurability in this course, it is an important concept in fully rigorous treatments of probability theory. See David Williams' book Probability with Martingales (1991).

The probability density function plays the same role for continuous random variables that the probability mass function plays for discrete random variables. Be careful, however, not to confuse the two. The existence of a density function has several consequences:
- Densities integrate to 1:

  ∫_R p(x) dx = P(X ∈ R) = 1.

- Points have zero probability mass:

  P(X = t) = ∫_t^t p(x) dx = 0.

These two identities seem to lead to a paradox. On the one hand, the probability that X is equal to any particular point x is zero for every x. On the other hand, the probability that X is equal to some point, i.e., that X ∈ R, is 1. However, there is no contradiction here, since there are uncountably infinitely many real numbers and we have not required that probability distributions be additive over arbitrary collections of disjoint sets:

  1 = P(X ∈ R) = P(∪_{x∈R} {X = x}) ≠ Σ_{x∈R} P(X = x) = 0.

Definition. If X is a real-valued random variable (not necessarily continuous), the cumulative distribution function of X is the function F : R → [0, 1] defined by F(x) = P(X ≤ x).

If X is continuous, then the density p(x) and the cumulative distribution function F(x) are related in the following way:

  F(x) = ∫_{-∞}^x p(t) dt and p(x) = F'(x).

Furthermore, the density of X at a point x can be estimated by the following formula:

  p(x) ≈ P(x - ɛ < X ≤ x + ɛ) / (2ɛ) = (F(x + ɛ) - F(x - ɛ)) / (2ɛ).

Expectations of Continuous Random Variables

Definition. If X is a continuous random variable with probability density function p(x), then the expected value or mean of X is the weighted average of these values:

  E[X] = ∫_R x p(x) dx.

In general, expectations of continuous random variables behave like expectations of discrete random variables, provided that we replace the sum by an integral and the probability mass function by the probability density function. For example, if f : R → R is a real-valued function and X is as in the definition, then

  E[f(X)] = ∫_R f(x) p(x) dx.

Definition. X is said to have the uniform distribution on (l, u), written X ~ U(l, u), if X takes values in the set E = (l, u) with probability density function

  p(x) = 1/(u - l), x ∈ (l, u).

Furthermore, the mean and the variance of X are:

  E[X] = (l + u)/2, Var(X) = (u - l)²/12.

If l = 0 and u = 1, then X is said to be a standard uniform random variable.

Application: In Monte Carlo simulations, we are usually given a stream of independent standard uniform (pseudo-)random variables, and our problem is to use these to generate a stream of independent random variables with the distribution of interest. One standard recipe is sketched below.
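
A minimal sketch of one such recipe, inverse transform sampling, is added here; it assumes the target CDF F is invertible, so that F^{-1}(U) has CDF F when U is standard uniform. The exponential target below is an illustrative choice.

```python
import math
import random

def exponential_sample(lam, rng=random):
    """Draw Exp(lam) from a standard uniform via the inverse CDF.

    The exponential CDF F(t) = 1 - exp(-lam*t) inverts to
    F^{-1}(u) = -log(1 - u)/lam.
    """
    u = rng.random()                 # standard uniform draw
    return -math.log(1.0 - u) / lam  # inverse CDF applied to u

random.seed(0)
draws = [exponential_sample(2.0) for _ in range(100_000)]
print(sum(draws) / len(draws))       # ~0.5 = 1/lam
```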

Definition. X is said to have the Beta distribution with parameters a, b > 0, written X ~ Beta(a, b), if X takes values in the set E = (0, 1) with probability density function

  p(x) = x^{a-1}(1 - x)^{b-1} / β(a, b), x ∈ (0, 1).

Here, β(a, b) is the Beta function, which is defined by the integral

  β(a, b) = ∫_0^1 x^{a-1}(1 - x)^{b-1} dx.

Furthermore, the mean and the variance of X are:

  E[X] = a/(a + b), Var(X) = ab / ((a + b)²(a + b + 1)).

Remark: If a = b = 1, then the Beta distribution reduces to the standard uniform distribution.

Application: The Beta distribution provides a flexible family of probability distributions for quantities that take values in the interval [0, 1], e.g., proportions or probabilities. Depending on whether the parameters are less than or greater than 1, the Beta density is either bimodal, with maxima at x = 0 and x = 1, or unimodal, with a mode at (a - 1)/(a + b - 2).

[Figure: Beta densities p(x) for a = b = 0.1 (bimodal) and a = b = 2 (unimodal).]

Other applications include:
- order statistics of i.i.d. uniform variables;
- posterior distributions of success probabilities;
- equilibrium frequencies of neutral alleles in panmictic populations.

Definition. X is said to have the exponential distribution with rate parameter λ > 0, written X ~ Exp(λ), if X takes values in the set E = [0, ∞) with probability density function

  p(x) = λe^{-λx}, x ≥ 0.

Furthermore, the mean and the variance of X are

  E[X] = 1/λ, Var(X) = 1/λ²,

while the cumulative distribution function is

  F(t) = P(X ≤ t) = 1 - e^{-λt}.

Application: Exponential random variables are often used to model random times, e.g., the time until a radioactive nucleus decays. Of course, the shape of such a distribution should not depend on the units of measurement, and this is true of the exponential distribution. In particular, if X is exponentially distributed with rate λ and Y = γX, then Y is exponentially distributed with rate λ/γ.

Exponential distributions are special in that they are the only continuous distributions on [0, ∞) which are memoryless, i.e., if X ~ Exp(λ) and t, s > 0, then

  P(X > t + s | X > t) = P(X > t + s)/P(X > t) = e^{-λ(t+s)}/e^{-λt} = e^{-λs} = P(X > s).

In other words, the variable has no memory of having survived until time t: the probability that it dies between times t and t + s is the same as the probability of dying between times 0 and s. Put another way, the rate at which the variable dies is constant in time. Later we will see that this property plays a crucial role in the construction and analysis of continuous-time Markov chains, where the waiting times between successive jumps are exponential.
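
An empirical check of memorylessness, added as an illustration (the values of λ, t, and s are arbitrary):

```python
import random

# Compare P(X > t + s | X > t) with P(X > s) for X ~ Exp(lam).
random.seed(0)
lam, t, s = 1.0, 0.7, 0.5
draws = [random.expovariate(lam) for _ in range(200_000)]
survivors = [x for x in draws if x > t]
cond = sum(x > t + s for x in survivors) / len(survivors)
uncond = sum(x > s for x in draws) / len(draws)
print(round(cond, 3), round(uncond, 3))   # both near exp(-lam*s) ~= 0.607
```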

There is an important connection between the geometric distribution and the exponential distribution. Suppose that for each n ≥ 1, X_n is a geometric random variable with success probability p_n = λ/n. As n increases, the expected value E[X_n] ≈ n/λ diverges. Suppose then that we consider the rescaled random variables Y_n = X_n/n. Each of these variables has mean E[Y_n] ≈ λ^{-1}. Furthermore, if we let n tend to infinity, then

  P(Y_n > t) = P(X_n > nt) = (1 - λ/n)^{nt} → e^{-λt} = P(X > t),

where X is exponentially distributed with rate λ. In other words, we can approximate a geometric random variable with small success probability by an exponential random variable, provided that we change the units in which time is measured.

Definition. X is said to have the Gamma distribution with shape parameter α > 0 and rate parameter λ > 0, written X ~ Gamma(α, λ), if X takes values in the set E = [0, ∞) with probability density function

  p(x) = (λ^α / Γ(α)) x^{α-1} e^{-λx}, x ≥ 0.

Here, Γ(α) is the Gamma function, which is defined for α > 0 by the integral

  Γ(α) = ∫_0^∞ x^{α-1} e^{-x} dx.

Furthermore, the mean and the variance of X are:

  E[X] = α/λ, Var(X) = α/λ².

Remark: Sometimes the Gamma distribution is described in terms of the shape parameter α and a scale parameter θ = 1/λ.

The Gamma function introduced on the preceding slide is an important object in its own right and appears in many settings in mathematics and statistics. Although Γ(α) usually cannot be explicitly evaluated, the following identity, obtained by integration by parts, is often useful:

  Γ(α + 1) = ∫_0^∞ x^α e^{-x} dx = [-x^α e^{-x}]_0^∞ + ∫_0^∞ αx^{α-1} e^{-x} dx = αΓ(α).

In particular, if α = n is an integer, then

  Γ(n + 1) = nΓ(n) = n(n - 1)Γ(n - 1) = ... = n!Γ(1) = n!,

since Γ(1) = 1. Thus the Gamma function can be regarded as a smooth extension of the factorial function to the positive real numbers.

The Gamma distribution is related to the exponential distribution in much the same way that the negative binomial distribution is related to the geometric distribution.

[Figure: Gamma densities with rate 1 for shape parameters α = 0.5, 1, 2, ...; the α = 1 curve coincides with Exponential(1).]

- When α = 1, the Gamma distribution reduces to the exponential distribution.
- If X_1, ..., X_n are independent exponential random variables with rate λ, then their sum X = X_1 + ... + X_n is a Gamma random variable with shape parameter α = n and rate λ. Thus the Gamma distribution is often used to model random lifespans that elapse after a series of independent events.
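
A simulation check of the sum property, added as an illustration (n, λ, and the number of replicates are arbitrary choices):

```python
import random

# The sum of n independent Exp(lam) draws should have the Gamma(n, lam)
# mean n/lam and variance n/lam**2.
random.seed(0)
n, lam, reps = 5, 2.0, 100_000
sums = [sum(random.expovariate(lam) for _ in range(n)) for _ in range(reps)]
mean = sum(sums) / reps
var = sum((s - mean) ** 2 for s in sums) / reps
print(round(mean, 3), round(var, 3))   # ~2.5 and ~1.25
```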

Definition. X is said to have the normal (or Gaussian) distribution with mean µ and variance σ² > 0, written X ~ N(µ, σ²), if X takes values in R with probability density function

  p(x) = (1/(σ√(2π))) e^{-(x-µ)²/(2σ²)}.

In this case, the mean and the variance of X are µ and σ², as implied by the name of the distribution. Furthermore, if µ = 0 and σ² = 1, then X is said to be a standard normal random variable.

Applications: Normal distributions arise as the limiting distribution of a sum of a large number of independent random variables. This is the content of the Central Limit Theorem, which we will discuss in the next lecture. Many quantities encountered in biological systems are approximately normally distributed, including some morphological traits (among individuals in a population) and some protein concentrations (per cell).

Some useful properties of the normal distribution include:
- Sums of independent normal random variables are normal: if X_1, ..., X_n are independent normal random variables and X_i ~ N(µ_i, σ_i²), then their sum X = X_1 + ... + X_n is a normal random variable with mean µ_1 + ... + µ_n and variance σ_1² + ... + σ_n².
- Linear transformations preserve normality: if X ~ N(µ, σ²), then Y = aX + b is normal with mean aµ + b and variance a²σ². In particular, if Z ~ N(0, 1), then X = σZ + µ ~ N(µ, σ²).

[Figure: normal densities p(x) with variances σ² = 0.25, 1, 4, 25.]

Multivariate Distributions

Multivariate Distributions and Random Vectors

Definition. Suppose that X_1, ..., X_n are random variables defined on a common probability space (Ω, F, P) with values in the sets E_1, ..., E_n, respectively. Then the joint distribution of these variables is the distribution of the random vector X = (X_1, ..., X_n) on the set E = E_1 × ... × E_n, i.e.,

  P(X ∈ A) = P((X_1, ..., X_n) ∈ A) for any A ⊆ E.

In this case, the distribution of any one of the variables, say X_i, considered on its own is said to be the marginal distribution of that variable. The joint distribution of a collection of random variables tells us how the variables are related to one another.

Although we can always recover the marginal distributions of a collection of random variables from their joint distribution, the inverse operation usually is not possible without additional information. One case where this is possible is when the random variables are independent.

Definition.
1. The random variables X_1, ..., X_n are said to be independent if

  P(X_1 ∈ E_1, ..., X_n ∈ E_n) = Π_{i=1}^n P(X_i ∈ E_i)

for all sets E_1, ..., E_n such that the events {X_i ∈ E_i} are well-defined.
2. An infinite collection of random variables is said to be independent if every finite sub-collection is independent.

In other words, X and Y are independent if and only if the events {X ∈ E} and {Y ∈ F} are independent for all sets E and F in the ranges of X and Y.

As a general rule, calculations involving multivariate distributions are greatly simplified when the component variables are independent. The following theorem provides an important example of this principle.

Theorem. Suppose that X_1, ..., X_n are independent real-valued random variables and that f_1, ..., f_n are functions from R to R. Then f_1(X_1), ..., f_n(X_n) are independent real-valued random variables and

  E[Π_{i=1}^n f_i(X_i)] = Π_{i=1}^n E[f_i(X_i)]

whenever the expectations on both sides of the identity are defined. In particular, by letting each f_i(x) = x be the identity function, we obtain the following important special case:

  E[Π_{i=1}^n X_i] = Π_{i=1}^n E[X_i].

Of course, it is often the case that variables of interest are not independent, and then we need metrics to quantify the extent to which they depend on each other. One such metric is the covariance.

Definition. Suppose that X and Y are real-valued random variables defined on the same probability space. The covariance of X and Y is the quantity

  Cov(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y].

The covariance between two random variables is a measure of their linear association. In particular, if X and Y are independent, then since the theorem on the previous slide shows that E[XY] = E[X]E[Y], it follows that Cov(X, Y) = 0. However, the converse of this result is not true: the mere fact that two variables have zero covariance does not imply that they are independent. A standard counterexample is sketched below.
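
The following added sketch gives the textbook counterexample: X uniform on {-1, 0, 1} and Y = X² have zero covariance (since E[XY] = E[X³] = 0 = E[X]E[Y]) yet are clearly dependent.

```python
import random

# Zero covariance without independence: Y is a deterministic function of X.
random.seed(0)
xs = [random.choice([-1, 0, 1]) for _ in range(200_000)]
ys = [x**2 for x in xs]
mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
print(round(cov, 3))   # ~0, even though knowing X determines Y exactly
```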

Properties of Covariance

Covariances have a number of useful properties:
- Cov(X, X) = Var(X);
- Cov(X, Y) = Cov(Y, X);
- Cov(aX, bY) = ab Cov(X, Y);
- bilinearity:

  Cov(Σ_{i=1}^n X_i, Σ_{j=1}^m Y_j) = Σ_{i=1}^n Σ_{j=1}^m Cov(X_i, Y_j);

- the Cauchy-Schwarz inequality:

  |Cov(X, Y)| ≤ √(Var(X) Var(Y)).

Exercise: Verify the above properties.

When we work with random vectors containing more than two variables, it is often useful to organize the covariances into a single matrix.

Definition. Suppose that X_1, ..., X_n are real-valued random variables and let σ_ij = Cov(X_i, X_j) denote the covariance of X_i and X_j. Then the variance-covariance matrix of the random vector X = (X_1, ..., X_n) is the n × n matrix Σ with entry σ_ij in the i-th row and j-th column:

  Σ = [ σ_11 σ_12 ... σ_1n
        σ_21 σ_22 ... σ_2n
        ...
        σ_n1 σ_n2 ... σ_nn ].

Because Cov(X_i, X_j) = Cov(X_j, X_i), it is clear that every variance-covariance matrix is symmetric. Furthermore, it can be shown that any such matrix is also non-negative definite, i.e., given any (column) vector β ∈ R^n, we have

  β^T Σ β = Var(Σ_{i=1}^n β_i X_i) ≥ 0.

Definition. Suppose that X_1, ..., X_n are discrete random variables with values in the sets E_1, ..., E_n. Then the random vector X = (X_1, ..., X_n) is itself a discrete random variable, and the joint probability mass function of the variables X_1, ..., X_n is defined by the formula

  p(x_1, ..., x_n) = P{X = (x_1, ..., x_n)} = P{X_1 = x_1, ..., X_n = x_n}.

Furthermore, the marginal probability mass function p_i(x) of the variable X_i can be recovered from the joint p.m.f. by summing over all other variables:

  p_i(y) = Σ_{{x∈E : x_i = y}} p(x_1, ..., x_{i-1}, y, x_{i+1}, ..., x_n).

If the random variables X_1, ..., X_n are independent, then the joint probability mass function is equal to the product of the marginal probability mass functions:

  p(x_1, ..., x_n) = Π_{i=1}^n p_i(x_i),

where p_i(x_i) = P(X_i = x_i) is the probability mass function of X_i.

Definition. Let n ≥ 1 and let p_1, ..., p_k be a collection of non-negative real numbers such that p_1 + ... + p_k = 1. We say that the random vector X = (X_1, ..., X_k) has the multinomial distribution with parameters n and (p_1, ..., p_k) if each of the variables X_i takes values in the set {0, ..., n} and if the joint probability mass function of these variables is given by

  p(n_1, ..., n_k) = (n choose n_1, ..., n_k) p_1^{n_1} ... p_k^{n_k},

provided n_1 + ... + n_k = n; here (n choose n_1, ..., n_k) = n!/(n_1! ... n_k!) is the multinomial coefficient.

Application: As the name suggests, the multinomial distribution generalizes the binomial distribution. Suppose that we conduct n i.i.d. trials and that each trial can result in one of k possible outcomes, which have probabilities p_1, ..., p_k. If X_i denotes the number of trials that result in the i-th outcome, then (X_1, ..., X_k) has the multinomial distribution with parameters n and (p_1, ..., p_k).
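
A minimal added sketch of simulating one multinomial draw by classifying n i.i.d. trials, exactly as in the application above; the function name and the probabilities (0.2, 0.3, 0.5) are illustrative assumptions.

```python
import random
from collections import Counter

def multinomial_draw(n, p, rng=random):
    """One draw of (X_1, ..., X_k): classify n i.i.d. trials into k outcomes."""
    # Cumulative probabilities for inversion sampling of a single trial.
    cum, total = [], 0.0
    for pi in p:
        total += pi
        cum.append(total)
    cum[-1] = 1.0   # guard against floating-point rounding
    tally = Counter()
    for _ in range(n):
        u = rng.random()
        k = next(i for i, c in enumerate(cum) if u <= c)
        tally[k] += 1
    return [tally[i] for i in range(len(p))]

random.seed(0)
print(multinomial_draw(100, [0.2, 0.3, 0.5]))   # counts near [20, 30, 50]
```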

Definition. A random vector X = (X_1, ..., X_n) with values in R^n is said to be continuously distributed if there is a non-negative function p : R^n → [0, ∞] such that

  P(X ∈ A) = ∫_A p(x) dx

for any (Borel measurable) subset A ⊆ R^n. In this case, each of the variables X_i is individually continuous, and the marginal density of X_i can be found by integrating the joint density over all of the other components:

  p_i(y) = ∫ ... ∫ p(x_1, ..., x_{i-1}, y, x_{i+1}, ..., x_n) dx_1 ... dx_{i-1} dx_{i+1} ... dx_n.

In general, a random vector formed from a set of marginally continuous random variables need not be jointly continuous. However, if the variables are also independent, then the random vector will be jointly continuous and the joint density is given by the product of the marginal densities:

  p(x_1, ..., x_n) = Π_{i=1}^n p_i(x_i).

Definition. A continuous random vector X = (X_1, ..., X_n) with values in R^n is said to have the multivariate normal distribution with mean vector µ = (µ_1, ..., µ_n) and n × n variance-covariance matrix Σ if it has joint density function

  p(x) = (2π)^{-n/2} |Σ|^{-1/2} exp{ -(1/2)(x - µ)^T Σ^{-1} (x - µ) }.

Here Σ is a positive-definite symmetric matrix with determinant |Σ| > 0 and inverse Σ^{-1}.

We note that if X is multivariate normal, then every component variable X_i is a normal random variable. Furthermore, if X and Y are independent multivariate normal random vectors with values in R^n, then so is X + Y, i.e., sums of independent multivariate normal random vectors are also multivariate normal.

Limit Theorems for Sums of Random Variables

Sums of Independent Random Variables

Many problems in applied probability and statistics involve sums of independent random variables. For example, in discrete-time branching processes, the size of a population at time t + 1 is equal to the sum of the numbers of offspring born to the adult females alive at time t. In some cases, the distribution of the sum of two independent integer-valued random variables can be calculated with the help of the following result.

Theorem. Suppose that X and Y are independent integer-valued random variables with probability mass functions p_X and p_Y, respectively. Then Z = X + Y is an integer-valued random variable with probability mass function

  p_Z(n) = (p_X * p_Y)(n) ≡ Σ_{k=-∞}^{∞} p_X(k) p_Y(n - k).

Remark: The operation p_X * p_Y is called the discrete convolution of p_X and p_Y.

The proof of this result uses only elementary properties of probability distributions and independence:

  p_Z(n) = P(Z = n) = P(X + Y = n)
         = P(∪_{k=-∞}^{∞} {X = k, Y = n - k})
         = Σ_{k=-∞}^{∞} P(X = k, Y = n - k)
         = Σ_{k=-∞}^{∞} P(X = k)P(Y = n - k)
         = Σ_{k=-∞}^{∞} p_X(k) p_Y(n - k).

By way of example, we can show that the sum of two independent Poisson random variables is Poisson. Suppose that X ~ Poisson(λ) and Y ~ Poisson(µ) are independent, and let Z = X + Y. Then the probability mass function of Z is

  p_Z(n) = Σ_{k=-∞}^{∞} p_X(k) p_Y(n - k)
         = Σ_{k=0}^{n} (e^{-λ} λ^k / k!)(e^{-µ} µ^{n-k} / (n - k)!)
         = e^{-(λ+µ)} (1/n!) Σ_{k=0}^{n} (n! / (k!(n - k)!)) λ^k µ^{n-k}
         = e^{-(λ+µ)} (λ + µ)^n / n!,

which shows that X + Y ~ Poisson(λ + µ). This result will be very useful when we begin to work with Poisson processes.
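
A numerical check of this calculation, added as an illustration (the rates and the value of n are arbitrary), convolves the two mass functions directly:

```python
from math import exp, factorial

def poisson_pmf(rate, k):
    """Poisson probability mass function P(X = k)."""
    return exp(-rate) * rate**k / factorial(k)

# Discrete convolution of Poisson(lam) and Poisson(mu) at n, versus Poisson(lam+mu).
lam, mu, n = 1.5, 2.5, 4
conv = sum(poisson_pmf(lam, k) * poisson_pmf(mu, n - k) for k in range(n + 1))
print(round(conv, 10), round(poisson_pmf(lam + mu, n), 10))   # equal
```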

An analogous result exists for sums of independent continuous random variables.

Theorem. Let X and Y be independent continuous random variables with densities p_X and p_Y, respectively. Then Z = X + Y is a continuous random variable with density

  p_Z(z) = (p_X * p_Y)(z) ≡ ∫_{-∞}^{∞} p_X(t) p_Y(z - t) dt,

where p_X * p_Y is called the convolution integral of p_X and p_Y.

Exercise: Use this theorem to show that the sum of two independent exponential random variables with rate parameter λ is Gamma distributed with shape parameter α = 2 and rate parameter λ.

When summing large numbers of independent random variables, the convolutions described on the preceding slides may not be helpful. In these cases, we instead consider the asymptotic or limiting behavior of such sums as the number of terms tends to infinity. To discuss these results, we need to decide what it means for a sequence of random variables or a sequence of probability distributions to converge to a limit. We first recall the definition of convergence in R^n.

Definition. We say that a sequence {x_n : n ≥ 1} ⊆ R^n of n-dimensional vectors converges to a limit x ∈ R^n if for every ɛ > 0 there exists an integer N_ɛ such that for all n ≥ N_ɛ we have ||x - x_n|| < ɛ, where ||y|| denotes the n-dimensional Euclidean norm.

Technical aside: In this definition, the Euclidean norm on R^n can be replaced by any other norm on R^n, and this will not change whether a sequence is convergent or not.

When we turn to spaces of random variables or probability distributions, there is no longer a single mode of convergence that is useful in all settings. The most important modes of convergence are described in the next definition.

Definition. Suppose that X_1, X_2, ... is a sequence of R^n-valued random variables defined on a probability space, and let X also be an R^n-valued random variable on this space.
1. We say that X_n converges in distribution to X if for every bounded, continuous function f : R^n → R,

  lim_{n→∞} E[f(X_n)] = E[f(X)].

2. We say that X_n converges in probability to X if for every ɛ > 0,

  lim_{n→∞} P{ ||X_n - X|| > ɛ } = 0.

3. We say that X_n converges almost surely to X if

  P{ lim_{n→∞} X_n = X } = 1.

The modes of convergence defined on the preceding slide differ in the sense that there are sequences that converge in one mode but don't converge in the other modes. However, there are some useful relationships between these notions. In particular, the following is true:

Theorem. Let X_n, n ≥ 1, and X be as in the previous definition.
1. If X_n converges to X almost surely, then X_n also converges to X in probability and in distribution.
2. If X_n converges to X in probability, then X_n also converges to X in distribution.

In other words, almost sure convergence implies convergence in probability, which implies convergence in distribution.

With these results in mind, let us return to sums of independent random variables. We first consider the asymptotic behavior of the sample mean of a large number of such variables. Recall that we use the abbreviation i.i.d. to stand for independent and identically distributed.

Theorem (The Weak Law of Large Numbers, WLLN). Suppose that X_1, X_2, ... are i.i.d. real-valued random variables and that E|X_1| < ∞. If µ = E[X_1] and S_n = X_1 + ... + X_n is the sum of the first n variables, then the sequence of sample means S_n/n converges in probability to µ, i.e., for every ɛ > 0,

  lim_{n→∞} P{ |S_n/n - µ| > ɛ } = 0.

In other words, if we conduct a large number of independent, identical trials in which we measure some random quantity, then with high probability the sample mean will be close to the actual mean.

In fact, a stronger result is available.

Theorem (The Strong Law of Large Numbers). Suppose that X_1, X_2, ... are i.i.d. real-valued random variables and that E|X_1| < ∞. If µ = E[X_1] and S_n = X_1 + ... + X_n, then the sequence of sample means S_n/n converges almost surely to µ, i.e.,

  P{ lim_{n→∞} S_n/n = µ } = 1.

The SLLN tells us that as we conduct more and more trials, the sample mean is certain to approach the true mean. The SLLN implies the WLLN, but is a much stronger result.
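
A short added illustration of the law of large numbers (the Exp(1) distribution and sample sizes are arbitrary choices):

```python
import random

# Running sample means of i.i.d. Exp(1) variables settle toward mu = 1.
random.seed(0)
checkpoints, total = [], 0.0
for n in range(1, 100_001):
    total += random.expovariate(1.0)
    if n in (10, 100, 1_000, 10_000, 100_000):
        checkpoints.append((n, round(total / n, 4)))
print(checkpoints)   # sample means drift toward 1.0 as n grows
```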

The WLLN and SLLN are special cases of a more general heuristic, which states that deterministic behavior can emerge in certain kinds of stochastic models that consist of a large number of weakly interacting (or indeed independent) components. For example, it is this heuristic that allows us to use deterministic ODEs or PDEs to model the behavior of chemical reactions when the numbers of molecules are large. Nonetheless, in many instances we are specifically interested in the noise or fluctuations of the system about this deterministic limit. In the context of sums of i.i.d. variables, this is addressed by the Central Limit Theorem.

Theorem (Central Limit Theorem, CLT). Suppose that X_1, X_2, ... are i.i.d. real-valued random variables and that both the mean µ = E[X_1] and the variance σ² = Var(X_1) are finite. If S_n = X_1 + ... + X_n and

  Z_n = (S_n - nµ) / (σ√n),

then the sequence Z_1, Z_2, ... converges in distribution to a standard normal random variable.
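
An added simulation of the CLT (the Bernoulli parameter, n, and replicate count are arbitrary; 1.96 is the standard normal 97.5% quantile):

```python
import random

# Standardized Binomial(n, p) sums should be approximately standard normal.
random.seed(0)
n, p, reps = 1_000, 0.3, 10_000
mu, sigma = p, (p * (1 - p)) ** 0.5
zs = []
for _ in range(reps):
    s = sum(random.random() < p for _ in range(n))   # one Binomial(n, p) draw
    zs.append((s - n * mu) / (sigma * n**0.5))
share = sum(-1.96 <= z <= 1.96 for z in zs) / reps
print(round(share, 3))   # ~0.95, the standard normal mass on [-1.96, 1.96]
```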

The variables Z_1, Z_2, ... introduced in the CLT are said to be standardized in the sense that for every n ≥ 1, E[Z_n] = 0 and Var(Z_n) = 1. In other words, since we already know that the sample means S_n/n converge almost surely to the mean µ, to study the fluctuations of S_n/n around this limit we need to subtract the limit and then amplify the differences S_n/n - µ by a factor that grows rapidly enough to compensate for the fact that this difference is tending to 0:

  Z_n = (√n/σ)(S_n/n - µ).

What the CLT tells us is that no matter what the distribution of the X_i's is (as long as it has finite mean and finite variance), these amplified differences will be approximately normally distributed when n is large. This observation presumably explains why so many quantities are approximately normally distributed: in effect, the microscopic details are lost in the Gaussian limit whenever we consider a macroscopic system in which the components act additively.


More information

1 Probability and Random Variables

1 Probability and Random Variables 1 Probability and Random Variables The models that you have seen thus far are deterministic models. For any time t, there is a unique solution X(t). On the other hand, stochastic models will result in

More information

Advanced topics from statistics

Advanced topics from statistics Advanced topics from statistics Anders Ringgaard Kristensen Advanced Herd Management Slide 1 Outline Covariance and correlation Random vectors and multivariate distributions The multinomial distribution

More information

Topic 2: Review of Probability Theory

Topic 2: Review of Probability Theory CS 8850: Advanced Machine Learning Fall 2017 Topic 2: Review of Probability Theory Instructor: Daniel L. Pimentel-Alarcón c Copyright 2017 2.1 Why Probability? Many (if not all) applications of machine

More information

Lecture 16. Lectures 1-15 Review

Lecture 16. Lectures 1-15 Review 18.440: Lecture 16 Lectures 1-15 Review Scott Sheffield MIT 1 Outline Counting tricks and basic principles of probability Discrete random variables 2 Outline Counting tricks and basic principles of probability

More information

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 8. For any two events E and F, P (E) = P (E F ) + P (E F c ). Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016 Sample space. A sample space consists of a underlying

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

Review of Basic Probability Theory

Review of Basic Probability Theory Review of Basic Probability Theory James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 35 Review of Basic Probability Theory

More information

1 Random Variable: Topics

1 Random Variable: Topics Note: Handouts DO NOT replace the book. In most cases, they only provide a guideline on topics and an intuitive feel. 1 Random Variable: Topics Chap 2, 2.1-2.4 and Chap 3, 3.1-3.3 What is a random variable?

More information

2.1 Elementary probability; random sampling

2.1 Elementary probability; random sampling Chapter 2 Probability Theory Chapter 2 outlines the probability theory necessary to understand this text. It is meant as a refresher for students who need review and as a reference for concepts and theorems

More information

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology

Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology Review (probability, linear algebra) CE-717 : Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Some slides have been adopted from Prof. H.R. Rabiee s and also Prof. R. Gutierrez-Osuna

More information

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014 Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 17-27 Review Scott Sheffield MIT 1 Outline Continuous random variables Problems motivated by coin tossing Random variable properties 2 Outline Continuous random variables Problems

More information

Lecture 17: The Exponential and Some Related Distributions

Lecture 17: The Exponential and Some Related Distributions Lecture 7: The Exponential and Some Related Distributions. Definition Definition: A continuous random variable X is said to have the exponential distribution with parameter if the density of X is e x if

More information

Probability theory basics

Probability theory basics Probability theory basics Michael Franke Basics of probability theory: axiomatic definition, interpretation, joint distributions, marginalization, conditional probability & Bayes rule. Random variables:

More information

Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

More information

EE514A Information Theory I Fall 2013

EE514A Information Theory I Fall 2013 EE514A Information Theory I Fall 2013 K. Mohan, Prof. J. Bilmes University of Washington, Seattle Department of Electrical Engineering Fall Quarter, 2013 http://j.ee.washington.edu/~bilmes/classes/ee514a_fall_2013/

More information

Statistics 1B. Statistics 1B 1 (1 1)

Statistics 1B. Statistics 1B 1 (1 1) 0. Statistics 1B Statistics 1B 1 (1 1) 0. Lecture 1. Introduction and probability review Lecture 1. Introduction and probability review 2 (1 1) 1. Introduction and probability review 1.1. What is Statistics?

More information

Northwestern University Department of Electrical Engineering and Computer Science

Northwestern University Department of Electrical Engineering and Computer Science Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability

More information

Brief Review of Probability

Brief Review of Probability Maura Department of Economics and Finance Università Tor Vergata Outline 1 Distribution Functions Quantiles and Modes of a Distribution 2 Example 3 Example 4 Distributions Outline Distribution Functions

More information

STAT Chapter 5 Continuous Distributions

STAT Chapter 5 Continuous Distributions STAT 270 - Chapter 5 Continuous Distributions June 27, 2012 Shirin Golchi () STAT270 June 27, 2012 1 / 59 Continuous rv s Definition: X is a continuous rv if it takes values in an interval, i.e., range

More information

Probability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27

Probability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27 Probability Review Yutian Li Stanford University January 18, 2018 Yutian Li (Stanford University) Probability Review January 18, 2018 1 / 27 Outline 1 Elements of probability 2 Random variables 3 Multiple

More information

3. Review of Probability and Statistics

3. Review of Probability and Statistics 3. Review of Probability and Statistics ECE 830, Spring 2014 Probabilistic models will be used throughout the course to represent noise, errors, and uncertainty in signal processing problems. This lecture

More information

Probability and random variables. Sept 2018

Probability and random variables. Sept 2018 Probability and random variables Sept 2018 2 The sample space Consider an experiment with an uncertain outcome. The set of all possible outcomes is called the sample space. Example: I toss a coin twice,

More information

3 Continuous Random Variables

3 Continuous Random Variables Jinguo Lian Math437 Notes January 15, 016 3 Continuous Random Variables Remember that discrete random variables can take only a countable number of possible values. On the other hand, a continuous random

More information

1 Review of Probability

1 Review of Probability 1 Review of Probability Random variables are denoted by X, Y, Z, etc. The cumulative distribution function (c.d.f.) of a random variable X is denoted by F (x) = P (X x), < x

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

Lecture 1. ABC of Probability

Lecture 1. ABC of Probability Math 408 - Mathematical Statistics Lecture 1. ABC of Probability January 16, 2013 Konstantin Zuev (USC) Math 408, Lecture 1 January 16, 2013 1 / 9 Agenda Sample Spaces Realizations, Events Axioms of Probability

More information

Intro to Probability. Andrei Barbu

Intro to Probability. Andrei Barbu Intro to Probability Andrei Barbu Some problems Some problems A means to capture uncertainty Some problems A means to capture uncertainty You have data from two sources, are they different? Some problems

More information

MAS113 Introduction to Probability and Statistics. Proofs of theorems

MAS113 Introduction to Probability and Statistics. Proofs of theorems MAS113 Introduction to Probability and Statistics Proofs of theorems Theorem 1 De Morgan s Laws) See MAS110 Theorem 2 M1 By definition, B and A \ B are disjoint, and their union is A So, because m is a

More information

Continuous random variables

Continuous random variables Continuous random variables Continuous r.v. s take an uncountably infinite number of possible values. Examples: Heights of people Weights of apples Diameters of bolts Life lengths of light-bulbs We cannot

More information

Probability Background

Probability Background CS76 Spring 0 Advanced Machine Learning robability Background Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu robability Meure A sample space Ω is the set of all possible outcomes. Elements ω Ω are called sample

More information

Chapter 1. Sets and probability. 1.3 Probability space

Chapter 1. Sets and probability. 1.3 Probability space Random processes - Chapter 1. Sets and probability 1 Random processes Chapter 1. Sets and probability 1.3 Probability space 1.3 Probability space Random processes - Chapter 1. Sets and probability 2 Probability

More information

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued

Chapter 3 sections. SKIP: 3.10 Markov Chains. SKIP: pages Chapter 3 - continued Chapter 3 sections Chapter 3 - continued 3.1 Random Variables and Discrete Distributions 3.2 Continuous Distributions 3.3 The Cumulative Distribution Function 3.4 Bivariate Distributions 3.5 Marginal Distributions

More information

Chapter 5 continued. Chapter 5 sections

Chapter 5 continued. Chapter 5 sections Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative

More information

APM 421 Probability Theory Discrete Random Variables. Jay Taylor Fall Jay Taylor (ASU) Fall / 86

APM 421 Probability Theory Discrete Random Variables. Jay Taylor Fall Jay Taylor (ASU) Fall / 86 APM 421 Probability Theory Discrete Random Variables Jay Taylor Fall 2013 Jay Taylor (ASU) Fall 2013 1 / 86 Outline 1 Motivation 2 Infinite Sets and Cardinality 3 Countable Additivity 4 Discrete Random

More information

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite

More information

Department of Large Animal Sciences. Outline. Slide 2. Department of Large Animal Sciences. Slide 4. Department of Large Animal Sciences

Department of Large Animal Sciences. Outline. Slide 2. Department of Large Animal Sciences. Slide 4. Department of Large Animal Sciences Outline Advanced topics from statistics Anders Ringgaard Kristensen Covariance and correlation Random vectors and multivariate distributions The multinomial distribution The multivariate normal distribution

More information

Things to remember when learning probability distributions:

Things to remember when learning probability distributions: SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions

More information

STAT 302 Introduction to Probability Learning Outcomes. Textbook: A First Course in Probability by Sheldon Ross, 8 th ed.

STAT 302 Introduction to Probability Learning Outcomes. Textbook: A First Course in Probability by Sheldon Ross, 8 th ed. STAT 302 Introduction to Probability Learning Outcomes Textbook: A First Course in Probability by Sheldon Ross, 8 th ed. Chapter 1: Combinatorial Analysis Demonstrate the ability to solve combinatorial

More information

Introduction to Probability

Introduction to Probability LECTURE NOTES Course 6.041-6.431 M.I.T. FALL 2000 Introduction to Probability Dimitri P. Bertsekas and John N. Tsitsiklis Professors of Electrical Engineering and Computer Science Massachusetts Institute

More information

Continuous Probability Spaces

Continuous Probability Spaces Continuous Probability Spaces Ω is not countable. Outcomes can be any real number or part of an interval of R, e.g. heights, weights and lifetimes. Can not assign probabilities to each outcome and add

More information

POISSON PROCESSES 1. THE LAW OF SMALL NUMBERS

POISSON PROCESSES 1. THE LAW OF SMALL NUMBERS POISSON PROCESSES 1. THE LAW OF SMALL NUMBERS 1.1. The Rutherford-Chadwick-Ellis Experiment. About 90 years ago Ernest Rutherford and his collaborators at the Cavendish Laboratory in Cambridge conducted

More information

Probability Review. Gonzalo Mateos

Probability Review. Gonzalo Mateos Probability Review Gonzalo Mateos Dept. of ECE and Goergen Institute for Data Science University of Rochester gmateosb@ece.rochester.edu http://www.ece.rochester.edu/~gmateosb/ September 11, 2018 Introduction

More information

Stat 426 : Homework 1.

Stat 426 : Homework 1. Stat 426 : Homework 1. Moulinath Banerjee University of Michigan Announcement: The homework carries 120 points and contributes 10 points to the total grade. (1) A geometric random variable W takes values

More information

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019 Lecture 10: Probability distributions DANIEL WELLER TUESDAY, FEBRUARY 19, 2019 Agenda What is probability? (again) Describing probabilities (distributions) Understanding probabilities (expectation) Partial

More information

7 Random samples and sampling distributions

7 Random samples and sampling distributions 7 Random samples and sampling distributions 7.1 Introduction - random samples We will use the term experiment in a very general way to refer to some process, procedure or natural phenomena that produces

More information

Lecture notes for Part A Probability

Lecture notes for Part A Probability Lecture notes for Part A Probability Notes written by James Martin, updated by Matthias Winkel Oxford, Michaelmas Term 017 winkel@stats.ox.ac.uk Version of 5 September 017 1 Review: probability spaces,

More information

Lecture 1: August 28

Lecture 1: August 28 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random

More information

Lectures on Statistical Data Analysis

Lectures on Statistical Data Analysis Lectures on Statistical Data Analysis London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk

More information

Single Maths B: Introduction to Probability

Single Maths B: Introduction to Probability Single Maths B: Introduction to Probability Overview Lecturer Email Office Homework Webpage Dr Jonathan Cumming j.a.cumming@durham.ac.uk CM233 None! http://maths.dur.ac.uk/stats/people/jac/singleb/ 1 Introduction

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information

CSE 312 Final Review: Section AA

CSE 312 Final Review: Section AA CSE 312 TAs December 8, 2011 General Information General Information Comprehensive Midterm General Information Comprehensive Midterm Heavily weighted toward material after the midterm Pre-Midterm Material

More information

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr.

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr. Topic 2: Probability & Distributions ECO220Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit utoronto.ca November 21, 2017 Dr. Nick

More information

MATH 3510: PROBABILITY AND STATS June 15, 2011 MIDTERM EXAM

MATH 3510: PROBABILITY AND STATS June 15, 2011 MIDTERM EXAM MATH 3510: PROBABILITY AND STATS June 15, 2011 MIDTERM EXAM YOUR NAME: KEY: Answers in Blue Show all your work. Answers out of the blue and without any supporting work may receive no credit even if they

More information

The Wright-Fisher Model and Genetic Drift

The Wright-Fisher Model and Genetic Drift The Wright-Fisher Model and Genetic Drift January 22, 2015 1 1 Hardy-Weinberg Equilibrium Our goal is to understand the dynamics of allele and genotype frequencies in an infinite, randomlymating population

More information

Refresher on Discrete Probability

Refresher on Discrete Probability Refresher on Discrete Probability STAT 27725/CMSC 25400: Machine Learning Shubhendu Trivedi University of Chicago October 2015 Background Things you should have seen before Events, Event Spaces Probability

More information

Mathematical Statistics 1 Math A 6330

Mathematical Statistics 1 Math A 6330 Mathematical Statistics 1 Math A 6330 Chapter 3 Common Families of Distributions Mohamed I. Riffi Department of Mathematics Islamic University of Gaza September 28, 2015 Outline 1 Subjects of Lecture 04

More information

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions

Introduction to Statistical Data Analysis Lecture 3: Probability Distributions Introduction to Statistical Data Analysis Lecture 3: Probability Distributions James V. Lambers Department of Mathematics The University of Southern Mississippi James V. Lambers Statistical Data Analysis

More information

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A )

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A ) 6. Brownian Motion. stochastic process can be thought of in one of many equivalent ways. We can begin with an underlying probability space (Ω, Σ, P) and a real valued stochastic process can be defined

More information

MATH 556: PROBABILITY PRIMER

MATH 556: PROBABILITY PRIMER MATH 6: PROBABILITY PRIMER 1 DEFINITIONS, TERMINOLOGY, NOTATION 1.1 EVENTS AND THE SAMPLE SPACE Definition 1.1 An experiment is a one-off or repeatable process or procedure for which (a there is a well-defined

More information

Bandits, Experts, and Games

Bandits, Experts, and Games Bandits, Experts, and Games CMSC 858G Fall 2016 University of Maryland Intro to Probability* Alex Slivkins Microsoft Research NYC * Many of the slides adopted from Ron Jin and Mohammad Hajiaghayi Outline

More information

THE QUEEN S UNIVERSITY OF BELFAST

THE QUEEN S UNIVERSITY OF BELFAST THE QUEEN S UNIVERSITY OF BELFAST 0SOR20 Level 2 Examination Statistics and Operational Research 20 Probability and Distribution Theory Wednesday 4 August 2002 2.30 pm 5.30 pm Examiners { Professor R M

More information

Introduction to Machine Learning

Introduction to Machine Learning What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes

More information

Class 26: review for final exam 18.05, Spring 2014

Class 26: review for final exam 18.05, Spring 2014 Probability Class 26: review for final eam 8.05, Spring 204 Counting Sets Inclusion-eclusion principle Rule of product (multiplication rule) Permutation and combinations Basics Outcome, sample space, event

More information

CME 106: Review Probability theory

CME 106: Review Probability theory : Probability theory Sven Schmit April 3, 2015 1 Overview In the first half of the course, we covered topics from probability theory. The difference between statistics and probability theory is the following:

More information

We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events.

We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events. 1 Probability 1.1 Probability spaces We will briefly look at the definition of a probability space, probability measures, conditional probability and independence of probability events. Definition 1.1.

More information