Lectures for APM 541: Stochastic Modeling in Biology. Jay Taylor


November 3, 2011

Contents

1 Distributions, Expectations, and Random Variables
    Probability Spaces
    Conditional Probabilities
    Discrete Random Variables
    Continuous Random Variables
    Multivariate Distributions
    Sums of Independent Random Variables
2 Approximation and Limit Theorems in Probability
    Convergent Sequences and Approximation
    Modes of Convergence of Random Variables
    Laws of Large Numbers
    The Central Limit Theorem
    The Law of Rare Events
3 Random Number Generation
    Pseudorandom Number Generators
    The Inversion Method
    Rejection Sampling
    Simulating Discrete Random Variables
4 Discrete-time Markov Chains
    Definitions and Properties
    Asymptotic Behavior of Markov Chains
    Class Structure
    Hitting Times and Absorption Probabilities
    Stationary Distributions

5 Biological Applications of Markov Chains
    The Wright-Fisher Model and its Relatives
    Cannings Models
    Galton-Watson Processes
    Chain Epidemic Models
    Epidemics with Household and Community Transmission
6 Continuous-time Markov Chains
    Definitions and Properties
    Kolmogorov Equations
    Gillespie's Algorithm and the Jump Chain
    Stationary Distributions
    Time Reversal
    Poisson Processes and Measures
7 Diffusions and Stochastic Calculus
    Brownian Motion
    The Invariance Principle
    Diffusion Approximations for CTMCs via the Heat Equation
    Properties of standard Brownian motion
    Diffusion Processes
    Diffusion Approximations
    Technical Interlude: Generators and Martingales
    Martingales

Chapter 1

Distributions, Expectations, and Random Variables

1.1 Probability Spaces

We can think about probability in two ways:

Frequentist interpretation: The probability of an event is the limiting frequency with which the event occurs when we conduct an infinite series of identical but independent trials.

Subjective interpretation: The probability of an event measures the strength of our subjective belief that the event will occur in one trial.

There has been much argument, especially amongst statisticians, as to which of these interpretations is correct. I tend to take a fairly pragmatic view of things and switch between these perspectives as best suits the problem that I am working on. However, we can at least write down a formal (and pretty much universally accepted) mathematical definition of probability.

Definition 1.1. A probability space is a triple (Ω, F, P) where:

Ω is the sample space, i.e., the set of all possible outcomes.

F is a collection of subsets of Ω which we call events. F is called a σ-algebra and is required to satisfy the following conditions:

1. The empty set and the sample space are both events: ∅, Ω ∈ F.
2. If E is an event, then its complement E^c = Ω \ E is also an event.
3. If E_1, E_2, ... are events, then their union ∪_n E_n is an event.

P is a function from F into [0, 1]: if E is an event, then P(E) is the probability of E. P is said to be a probability distribution or probability measure on F and is also required to satisfy several conditions:

1. P(∅) = 0; P(Ω) = 1.
2. Countable additivity: If E_1, E_2, ... are mutually exclusive events, i.e., E_i ∩ E_j = ∅ whenever i ≠ j, then

    P(∪_{n=1}^∞ E_n) = Σ_{n=1}^∞ P(E_n).

If you want to read or publish articles on mathematical probability and statistics, then you will need to come to grips with this definition. David Williams' little book, Probability with Martingales, provides an excellent introduction to this theory. In this course, we will usually be very informal and ignore the role played by the σ-algebra F. However, the properties described in the third part of the definition are both useful and intuitive:

P(∅) = 0 means that the probability that nothing (whatsoever) happens is zero.

P(Ω) = 1 means that the probability that something (whatever it is) happens is one.

If E_1 and E_2 are mutually exclusive events, then E_1 ∪ E_2 is the event that either E_1 or E_2 happens, and the probability of that is just the sum of the probability that E_1 happens and the probability that E_2 happens:

    P(E_1 ∪ E_2) = P(E_1) + P(E_2).

Countable additivity says that this property holds when we have a countable collection of disjoint events.

The following lemma lists some other useful properties that can be deduced from Definition 1.1.

Lemma 1.1. The following properties hold for any two events A, B in a probability space:

1. P(A^c) = 1 − P(A).
2. If A and B are mutually exclusive, then P(A ∩ B) = 0.
3. For any two events A and B (not necessarily mutually exclusive), we have:

    P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Exercise 1.1. Prove Lemma 1.1.

1.2 Conditional Probabilities

It is often the case that we have some partial information about the outcome of an experiment or the state of an unknown system. Our next definition shows how we should modify our beliefs about the unobserved outcome given this additional information:

Definition 1.2. Suppose that A and B are events and that P(B) > 0.
Then the conditional probability that A occurs given that B occurs is

    P(A | B) = P(A ∩ B) / P(B).

In frequentist terms, we can think of the conditional probability P(A | B) as the fraction of trials resulting in both A and B divided by the fraction of trials resulting in B. In general, P(A | B) ≠ P(A), in which case we say that B contains some information about A, i.e., knowing whether B does or does not occur gives us some information about whether A does or does not occur. On the other hand, if P(A | B) = P(A), then B gives us no information about A. This important scenario motivates the next definition.
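The frequentist reading of this definition also suggests a quick simulation check. The sketch below (in Python; the dice events are a made-up example of ours, not from the text) estimates P(A | B) as the fraction of trials in which both A and B occur divided by the fraction in which B occurs:

```python
import random

def estimate_conditional(n_trials=100_000, seed=1):
    """Monte Carlo estimate of P(A | B) as (# trials with A and B) / (# trials with B),
    where (hypothetically) A = {two dice sum to 8} and B = {the first die is even}."""
    rng = random.Random(seed)
    n_b = n_ab = 0
    for _ in range(n_trials):
        d1, d2 = rng.randint(1, 6), rng.randint(1, 6)
        if d1 % 2 == 0:          # event B occurred
            n_b += 1
            if d1 + d2 == 8:     # event A occurred as well
                n_ab += 1
    return n_ab / n_b

# Exact answer for comparison: given d1 in {2, 4, 6}, d2 must be 6, 4, 2,
# so P(A | B) = 3/18 = 1/6
print(estimate_conditional())
```

The estimate should fluctuate around 1/6, with the fluctuations shrinking as the number of trials grows.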

Definition 1.3. (Independent Events)

1. Two events A and B are said to be independent if P(A ∩ B) = P(A) P(B).
2. A countable collection of events E_1, E_2, ... is said to be independent if for every finite subcollection E_{i_1}, ..., E_{i_n} we have

    P(E_{i_1} ∩ ... ∩ E_{i_n}) = P(E_{i_1}) ... P(E_{i_n}).

Example 1.1. Three events A, B, and C are independent if all of the following identities hold:

    P(A ∩ B ∩ C) = P(A) P(B) P(C)
    P(A ∩ B) = P(A) P(B)
    P(A ∩ C) = P(A) P(C)
    P(B ∩ C) = P(B) P(C)

Theorem 1.1. If A and B are independent and P(B) > 0, then

    P(A | B) = P(A ∩ B) / P(B) = P(A) P(B) / P(B) = P(A).

In other words, if A and B are independent, then, as we would expect, B gives us no information about A.

Notice that the expression for conditional probability stated in Definition 1.2 can be rearranged to give

    P(A ∩ B) = P(A | B) P(B),

i.e., the probability that both A and B occur is equal to the conditional probability that A occurs given that B occurs times the probability that B occurs. Notice that, by symmetry, we also have

    P(A ∩ B) = P(B | A) P(A).

Although elementary, these simple algebraic manipulations lead to two of the most useful formulas in probability. The Law of Total Probability is important because it can often be used to compute the probability of a complicated event by conditioning on additional information. We will see many examples of this procedure throughout the course. Bayes' formula is important, of course, because it forms the foundation of Bayesian statistics, which we will also discuss at length in this course.

Theorem 1.2. (Law of Total Probability) If A is an event and B_1, ..., B_n is a collection of disjoint events such that A ⊂ B_1 ∪ ... ∪ B_n, then

    P(A) = P(A ∩ B_1) + ... + P(A ∩ B_n)
         = P(A | B_1) P(B_1) + ... + P(A | B_n) P(B_n).

Theorem 1.3. (Bayes' formula) If A and B are events with P(A) > 0 and P(B) > 0, then

    P(A | B) = P(B | A) P(A) / P(B).
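The two theorems work together in the classic diagnostic-testing calculation: the Law of Total Probability expands the denominator P(B) in Bayes' formula. A minimal sketch, with hypothetical numbers chosen only for illustration:

```python
def bayes_posterior(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' formula; the denominator
    P(positive) is expanded with the Law of Total Probability over the
    disjoint events {disease} and {no disease}."""
    p_positive = (sensitivity * prior
                  + (1.0 - specificity) * (1.0 - prior))  # Theorem 1.2
    return sensitivity * prior / p_positive               # Theorem 1.3

# Hypothetical numbers: 1% prevalence, 99% sensitivity, 95% specificity.
# The posterior works out to 1/6: most positive tests are false positives.
print(bayes_posterior(0.01, 0.99, 0.95))
```

The surprisingly small posterior illustrates why conditioning on the right event matters when the prior probability is low.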

1.3 Discrete Random Variables

In practice, we are often unable to directly observe the state of the systems that we study in biology and instead must make do with indirect information provided by experiments. One way to model this situation mathematically is by identifying the probability space (Ω, F, P) with the true but unknown state of the system of interest and then introducing random variables that represent the outcomes of the experiments that we perform on that system. For example, if we perform just one experiment and if the set of possible outcomes is denoted E, then we would define a random variable X which is a function from Ω into E. Thus, if the state of the system is ω, then the result of our experiment will be the value X(ω).

To be more concrete, suppose that we choose a saguaro cactus at random from Picacho Peak State Park and we then measure its height. In this case, the probability space could encode all of the processes (e.g., climatic and ecological) influencing the heights of the saguaros in the park, as well as those influencing our sampling of an individual, while the random variable X will denote just the height of that individual, which will be some value in the set E = [0, ∞).

Remark 1.1. As promised, we are skirting over many formalities that are important if we want to prove theorems about random variables. In particular, to define random variables rigorously, we need to attach some additional structure to the set E and then require that the function X is measurable. For our purposes we can ignore these technical issues, but see Chapter 3 in Williams (1991) for the details.

Definition 1.4. Suppose that (Ω, F, P) is a probability space and that X is a random variable that takes values in the set E. Then the distribution of X is the probability distribution µ defined on E by the formula

    µ(A) ≡ P(X ∈ A) ≡ P({ω ∈ Ω : X(ω) ∈ A}).
Here A is a subset of E, i.e., A is a collection of possible outcomes for our experiment, whereas the set {ω ∈ Ω : X(ω) ∈ A} is a subset of Ω.

Remark 1.2. Much of the time we will simply ignore the underlying probability space and restrict our attention to the distributions of the random variables defined on that space. In particular, we will usually just write X rather than X(ω), even when we have a particular value of X in mind. On the other hand, we will often be content to use the notation P(X ∈ A) rather than explicitly introduce the probability measure µ as we did in Definition 1.4. With practice, this shorthand will become very natural.

Discrete random variables provide an important special case of these concepts.

Definition 1.5.

1. A random variable X is said to be discrete if it takes values in a set E that is either finite or contains countably infinitely many points (e.g., the integers).
2. If E = {x_1, x_2, ...}, then the probability mass function of X is the function p : E → [0, 1] defined by the formula

    p(x_i) = P(X = x_i).

The probability mass function of a discrete random variable completely determines its distribution:

    P(X ∈ A) = Σ_{x_i ∈ A} p(x_i).

In other words, to calculate the probability that X takes a value in a set A ⊂ E, we simply need to sum the probability mass function of X over all of the points that belong to A. Notice that this implies that

    Σ_{x_i ∈ E} p(x_i) = P(X ∈ E) = 1,

since E is defined to be the set of all possible values that X can take.

Definition 1.6. If X is a discrete random variable that takes values in a subset of the real numbers, then the expected value of X is defined to be the weighted average of these values

    EX = Σ_i x_i p(x_i).

Remark 1.3. The expected value of a random variable is also called its expectation or its mean and is sometimes written as E[X] for clarity. In some respects, the name expected value is misleading, since EX could well be a value that X never takes. For example, if E = {0, 1} and P(X = 0) = P(X = 1) = 1/2, then

    EX = (1/2)·0 + (1/2)·1 = 1/2,

even though X is never equal to 1/2.

An important property of expectations is that they are linear:

Theorem 1.4. (Linearity) Suppose that X and Y are discrete random variables and that a and b are real numbers. Then

    E[aX + bY] = a EX + b EY.

The next theorem describes another important property of expectations that is sometimes incorrectly stated as a definition, hence the tongue-in-cheek name:

Theorem 1.5. (The Law of the Unconscious Statistician) If X is a discrete random variable with values in a set E and f : E → R is a real-valued function, then f(X) is a discrete random variable and

    E[f(X)] = Σ_i f(x_i) p(x_i).

Exercise 1.2. If you want a challenge, try to prove Theorems 1.4 and 1.5.

Definition 1.7. The variance of a discrete real-valued random variable X is defined as

    Var(X) = E[(X − EX)^2].

Exercise 1.3. Use Theorems 1.4 and 1.5 to show that Var(X) = E[X^2] − (EX)^2.
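Definition 1.6 and Exercise 1.3 together give a recipe for computing means and variances directly from a probability mass function. A minimal sketch (the dictionary representation of a pmf is our own convention, not the text's):

```python
def mean_var(pmf):
    """Mean and variance of a discrete random variable whose pmf is given
    as a dict {value: probability}; uses Var(X) = E[X^2] - (EX)^2 from
    Exercise 1.3."""
    ex = sum(p * x for x, p in pmf.items())
    ex2 = sum(p * x * x for x, p in pmf.items())
    return ex, ex2 - ex * ex

# The coin-flip variable of Remark 1.3: EX = 1/2 although X never equals 1/2
print(mean_var({0: 0.5, 1: 0.5}))  # (0.5, 0.25)
```

Note how the variance formula of Exercise 1.3 avoids a second pass over the support to subtract the mean from each value.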

The next four examples describe some of the more important discrete distributions that we will encounter this semester. In each case, E will denote the set of possible values of the random variable and p(x) will denote its probability mass function.

Example 1.2. X is said to have a Bernoulli distribution with parameter p if E = {0, 1} and

    P(X = 1) = p;  P(X = 0) = 1 − p.

In this case the mean and variance of X are given by

    EX = p;  Var(X) = p(1 − p).

Bernoulli random variables are the simplest non-constant random variables and are often used to represent the success (1) or failure (0) of a random trial.

Example 1.3. X is said to have a binomial distribution with parameters n and p if X takes values in the set E = {0, 1, ..., n} with probability mass function

    P(X = k) = (n choose k) p^k (1 − p)^{n−k}.

Recall that the binomial coefficient that appears in this definition is equal to

    (n choose k) = n! / (k!(n − k)!),

where n! = n(n − 1)(n − 2) ··· 1, and counts the number of ways of choosing a subset of k objects from a collection of n objects. The mean and variance of X are given by

    EX = np;  Var(X) = np(1 − p).

Binomial distributions often arise when we carry out n independent but identical trials, each having probability p of success, and we count the total number of successes.

Exercise 1.4. Suppose that X_1, ..., X_n are independent, identically-distributed (abbreviated i.i.d.) Bernoulli random variables with parameter p, and let X = X_1 + ... + X_n. Show that X is a binomial random variable with parameters n and p.

Example 1.4. X is said to have a geometric distribution with parameter p if X takes values in the non-negative integers E = {0, 1, ...} with probability mass function

    P(X = k) = (1 − p)^k p.

The mean and variance of X are given by

    EX = (1 − p)/p;  Var(X) = (1 − p)/p^2.

Geometric distributions also arise when we carry out independent but identical trials. Let X_1, X_2, ... be an infinite collection of i.i.d.
Bernoulli random variables, each with parameter p, and define X to be the number of failures that occur until the first success. Then X is a geometric random variable with parameter p.
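Exercise 1.4 can also be checked empirically. The sketch below (the parameter values and seed are arbitrary choices of ours) builds binomial draws as sums of Bernoulli trials and compares the sample mean and variance with np and np(1 − p):

```python
import random

def binomial_from_bernoullis(n, p, rng):
    """One binomial draw built as the sum of n i.i.d. Bernoulli(p) trials,
    as in Exercise 1.4."""
    return sum(1 if rng.random() < p else 0 for _ in range(n))

rng = random.Random(2)
samples = [binomial_from_bernoullis(10, 0.3, rng) for _ in range(20_000)]
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(mean, var)  # should lie near EX = np = 3.0 and Var(X) = np(1-p) = 2.1
```

Agreement improves as the number of replicate draws grows, a preview of the Law of Large Numbers discussed in the next chapter.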

Example 1.5. X is said to have a Poisson distribution with parameter λ if X takes values in the non-negative integers E = {0, 1, ...} with probability mass function

    P(X = k) = e^{−λ} λ^k / k!.

The mean and variance of X are given by:

    EX = λ;  Var(X) = λ.

Poisson distributions often arise in situations where a large number of independent trials are carried out and the probability of success of any one trial is small. We will discuss this in the next lecture when we consider the Law of Rare Events.

1.4 Continuous Random Variables

Some of the variables that we will be interested in take values in sets that are continuous, e.g., the height (in cm) of a randomly sampled individual could be regarded as a random variable that can assume any value between 0 and 300. In this case, the probability mass function is zero at every point, and we need to describe the distribution of the random variable in a different way.

Definition 1.8. A real-valued random variable X is said to be continuous if there is a non-negative function p(x), called the probability density function of X, such that

    P(X ∈ A) = ∫_A p(x) dx,

where A is a subset of R. In particular, by taking A = R, we see that

    ∫_R p(x) dx = P(X ∈ R) = 1,

i.e., the density must integrate to 1 over the whole real line.

Remark 1.4. If X is any real-valued random variable (not necessarily continuous), then the distribution of X is completely determined by its cumulative distribution function (often abbreviated c.d.f.)

    F(x) = P(X ≤ x).

Notice that F(x) is an increasing function of x, i.e., if x < y, then F(x) ≤ F(y). Also,

    lim_{x → −∞} F(x) = 0;  lim_{x → ∞} F(x) = 1.

If X is also continuous, then the cumulative distribution function F(x) and the density function p(x) are related in the following way:

    F(x) = ∫_{−∞}^x p(y) dy  and  p(x) = F′(x),

i.e., the density is just the derivative of the cumulative distribution function.
Furthermore, we can estimate the density of X at a value x using the approximate formula

    p(x) ≈ P(x − ε < X ≤ x + ε) / (2ε) = (F(x + ε) − F(x − ε)) / (2ε),

where ε > 0 is any small positive number.
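This centered-difference formula is easy to test numerically. The sketch below uses an exponential c.d.f. F(x) = 1 − e^{−λx} with an arbitrarily chosen rate λ = 2 and compares the estimate with the exact density λe^{−λx}:

```python
import math

def exponential_cdf(x, lam=2.0):
    """F(x) = 1 - exp(-lam * x) for x >= 0; the rate lam = 2 is an
    arbitrary illustrative choice."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def density_from_cdf(cdf, x, eps=1e-5):
    """Centered-difference estimate p(x) ~ (F(x + eps) - F(x - eps)) / (2 eps)."""
    return (cdf(x + eps) - cdf(x - eps)) / (2.0 * eps)

approx = density_from_cdf(exponential_cdf, 1.0)
exact = 2.0 * math.exp(-2.0)  # p(x) = lam * exp(-lam * x) at x = 1, lam = 2
print(approx, exact)
```

The error of the centered difference shrinks like ε², so even a modest ε gives several digits of agreement here.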

Remark 1.5. An important distinction between probabilities and probability densities is that whereas the probability of any event is a number between 0 and 1, the probability density p(x) may be greater than one (in fact, it can be infinite).

In general, many of the definitions and results that hold for discrete random variables also hold for continuous random variables, provided that we replace the probability mass function by the probability density function and we replace sums by integrals.

Definition 1.9. If X is a continuous random variable with density p(x), then the expected value of X is the weighted average

    EX = ∫_{−∞}^∞ x p(x) dx.

Also, as in the discrete case, the variance of X is defined to be Var(X) = E[(X − EX)^2].

Theorem 1.6. Suppose that X and Y are continuous random variables and that a and b are real numbers. Then

    E[aX + bY] = a EX + b EY.

Theorem 1.7. (The Law of the Unconscious Statistician) If X is a continuous real-valued random variable and f : R → R is a real-valued function, then

    E[f(X)] = ∫_{−∞}^∞ f(x) p(x) dx.

Some important classes of continuous random variables are described below.

Example 1.6. X is said to be uniformly distributed on the interval [a, b] if it has density

    p(x) = 1/(b − a) if x ∈ [a, b];  0 if x < a or x > b.

In this case, the mean and variance of X are given by

    EX = (a + b)/2;  Var(X) = (b − a)^2 / 12.

In addition, if [a, b] = [0, 1], then X is said to be a standard uniform random variable.

Example 1.7. X is said to have the exponential distribution with rate parameter λ > 0 if it has density

    p(x) = λ e^{−λx} if x ≥ 0;  0 if x < 0.

In this case, the mean and variance of X are given by

    EX = 1/λ;  Var(X) = 1/λ^2,

and so the mean is equal to the reciprocal of the rate. We can also explicitly calculate the cumulative distribution function of X: if t ≥ 0, then

    P(X ≤ t) = ∫_0^t λ e^{−λx} dx = 1 − e^{−λt}.

Exponential random variables are often used to model the times between random events when the rates at which these events occur do not change over time. For example, if we assume that the mutation rate at a particular site in the genome of a species of interest is constant, then the time between successive mutations at that site will be exponentially distributed. This follows from the fact that every exponential distribution is memoryless, i.e.,

    P(X > t + s | X > t) = P(X > t + s) / P(X > t) = e^{−λ(t+s)} / e^{−λt} = e^{−λs} = P(X > s).

In other words, if we think of X as the lifespan of an individual (measured in years, say), then this equation says that the conditional probability that the individual will survive for another s years given that they have already survived for t years is the same as the unconditional probability that they will survive for at least s years. It is as if, upon surviving for t years, the clock governing their lifespan begins anew with the same exponential distribution.

Curiously, this property also characterizes exponential distributions: if X is a random variable and the identity

    P(X > t + s | X > t) = P(X > s)

holds for all real numbers s, t > 0, then it can be shown that X is an exponential random variable. This observation will play a central role in our study of continuous-time Markov chains later in the semester.

Example 1.8. X is said to have the gamma distribution with shape parameter α > 0 and scale parameter θ > 0 if it has density

    p(x) = x^{α−1} e^{−x/θ} / (Γ(α) θ^α) if x ≥ 0;  0 if x < 0,

where Γ is the so-called gamma function, defined by

    Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx.

In this case, the mean and variance of X are given by

    EX = αθ;  Var(X) = αθ^2.

If α = 1, then X is an exponential random variable with rate parameter θ^{−1}.
If X_1, ..., X_n are independent exponentially distributed random variables, each with rate parameter λ, then their sum X = X_1 + ... + X_n is a gamma random variable with shape parameter α = n and scale parameter θ = λ^{−1}. Thus, gamma random variables are often used to model the durations of processes that last until a series of independent events has occurred, e.g., the time to oncogenic transformation of a cell in which n independent mutations must occur to compromise regulation of the cell cycle.
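The representation of a gamma variable as a sum of independent exponential waiting times can be checked by simulation. In this sketch (using Python's `random.expovariate`; the parameters and seed are our own choices) the sample mean of the simulated sums is compared with αθ:

```python
import random

def gamma_from_exponentials(n, lam, rng):
    """Sum of n independent Exponential(lam) waiting times: a gamma
    variable with shape alpha = n and scale theta = 1/lam."""
    return sum(rng.expovariate(lam) for _ in range(n))

rng = random.Random(3)
samples = [gamma_from_exponentials(5, 2.0, rng) for _ in range(20_000)]
mean = sum(samples) / len(samples)
print(mean)  # should lie near EX = alpha * theta = 5 * 0.5 = 2.5
```

The same construction underlies the "time until the n-th event" interpretation used in the oncogenesis example above.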

Example 1.9. X is said to have the normal distribution (also called the Gaussian distribution) with mean µ and variance σ^2 > 0 if it has density:

    p(x) = (1/√(2πσ^2)) e^{−(x−µ)^2 / (2σ^2)}.

If, in addition, µ = 0 and σ^2 = 1, then X is said to be a standard normal random variable. One useful property of normal random variables is that if Z is a standard normal random variable, then the variable X = µ + σZ is normally distributed with mean µ and variance σ^2. Normal distributions are ubiquitous in nature, which is why they are called normal and why so much statistical machinery has been developed under the assumption that the data being analyzed are normally distributed. In the next lecture, we will see that this is at least partly explained by the Central Limit Theorem.

1.5 Multivariate Distributions

Suppose that we have carried out several experiments on a system of interest and that we let X_i be a random variable that represents the outcome of the i-th experiment. In this situation, it may be useful to consider all of the experiments together, which we can do by introducing a single vector-valued random variable

    X = (X_1, ..., X_n)

that takes values in the product space

    E = E_1 × ... × E_n = {(x_1, ..., x_n) : x_i ∈ E_i},

where E_i is the set of all possible outcomes of the i-th experiment. For example, if we are working at a bird banding station, then we might collect three pieces of information on each bird captured by the nets, so that X_1 denotes the sex of the bird with values in the set E_1 = {m, f}, X_2 denotes the body mass of the bird with values in the set E_2 = [0, ∞), and X_3 denotes the number of ectoparasites on the bird with values in the set E_3 = {0, 1, 2, ...}. Although we may be interested in each of these variables in its own right, we are likely to learn much more about the birds in the population by considering the random vector containing all of our data on each individual:

    X = (X_1, X_2, X_3) = (sex, mass, parasite load).
Definition 1.10. Suppose that X_1, ..., X_n are random variables defined on the same probability space that take values in the sets E_1, ..., E_n. Then the joint distribution of X_1, ..., X_n is defined to be the distribution of the random vector X = (X_1, ..., X_n) with values in E = E_1 × ... × E_n, i.e.,

    P(X ∈ A) = P((X_1, ..., X_n) ∈ A),

where A is a subset of E. In this case, the distribution of any one of the variables, say X_i, considered on its own, is said to be the marginal distribution of that variable.

Remark 1.6. Although we can always recover the marginal distributions of a collection of random variables from their joint distribution, it is not possible to deduce the latter from the former unless we are given some additional information. In fact, usually there will be infinitely many

ways of assigning a joint distribution that is compatible with any particular set of marginals. On the other hand, an important case in which the joint distribution of a collection of random variables is uniquely determined by the marginal distributions is when the random variables are independent of one another.

Definition 1.11. (Independent Random Variables)

1. Random variables X_1, ..., X_n are said to be independent if

    P(X_1 ∈ E_1, ..., X_n ∈ E_n) = Π_{i=1}^n P(X_i ∈ E_i)

for all sets E_1, ..., E_n such that the events {X_i ∈ E_i} are well-defined.

2. An infinite collection of random variables is said to be independent if every finite subcollection is independent according to 1.

The next theorem states two useful facts: (i) functions of independent random variables are themselves independent; and (ii) the expected value of a product of independent random variables is equal to the product of the expected values of the individual variables.

Theorem 1.8. Suppose that X_1, ..., X_n are independent real-valued random variables and that f_1, ..., f_n are functions from R to R. Then f_1(X_1), ..., f_n(X_n) are independent real-valued random variables and

    E[Π_{i=1}^n f_i(X_i)] = Π_{i=1}^n E[f_i(X_i)]

whenever the expectations on both sides of the equation are defined.

An important special case is when the random variables X_1, ..., X_n are discrete. In this case, the random vector X = (X_1, ..., X_n) is itself a discrete random variable (since the product space E = E_1 × ... × E_n is countable) and the joint probability mass function of the variables X_1, ..., X_n is defined by the formula

    p(x_1, ..., x_n) = P(X = (x_1, ..., x_n)) = P(X_1 = x_1, ..., X_n = x_n).
If the variables are independent and if we let p_i(x) = P(X_i = x) be the (marginal) probability mass function of X_i, then we can use Definition 1.11 to show that the joint probability mass function is equal to the product of the marginal probability mass functions:

    p(x_1, ..., x_n) = p_1(x_1) ... p_n(x_n).

As hinted at above, we can find the marginal probability mass functions p_i(x_i) from the joint probability mass function p(x) even if the variables are not independent. This process is called marginalization and is given by the following formula:

    p_i(y) = Σ_{x ∈ E : x_i = y} p(x).

Here y is a point in E_i and {x ∈ E : x_i = y} is the set of all vectors x = (x_1, ..., x_n) in E for which the i-th coordinate x_i equals y.
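The marginalization formula takes only a few lines to implement. In the sketch below, the dictionary encoding of the joint pmf and the particular numbers are our own illustrative choices:

```python
def marginal(joint, i):
    """Marginal pmf p_i(y) = sum of p(x) over all x with x[i] = y, for a
    joint pmf stored as {(x_1, ..., x_n): probability}."""
    out = {}
    for x, p in joint.items():
        out[x[i]] = out.get(x[i], 0.0) + p
    return out

# A hypothetical joint pmf of two dependent {0, 1}-valued variables
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
print(marginal(joint, 0))  # both marginals are {0: 0.5, 1: 0.5}
# Note p(0, 0) = 0.4 != 0.25 = p_1(0) * p_2(0), so the variables are
# dependent even though each marginal is a fair coin flip.
```

This example also illustrates Remark 1.6: the two fair-coin marginals here are compatible with infinitely many joint distributions.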

One of the most commonly encountered discrete multivariate distributions is the multinomial distribution.

Definition 1.12. Let n ≥ 1 be a positive integer and let (p_1, ..., p_k) be a collection of positive real numbers such that p_1 + ... + p_k = 1. We say that the random vector X = (X_1, ..., X_k) has the multinomial distribution with parameters n and (p_1, ..., p_k) if each of the variables X_i takes values in the set {0, ..., n} and if the joint probability mass function of these variables is given by

    p(n_1, ..., n_k) = (n choose n_1, ..., n_k) p_1^{n_1} ... p_k^{n_k}.

Recall that the multinomial coefficient

    (n choose n_1, ..., n_k) = n! / (n_1! ... n_k!)

is the number of ways of partitioning a collection of n elements into k disjoint subsets such that the first subset contains n_1 elements, the second subset contains n_2 elements, etc. In particular, this coefficient is zero if the sum n_1 + ... + n_k is not equal to n. It follows that the sum of the components of X is equal to n,

    X_1 + ... + X_k = n,

which shows that the variables X_1, ..., X_k are not independent.

Multinomial distributions arise in the following way. Suppose that we conduct n independent but identical trials, that each trial can result in one of k possible outcomes, and that the probability that any one trial results in the i-th outcome is p_i. If we let X_i denote the number of trials that result in the i-th outcome, then (X_1, ..., X_k) has the multinomial distribution with parameters n and (p_1, ..., p_k).

We will also work with continuous multivariate distributions.

Definition 1.13. A random vector X = (X_1, ..., X_n) with values in R^n is said to be continuously distributed if there is a non-negative function p : R^n → [0, ∞) with the property that

    P(X ∈ A) = ∫_A p(x) dx,

where A is a subset of R^n. The function p is said to be the joint probability density function of the variables X_1, ..., X_n.
In this case, each of the variables X_i is individually continuous, and the marginal density function p_i(·) of X_i can be recovered from the joint density function by integration:

    p_i(y) = ∫_{x : x_i = y} p(x) dx.

Here {x : x_i = y} is the set of n-dimensional vectors x = (x_1, ..., x_n) whose i-th coordinate x_i is equal to y. If we know that the variables are independent, then we can show that the joint density function is equal to the product of the marginal density functions,

    p(x_1, ..., x_n) = p_1(x_1) ... p_n(x_n),

and, in fact, this identity also implies independence.

Before we give an example of a continuous multivariate distribution, we need two more definitions.

Definition 1.14. Suppose that X and Y are real-valued random variables (either discrete or continuous). The covariance of X and Y is defined to be

    Cov(X, Y) = E[(X − EX)(Y − EY)].

If Cov(X, Y) = 0, then we say that X and Y are uncorrelated.

Exercise 1.5.

1. Show that Cov(X, Y) = Cov(Y, X).
2. Show that Cov(X, Y) = E[XY] − (EX)(EY).
3. Show that any two independent random variables are uncorrelated.
4. Give a counterexample to show that uncorrelated random variables are not necessarily independent.

If we have more than two random variables, then there are many covariances to be tracked. The next definition describes a convenient way of organizing this information.

Definition 1.15. Suppose that X_1, ..., X_n are real-valued random variables (either discrete or continuous), and let σ_ii = Var(X_i) denote the variance of X_i and σ_ij = Cov(X_i, X_j) denote the covariance of X_i and X_j. Then the variance-covariance matrix of the random vector X = (X_1, ..., X_n) is the n × n matrix Σ with entry σ_ij in the i-th row and j-th column:

    Σ = [ σ_11  σ_12  ...  σ_1n
          σ_21  σ_22  ...  σ_2n
           ...   ...  ...   ...
          σ_n1  σ_n2  ...  σ_nn ].

Finally, we come to the promised example.

Definition 1.16. A continuous random vector X = (X_1, ..., X_n) with values in R^n is said to have the multivariate normal distribution with mean vector µ = (µ_1, ..., µ_n) and n × n variance-covariance matrix Σ if it has joint density function

    p(x) = (2π)^{−n/2} |Σ|^{−1/2} exp{ −(1/2) (x − µ)^T Σ^{−1} (x − µ) }.

In this formula, |Σ| denotes the determinant of the matrix Σ and Σ^{−1} denotes its matrix inverse, i.e., Σ^{−1} is the unique n × n matrix such that ΣΣ^{−1} = Σ^{−1}Σ = I_n, where I_n is the n × n identity matrix, i.e., all of the diagonal elements of I_n equal 1 and all of the off-diagonal elements equal 0.
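As a numerical companion to Exercise 1.5, the sketch below computes a covariance via the identity Cov(X, Y) = E[XY] − (EX)(EY) from part 2. The particular joint pmf (X uniform on {−1, 0, 1} with Y = X²) is a standard example of dependent but uncorrelated variables, so treat it as a hint for part 4 rather than a full answer:

```python
def covariance(joint):
    """Cov(X, Y) = E[XY] - (EX)(EY) (Exercise 1.5, part 2) for a joint pmf
    stored as {(x, y): probability}."""
    ex = sum(p * x for (x, y), p in joint.items())
    ey = sum(p * y for (x, y), p in joint.items())
    exy = sum(p * x * y for (x, y), p in joint.items())
    return exy - ex * ey

# X uniform on {-1, 0, 1} and Y = X^2: clearly dependent, yet uncorrelated
joint = {(-1, 1): 1/3, (0, 0): 1/3, (1, 1): 1/3}
print(covariance(joint))  # 0.0 up to rounding
```

Here Y is a deterministic function of X, so the two variables are as dependent as possible, yet their covariance vanishes because E[XY] = EX = 0.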

Theorem 1.9. If X = (X_1, ..., X_n) is a multivariate normal random vector with mean vector µ = (µ_1, ..., µ_n) and variance-covariance matrix Σ, and b = (b_1, ..., b_n) ∈ R^n is an n-dimensional vector, then the variable

    Z = b · X = b_1 X_1 + ... + b_n X_n

is a real-valued normal random variable with mean

    b · µ = b_1 µ_1 + ... + b_n µ_n

and variance

    b^T Σ b = Σ_{i=1}^n Σ_{j=1}^n σ_ij b_i b_j.

1.6 Sums of Independent Random Variables

Many problems in applied probability involve sums of independent random variables. For example, in the next chapter, we will review some of the classical limit theorems of probability that arise when large numbers of independent random variables are added together. In preparation, here we will see how we can find the distribution of the sum of two independent random variables. We first consider the discrete case.

Lemma 1.2. Let X and Y be independent integer-valued random variables with probability mass functions p_X(n) and p_Y(n). Then the probability mass function of the variable X + Y is

    p_{X+Y}(n) ≡ P(X + Y = n)
              = P(∪_m {X = m, Y = n − m})          (all of the ways they can sum to n)
              = Σ_{m=−∞}^∞ P(X = m, Y = n − m)     (since these are disjoint events)
              = Σ_{m=−∞}^∞ P(X = m) P(Y = n − m)   (since X, Y are independent)
              = Σ_{m=−∞}^∞ p_X(m) p_Y(n − m)
              ≡ p_X ∗ p_Y(n).

The quantity p_X ∗ p_Y(n) is sometimes called the discrete convolution of p_X with p_Y. Notice that it is commutative, i.e., p_X ∗ p_Y(n) = p_Y ∗ p_X(n).

Example 1.10. Suppose that X and Y are independent Poisson random variables with parameters λ and µ, respectively. Then the probability mass function of X + Y is

    p_{X+Y}(n) = Σ_{m=0}^n e^{−λ} (λ^m / m!) · e^{−µ} (µ^{n−m} / (n − m)!)
               = e^{−(λ+µ)} (1/n!) Σ_{m=0}^n (n! / (m!(n − m)!)) λ^m µ^{n−m}
               = e^{−(λ+µ)} (λ + µ)^n / n!,

where we have used the binomial theorem to pass from the second line to the third. Looking at the result, we see that X + Y is itself a Poisson random variable with parameter λ + µ, and so we

have proved the following important fact: the sum of two independent Poisson random variables X and Y is a Poisson random variable with parameter equal to the sum of the parameters of X and Y.

Exercise 1.6. Let X and Y be independent binomial random variables with parameters (n, p) and (m, p), respectively. Show that X + Y is a binomial random variable with parameters (n + m, p).

There are similar results for sums of independent continuous random variables.

Lemma 1.3. Let X and Y be independent continuous random variables with densities p_X(x) and p_Y(x), respectively. Then X + Y is a continuous random variable with density

    p_{X+Y}(z) = ∫_{−∞}^{∞} p_X(t) p_Y(z − t) dt ≡ p_X * p_Y(z),

and p_X * p_Y(z) is called the convolution of p_X and p_Y.

Although Lemmas 1.2 and 1.3 give explicit expressions for the probability mass function and probability density function of a sum of two independent random variables, the sums and integrals that arise in these expressions can be difficult to evaluate. In some cases, the distribution of the sum can be more easily found by considering either the probability generating function or the moment generating function of the variables.

Definition. Let X be a non-negative integer-valued random variable with distribution p_n = P{X = n}. Then the probability generating function is the function ψ_X : [0, 1] → [0, 1] defined by

    ψ_X(s) ≡ E[s^X] = Σ_{n=0}^{∞} p_n s^n.

The most important property of the probability generating function (p.g.f.) of a random variable is that it completely determines the distribution of that variable, i.e., if X and Y have the same p.g.f., then they have the same distribution. The second most important property is given in the next lemma.

Lemma 1.4. Suppose that X and Y are independent non-negative integer-valued random variables with probability generating functions ψ_X(s) and ψ_Y(s).
Then the probability generating function of the sum X + Y is the product ψ_X(s) ψ_Y(s):

    ψ_{X+Y}(s) ≡ E[s^{X+Y}] = E[s^X s^Y] = E[s^X] E[s^Y] = ψ_X(s) ψ_Y(s).

Example. Let X and Y be the independent Poisson random variables introduced in the example above. We first calculate the p.g.f. of X:

    ψ_X(s) = Σ_{n=0}^{∞} e^{−λ} (λ^n / n!) s^n
           = e^{−λ} Σ_{n=0}^{∞} (λs)^n / n!
           = e^{−λ} e^{λs}
           = e^{λ(s−1)}.
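These identities are easy to confirm numerically. The sketch below (Python with NumPy; λ = 2, µ = 3, the evaluation point s, and the truncation level N are arbitrary choices of ours) checks the closed form e^{λ(s−1)} against the truncated series, and then checks Lemma 1.4 by comparing the p.g.f. of the convolved p.m.f. from Lemma 1.2 with the product ψ_X(s) ψ_Y(s):

```python
import numpy as np
from math import exp, factorial

lam, mu = 2.0, 3.0
N = 60  # truncation point; the neglected Poisson tail mass is negligible here

# p.m.f.s of X ~ Poisson(lam) and Y ~ Poisson(mu), truncated at N terms.
pX = np.array([exp(-lam) * lam**n / factorial(n) for n in range(N)])
pY = np.array([exp(-mu) * mu**n / factorial(n) for n in range(N)])

s = 0.7
powers = s ** np.arange(N)

# Truncated series for the p.g.f. of X versus the closed form e^{lam(s-1)}.
psi_X = float(pX @ powers)
print(psi_X, exp(lam * (s - 1)))

# Lemma 1.4 via Lemma 1.2: the p.g.f. of the discrete convolution of the
# two p.m.f.s equals the product psi_X(s) * psi_Y(s).
psi_Y = float(pY @ powers)
pXY = np.convolve(pX, pY)[:N]  # p_{X+Y}(n) for n < N
psi_XY = float(pXY @ powers)
print(psi_XY, psi_X * psi_Y)
```

The two printed pairs agree to essentially machine precision, since the discarded tail terms are astronomically small at this truncation level.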

A similar calculation shows that ψ_Y(s) = e^{µ(s−1)}, and so the p.g.f. of the sum X + Y is

    ψ_{X+Y}(s) = ψ_X(s) ψ_Y(s) = e^{λ(s−1)} e^{µ(s−1)} = e^{(λ+µ)(s−1)}.

Since this is also the p.g.f. of a Poisson random variable with parameter λ + µ, it follows that this is exactly the distribution of X + Y.

A different kind of generating function is needed to handle real-valued random variables that aren't necessarily integer-valued.

Definition. Let X be a real-valued random variable. Then the moment generating function is the function M_X : R → R defined by

    M_X(t) ≡ E[e^{tX}] = { Σ_x e^{tx} p(x)             if X is discrete with p.m.f. p(x),
                         { ∫_{−∞}^{∞} e^{tx} p(x) dx   if X is continuous with density p(x).

In general, the moment generating function (m.g.f.) of an arbitrary random variable X may be infinite for some values of t. However, if it is defined on an interval containing 0, then it uniquely determines the distribution of the random variable X. In other words, if X and Y have the same m.g.f. and if this function is defined on some interval (−a, b), then we can conclude that X and Y have the same distribution. Furthermore, the following counterpart to Lemma 1.4 holds for the moment generating function of a sum of independent variables.

Lemma 1.5. If X and Y are independent real-valued random variables with moment generating functions M_X(t) and M_Y(t), then the moment generating function of their sum X + Y is the product M_X(t) M_Y(t):

    M_{X+Y}(t) = E[e^{t(X+Y)}] = E[e^{tX} e^{tY}] = E[e^{tX}] E[e^{tY}] = M_X(t) M_Y(t).

Example. Let X and Y be independent normal random variables with means and variances equal to µ_X and σ_X^2 and µ_Y and σ_Y^2, respectively. A somewhat tedious calculation shows that the m.g.f.s of X and Y are

    M_X(t) = exp{ µ_X t + (1/2) σ_X^2 t^2 }  and  M_Y(t) = exp{ µ_Y t + (1/2) σ_Y^2 t^2 },

and then Lemma 1.5 tells us that the m.g.f. of the sum X + Y is

    M_{X+Y}(t) = M_X(t) M_Y(t) = exp{ (µ_X + µ_Y) t + (1/2) (σ_X^2 + σ_Y^2) t^2 }.
However, since M_{X+Y}(t) is also the m.g.f. of a normal random variable with mean µ_X + µ_Y and variance σ_X^2 + σ_Y^2, it follows that this is the distribution of X + Y, i.e., we have shown that the sum of two independent normally distributed random variables is also normally distributed.
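This fact can be checked by simulation. The sketch below (Python with NumPy; the parameter values, sample size, and evaluation point t are arbitrary choices of ours) simulates the sum of two independent normals and compares its empirical mean, variance, and m.g.f. with the closed forms just derived:

```python
import numpy as np

rng = np.random.default_rng(541)
mu_x, sd_x = 1.0, 2.0
mu_y, sd_y = -0.5, 1.5
n = 1_000_000

X = rng.normal(mu_x, sd_x, n)
Y = rng.normal(mu_y, sd_y, n)
S = X + Y

# The mean and variance of the sum match mu_x + mu_y and sd_x^2 + sd_y^2.
print(S.mean(), S.var())

# Empirical m.g.f. E[e^{tS}] at a small t versus the closed form.
t = 0.3
mgf_emp = np.mean(np.exp(t * S))
mgf_thy = np.exp((mu_x + mu_y) * t + 0.5 * (sd_x**2 + sd_y**2) * t**2)
print(mgf_emp, mgf_thy)
```

Keeping t small matters in practice: the variance of e^{tS} grows quickly with t, so the Monte Carlo estimate of the m.g.f. degrades for large t.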

Chapter 2

Approximation and Limit Theorems in Probability

2.1 Convergent Sequences and Approximation

A recurrent theme in applied mathematics is the approximation of a complicated object, be this a number, a system of differential equations, or a stochastic process, by a simpler one. Often this is done heuristically: for example, we may formulate a less complicated model by simply omitting certain details that we believe (or hope) are unimportant. However, in some settings, approximation can be done more rigorously by working with convergent sequences of objects. Before examining how this applies to random variables and distributions, we first recall what it means for a sequence of real numbers to converge to a limit.

Definition 2.1. A sequence of real numbers x_1, x_2, … is said to converge to a limit x if for every positive real number ɛ > 0, we can find an integer N such that the difference |x_n − x| is less than ɛ whenever n ≥ N. When this is true, we write x_n → x or, more formally,

    x = lim_{n→∞} x_n.

The intuition behind this definition is that as n increases, the terms x_n in a convergent sequence should approach the limit arbitrarily closely and then remain close to that limit.

Example 2.1. If x_n = (n − 1)/n, then the sequence x_1, x_2, … converges to the limit x = 1. Indeed, if ɛ > 0 is a positive real number and we take N to be any positive integer greater than 1/ɛ, then for any integer n ≥ N, we have

    |x − x_n| = |1 − (n − 1)/n| = 1/n ≤ 1/N < ɛ.

Convergent sequences can be used to formulate approximations in two fundamentally different ways. On the one hand, we can sometimes approximate a complicated object x by the terms in a sequence of simpler objects x_n that converge to x. In fact, we do this whenever we use a truncated decimal expansion to approximate an irrational number.

Example 2.2. The number π = 3.14159… can be approximated by the terms in the sequence x_1 = 3, x_2 = 3.1, x_3 = 3.14, etc.
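The ɛ-N bookkeeping in Example 2.1 can be checked mechanically. The fragment below (Python; the tolerance ɛ and the range of n tested are arbitrary choices of ours) verifies that once N exceeds 1/ɛ, every subsequent term lies within ɛ of the limit:

```python
eps = 1e-3
N = int(1 / eps) + 1  # any integer N > 1/eps works

# For x_n = (n - 1)/n we have |1 - x_n| = 1/n <= 1/N < eps for all n >= N.
# Record the largest deviation over a long stretch of indices n >= N.
worst = max(abs(1 - (n - 1) / n) for n in range(N, N + 10_000))
print(worst, "<", eps)
```

As the bound predicts, the largest deviation occurs at n = N and equals 1/N.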
On the other hand, if the complicated object is itself a term in a sequence, say x_n, and that sequence converges to a limit x that is more convenient to work with (e.g., easier to simulate),

then we may choose to approximate x_n by x, at least when n is sufficiently large. This is illustrated in the next example, which shows how we can approximate a geometrically distributed random variable by one that is exponentially distributed.

Example 2.3. For each integer n ≥ 1, let X_n be a geometrically distributed random variable with success probability λ/n. Since EX_n = n/λ, we can expect X_n to be large whenever n is large, in which case it will also be expensive to simulate. To find an approximation for X_n, we first observe that if t is a non-negative integer, then

    P{X_n > t} = (1 − λ/n)^t.    (2.1)

This result can be derived in two ways. Either we can evaluate the following infinite series

    P{X_n > t} = Σ_{k=t+1}^{∞} P{X_n = k} = Σ_{k=t+1}^{∞} (λ/n) (1 − λ/n)^{k−1} = (1 − λ/n)^t,

or we can recall Example 1.4 and note that the probability of the event X_n > t is the same as the probability that there are no successes in a series of t independent Bernoulli trials, each of which has success probability λ/n. Expression (2.1) is useful because of the following important result that you may recall from a calculus class:

    lim_{n→∞} (1 − a/n)^{nγ} = e^{−γa}.

It follows that if we replace t by nτ in (2.1), then

    lim_{n→∞} P{X_n > nτ} = lim_{n→∞} (1 − λ/n)^{nτ} = e^{−λτ}.

This limit is interesting because if X is an exponentially-distributed random variable with rate parameter λ, then P{X > τ} = e^{−λτ}, and so we have shown that

    lim_{n→∞} P{X_n > nτ} = P{X > τ}.

Also, since P{Y > t} = 1 − P{Y ≤ t} for any random variable Y, it follows that

    lim_{n→∞} P{X_n ≤ nτ} = P{X ≤ τ},

which can be rewritten as

    lim_{n→∞} P{(1/n) X_n ≤ τ} = P{X ≤ τ}.    (2.2)

Since this last result holds for every non-negative real number τ ≥ 0, we have shown that whenever n is large, the distribution of the random variable (1/n) X_n can be approximated by an exponential distribution with rate parameter λ. Later in the course we will see how this result can be used to approximate a discrete-time Markov chain by a continuous-time Markov chain.
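This approximation is easy to see in simulation. The sketch below (Python with NumPy; λ = 1.5, n = 1000, the sample size, and the grid of τ values are arbitrary choices of ours) compares the empirical survival function of X_n/n with the exponential survival function e^{−λτ}. Note that NumPy's geometric sampler counts the number of trials up to and including the first success, which matches the support {1, 2, …} used in the series above:

```python
import numpy as np

rng = np.random.default_rng(541)
lam, n, m = 1.5, 1000, 500_000

# X_n ~ Geometric(lam/n) on {1, 2, ...}: trials up to and including
# the first success.
Xn = rng.geometric(lam / n, size=m)

# Compare P{X_n / n > tau} with the exponential survival function.
taus = np.array([0.5, 1.0, 2.0])
emp = np.array([np.mean(Xn / n > tau) for tau in taus])
thy = np.exp(-lam * taus)
print(np.round(emp, 3))
print(np.round(thy, 3))
```

Already at n = 1000 the two survival functions agree to within a few thousandths, combining the O(1/n) approximation error of (2.1) with the Monte Carlo noise.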
Remark 2.1. Example 2.3 illustrates another important theme in stochastic modeling, which is that it is often necessary to normalize random variables when we want to pass to a limit. In this particular example, the normalization is suggested by the fact that although the expected values of the unnormalized variables X_n diverge as n tends to infinity, the expectations of the normalized variables X_n/n are all constant:

    E[(1/n) X_n] = 1/λ.

2.2 Modes of Convergence of Random Variables

As rich and as complicated as the real numbers are, real-valued random variables have an even richer and more complicated structure. One illustration of this difference is that there are several different senses (or modes) in which a sequence of real-valued random variables X_1, X_2, … can converge to a limiting random variable X. In this course, we will mostly ignore the many technical issues that this raises and operate with an intuitive sense of what convergence should mean. Indeed, most consumers of stochastic approximations are largely unaware of these issues. However, it will be useful to at least be familiar with some of the jargon. We will begin by assuming that X_1, X_2, … and X are all real-valued random variables.

Definition 2.2. The sequence (X_n; n ≥ 1) is said to converge in probability to X if for every ɛ > 0,

    lim_{n→∞} P{|X_n − X| > ɛ} = 0.

In other words, to say that the sequence converges to X in probability means that the probability that X_n and X differ by some fixed amount ɛ can be made arbitrarily small by taking n sufficiently large. Notice that this definition implicitly assumes that X_n and X are both defined on the same probability space.

Definition 2.3. The sequence (X_n; n ≥ 1) is said to converge almost surely to X if

    P{ lim_{n→∞} X_n = X } = 1.

It follows from Definition 2.1 that the sequence of random variables X_n converges to X almost surely if for every positive real number ɛ > 0, there is an integer-valued random variable N such that

    P{ |X_n − X| < ɛ for all n ≥ N } = 1.

Once again, this definition assumes that all of the variables are defined on the same probability space. However, there are important differences between convergence in probability and almost sure convergence.
In particular, whereas convergence in probability only requires that we compare each X_n with X one at a time, almost-sure convergence requires that we simultaneously compare all of the variables X_n with n ≥ N with X. For this reason, almost-sure convergence is a much stronger mode of convergence than convergence in probability, i.e., almost-sure convergence implies convergence in probability, but the converse is not true.

Remark 2.2. We say that an event E occurs almost surely if P(E) = 1. This is often abbreviated by writing "E occurs a.s." Since non-empty sets can have probability 0, it is important to realize that saying that E occurs almost surely is not the same thing as saying that E must happen. For example, if X is a standard uniform random variable, then the event E = {X is irrational} occurs a.s., since P(E^c) = P{X is rational} = 0, but of course X could be rational. Although these statements may seem counter-intuitive, they reflect the technical complications that arise when we deal with probabilities on sets containing uncountably many objects (e.g., the real numbers).

Definition 2.4. The sequence (X_n; n ≥ 1) is said to converge in distribution to X if

    lim_{n→∞} P{X_n ≤ x} = P{X ≤ x}

for every real number x with the property that P{X = x} = 0.
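Definition 2.2 can be made concrete with a small simulation. In the sketch below (Python with NumPy; the construction X_n = X + Z/n is a toy example of ours in which all of the variables live on one probability space) the exceedance probability P{|X_n − X| > ɛ} visibly shrinks toward zero as n grows:

```python
import numpy as np

rng = np.random.default_rng(541)
m, eps = 200_000, 0.05

# A toy sequence X_n = X + Z/n on a common probability space:
# the perturbation Z/n shrinks, so X_n converges to X in probability.
X = rng.normal(size=m)
Z = rng.normal(size=m)

probs = []
for n in [1, 10, 100]:
    Xn = X + Z / n
    probs.append(np.mean(np.abs(Xn - X) > eps))
print(probs)  # decreasing toward 0
```

In this particular example the convergence is in fact almost sure as well, since Z/n → 0 for every outcome; building a sequence that converges in probability but not almost surely takes more care.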

Convergence in distribution is also called weak convergence and is indeed much weaker than the other two modes of convergence introduced above, i.e., almost-sure convergence and convergence in probability each imply convergence in distribution, but neither is implied by convergence in distribution. Furthermore, we can talk about convergence in distribution even when all of the random variables are defined on distinct probability spaces: only the distributions of these variables appear in the definition.

Remark 2.3. In Example 2.3, we showed that the sequence of normalized geometric random variables (1/n) X_n converges in distribution to an exponential random variable X with parameter λ. See equation (2.2).

Once again, Williams (1991) provides an excellent introduction to the technical issues - see Chapter A13 for the key facts. Also, although I have focused on real-valued random variables above, each of these definitions can be extended to make sense of what it means for a sequence of E-valued random variables to converge to a limiting random variable, where E is any other space that we are likely to encounter (e.g., a set of vectors or matrices or even functions). These extensions become important when we wish to talk about the convergence not only of sequences of random variables, but also of sequences of stochastic processes. I'll briefly discuss some of these issues when we introduce diffusion approximations, but for the most part we can ignore them in this course. The book Markov Processes: Characterization and Convergence by Stewart Ethier and Tom Kurtz (1986) provides one of the best introductions to this subject for readers with a solid grasp of real and functional analysis.

Having introduced the key definitions, we now briefly examine some of the main limit theorems in probability.
2.3 Laws of Large Numbers

Recall that the frequentist interpretation of probability is that the probability of an event is the limiting frequency with which that event occurs when we conduct an infinite series of independent but identical trials. The weak and the strong laws of large numbers show that the intuition behind this interpretation is at least mathematically sound. Recall that we use the letters i.i.d. as an abbreviation for the phrase independent and identically distributed.

Theorem 2.1. Weak Law of Large Numbers (WLLN). Suppose that X_1, X_2, … are i.i.d. random variables and that E|X_1| < ∞. If µ = EX_1 denotes the expected value of X_1 and S_n = X_1 + ··· + X_n is the sum of the first n variables, then for every ɛ > 0,

    lim_{n→∞} P{ |(1/n) S_n − µ| > ɛ } = 0.

In the terminology of the previous section, the WLLN asserts that the sequence of normalized partial sums S_n/n converges in probability to the constant µ. In practical terms, this theorem tells us that if we conduct a large number of i.i.d. trials and calculate the sample mean of the outcomes, then with high probability this will be close to the expected value of the system.

Theorem 2.2. Strong Law of Large Numbers (SLLN). Suppose that X_1, X_2, … are i.i.d. and that E|X_1| < ∞. If µ = EX_1 denotes the expected value of X_1 and S_n = X_1 + ··· + X_n is

the sum of the first n variables, then

    P{ lim_{n→∞} (1/n) S_n = µ } = 1.

Thus, the SLLN asserts that the sequence S_n/n in fact converges almost surely to the constant µ. The connection with the frequentist interpretation of probability is given in the following example.

Example 2.4. Suppose that we are interested in estimating the probability of an event A and that we are able to conduct an unlimited number of i.i.d. trials to do so. Let X_i be the Bernoulli random variable defined by setting X_i = 1 if A occurs on the i-th trial and X_i = 0 otherwise. Then X_1, X_2, … is a sequence of i.i.d. random variables and

    µ = EX_1 = P{X_1 = 1} · 1 + P{X_1 = 0} · 0 = P(A).

Also, since S_n/n is the proportion of the first n trials that resulted in A, the SLLN tells us that the limiting frequency of the event A converges almost surely to the probability P(A), as we expect.

Remark 2.4. The weak and the strong law of large numbers are important special cases of a more general heuristic which states that deterministic behavior can emerge in certain kinds of stochastic models if these consist of a large number of weakly interacting (or indeed independent) components. For example, we can derive many of the deterministic models studied in epidemiology (e.g., the simple SIR model) by starting with a stochastic model of an epidemic in a finite population with N individuals and then letting N tend to infinity. These kinds of limits are sometimes called hydrodynamic limits by analogy with the deterministic models that describe the dynamics of fluids containing large numbers of weakly interacting molecules.

2.4 The Central Limit Theorem

Even if a large system behaves almost deterministically, we may still be interested in the random fluctuations of the system about its deterministic limit. The next theorem addresses this issue in the context of sums of independent random variables.

Theorem 2.3.
Central Limit Theorem (CLT). Suppose that X_1, X_2, … are i.i.d. random variables and that both the mean µ = EX_1 and the variance σ^2 = Var(X_1) of X_1 are finite. If we let

    S_n = X_1 + ··· + X_n  and  Z_n = (S_n − nµ) / (σ √n),

then the sequence Z_1, Z_2, … converges in distribution to a standard normal random variable Z, i.e., for every real number x,

    lim_{n→∞} P{Z_n ≤ x} = P{Z ≤ x} = (1/√(2π)) ∫_{−∞}^{x} e^{−z²/2} dz.

Here is one way to think about the CLT. By the SLLN, we know that the sequence of empirical means S_n/n converges almost surely to µ, i.e., for all sufficiently large n, the differences |S_n/n − µ| should be small. Nonetheless, because S_n/n is a random variable, it may be useful to


More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

THE QUEEN S UNIVERSITY OF BELFAST

THE QUEEN S UNIVERSITY OF BELFAST THE QUEEN S UNIVERSITY OF BELFAST 0SOR20 Level 2 Examination Statistics and Operational Research 20 Probability and Distribution Theory Wednesday 4 August 2002 2.30 pm 5.30 pm Examiners { Professor R M

More information

2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y.

2. Variance and Covariance: We will now derive some classic properties of variance and covariance. Assume real-valued random variables X and Y. CS450 Final Review Problems Fall 08 Solutions or worked answers provided Problems -6 are based on the midterm review Identical problems are marked recap] Please consult previous recitations and textbook

More information

STAT Chapter 5 Continuous Distributions

STAT Chapter 5 Continuous Distributions STAT 270 - Chapter 5 Continuous Distributions June 27, 2012 Shirin Golchi () STAT270 June 27, 2012 1 / 59 Continuous rv s Definition: X is a continuous rv if it takes values in an interval, i.e., range

More information

7 Random samples and sampling distributions

7 Random samples and sampling distributions 7 Random samples and sampling distributions 7.1 Introduction - random samples We will use the term experiment in a very general way to refer to some process, procedure or natural phenomena that produces

More information

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.

Perhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows. Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage

More information

Midterm 2 Review. CS70 Summer Lecture 6D. David Dinh 28 July UC Berkeley

Midterm 2 Review. CS70 Summer Lecture 6D. David Dinh 28 July UC Berkeley Midterm 2 Review CS70 Summer 2016 - Lecture 6D David Dinh 28 July 2016 UC Berkeley Midterm 2: Format 8 questions, 190 points, 110 minutes (same as MT1). Two pages (one double-sided sheet) of handwritten

More information

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R

Random Variables. Random variables. A numerically valued map X of an outcome ω from a sample space Ω to the real line R In probabilistic models, a random variable is a variable whose possible values are numerical outcomes of a random phenomenon. As a function or a map, it maps from an element (or an outcome) of a sample

More information

Formulas for probability theory and linear models SF2941

Formulas for probability theory and linear models SF2941 Formulas for probability theory and linear models SF2941 These pages + Appendix 2 of Gut) are permitted as assistance at the exam. 11 maj 2008 Selected formulae of probability Bivariate probability Transforms

More information

Disjointness and Additivity

Disjointness and Additivity Midterm 2: Format Midterm 2 Review CS70 Summer 2016 - Lecture 6D David Dinh 28 July 2016 UC Berkeley 8 questions, 190 points, 110 minutes (same as MT1). Two pages (one double-sided sheet) of handwritten

More information

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019 Lecture 10: Probability distributions DANIEL WELLER TUESDAY, FEBRUARY 19, 2019 Agenda What is probability? (again) Describing probabilities (distributions) Understanding probabilities (expectation) Partial

More information

Probability theory basics

Probability theory basics Probability theory basics Michael Franke Basics of probability theory: axiomatic definition, interpretation, joint distributions, marginalization, conditional probability & Bayes rule. Random variables:

More information

1 Sequences of events and their limits

1 Sequences of events and their limits O.H. Probability II (MATH 2647 M15 1 Sequences of events and their limits 1.1 Monotone sequences of events Sequences of events arise naturally when a probabilistic experiment is repeated many times. For

More information

JUSTIN HARTMANN. F n Σ.

JUSTIN HARTMANN. F n Σ. BROWNIAN MOTION JUSTIN HARTMANN Abstract. This paper begins to explore a rigorous introduction to probability theory using ideas from algebra, measure theory, and other areas. We start with a basic explanation

More information

Introduction to Stochastic Processes

Introduction to Stochastic Processes Stat251/551 (Spring 2017) Stochastic Processes Lecture: 1 Introduction to Stochastic Processes Lecturer: Sahand Negahban Scribe: Sahand Negahban 1 Organization Issues We will use canvas as the course webpage.

More information

Probability Distributions Columns (a) through (d)

Probability Distributions Columns (a) through (d) Discrete Probability Distributions Columns (a) through (d) Probability Mass Distribution Description Notes Notation or Density Function --------------------(PMF or PDF)-------------------- (a) (b) (c)

More information

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models Fatih Cavdur fatihcavdur@uludag.edu.tr March 20, 2012 Introduction Introduction The world of the model-builder

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

FE 5204 Stochastic Differential Equations

FE 5204 Stochastic Differential Equations Instructor: Jim Zhu e-mail:zhu@wmich.edu http://homepages.wmich.edu/ zhu/ January 20, 2009 Preliminaries for dealing with continuous random processes. Brownian motions. Our main reference for this lecture

More information

Chapter 6 Expectation and Conditional Expectation. Lectures Definition 6.1. Two random variables defined on a probability space are said to be

Chapter 6 Expectation and Conditional Expectation. Lectures Definition 6.1. Two random variables defined on a probability space are said to be Chapter 6 Expectation and Conditional Expectation Lectures 24-30 In this chapter, we introduce expected value or the mean of a random variable. First we define expectation for discrete random variables

More information

1 Review of Probability

1 Review of Probability 1 Review of Probability Random variables are denoted by X, Y, Z, etc. The cumulative distribution function (c.d.f.) of a random variable X is denoted by F (x) = P (X x), < x

More information

Introduction to Probability

Introduction to Probability LECTURE NOTES Course 6.041-6.431 M.I.T. FALL 2000 Introduction to Probability Dimitri P. Bertsekas and John N. Tsitsiklis Professors of Electrical Engineering and Computer Science Massachusetts Institute

More information

3. Review of Probability and Statistics

3. Review of Probability and Statistics 3. Review of Probability and Statistics ECE 830, Spring 2014 Probabilistic models will be used throughout the course to represent noise, errors, and uncertainty in signal processing problems. This lecture

More information

Lecture 2: Review of Probability

Lecture 2: Review of Probability Lecture 2: Review of Probability Zheng Tian Contents 1 Random Variables and Probability Distributions 2 1.1 Defining probabilities and random variables..................... 2 1.2 Probability distributions................................

More information

Things to remember when learning probability distributions:

Things to remember when learning probability distributions: SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions

More information

MAS113 Introduction to Probability and Statistics. Proofs of theorems

MAS113 Introduction to Probability and Statistics. Proofs of theorems MAS113 Introduction to Probability and Statistics Proofs of theorems Theorem 1 De Morgan s Laws) See MAS110 Theorem 2 M1 By definition, B and A \ B are disjoint, and their union is A So, because m is a

More information

Lecture 1: Probability Fundamentals

Lecture 1: Probability Fundamentals Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability

More information

STA205 Probability: Week 8 R. Wolpert

STA205 Probability: Week 8 R. Wolpert INFINITE COIN-TOSS AND THE LAWS OF LARGE NUMBERS The traditional interpretation of the probability of an event E is its asymptotic frequency: the limit as n of the fraction of n repeated, similar, and

More information

One-Parameter Processes, Usually Functions of Time

One-Parameter Processes, Usually Functions of Time Chapter 4 One-Parameter Processes, Usually Functions of Time Section 4.1 defines one-parameter processes, and their variations (discrete or continuous parameter, one- or two- sided parameter), including

More information

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014 Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions

More information

Lecture 13 (Part 2): Deviation from mean: Markov s inequality, variance and its properties, Chebyshev s inequality

Lecture 13 (Part 2): Deviation from mean: Markov s inequality, variance and its properties, Chebyshev s inequality Lecture 13 (Part 2): Deviation from mean: Markov s inequality, variance and its properties, Chebyshev s inequality Discrete Structures II (Summer 2018) Rutgers University Instructor: Abhishek Bhrushundi

More information

3 Multiple Discrete Random Variables

3 Multiple Discrete Random Variables 3 Multiple Discrete Random Variables 3.1 Joint densities Suppose we have a probability space (Ω, F,P) and now we have two discrete random variables X and Y on it. They have probability mass functions f

More information

Universal examples. Chapter The Bernoulli process

Universal examples. Chapter The Bernoulli process Chapter 1 Universal examples 1.1 The Bernoulli process First description: Bernoulli random variables Y i for i = 1, 2, 3,... independent with P [Y i = 1] = p and P [Y i = ] = 1 p. Second description: Binomial

More information

Chapter 2. Some basic tools. 2.1 Time series: Theory Stochastic processes

Chapter 2. Some basic tools. 2.1 Time series: Theory Stochastic processes Chapter 2 Some basic tools 2.1 Time series: Theory 2.1.1 Stochastic processes A stochastic process is a sequence of random variables..., x 0, x 1, x 2,.... In this class, the subscript always means time.

More information

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr.

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr. Topic 2: Probability & Distributions ECO220Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit utoronto.ca November 21, 2017 Dr. Nick

More information

Notes 1 : Measure-theoretic foundations I

Notes 1 : Measure-theoretic foundations I Notes 1 : Measure-theoretic foundations I Math 733-734: Theory of Probability Lecturer: Sebastien Roch References: [Wil91, Section 1.0-1.8, 2.1-2.3, 3.1-3.11], [Fel68, Sections 7.2, 8.1, 9.6], [Dur10,

More information

Chapter 5. Random Variables (Continuous Case) 5.1 Basic definitions

Chapter 5. Random Variables (Continuous Case) 5.1 Basic definitions Chapter 5 andom Variables (Continuous Case) So far, we have purposely limited our consideration to random variables whose ranges are countable, or discrete. The reason for that is that distributions on

More information

Week 12-13: Discrete Probability

Week 12-13: Discrete Probability Week 12-13: Discrete Probability November 21, 2018 1 Probability Space There are many problems about chances or possibilities, called probability in mathematics. When we roll two dice there are possible

More information

Probability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27

Probability Review. Yutian Li. January 18, Stanford University. Yutian Li (Stanford University) Probability Review January 18, / 27 Probability Review Yutian Li Stanford University January 18, 2018 Yutian Li (Stanford University) Probability Review January 18, 2018 1 / 27 Outline 1 Elements of probability 2 Random variables 3 Multiple

More information

IEOR 6711: Stochastic Models I SOLUTIONS to the First Midterm Exam, October 7, 2008

IEOR 6711: Stochastic Models I SOLUTIONS to the First Midterm Exam, October 7, 2008 IEOR 6711: Stochastic Models I SOLUTIONS to the First Midterm Exam, October 7, 2008 Justify your answers; show your work. 1. A sequence of Events. (10 points) Let {B n : n 1} be a sequence of events in

More information

Let (Ω, F) be a measureable space. A filtration in discrete time is a sequence of. F s F t

Let (Ω, F) be a measureable space. A filtration in discrete time is a sequence of. F s F t 2.2 Filtrations Let (Ω, F) be a measureable space. A filtration in discrete time is a sequence of σ algebras {F t } such that F t F and F t F t+1 for all t = 0, 1,.... In continuous time, the second condition

More information

Quick review on Discrete Random Variables

Quick review on Discrete Random Variables STAT/MATH 395 A - PROBABILITY II UW Winter Quarter 2017 Néhémy Lim Quick review on Discrete Random Variables Notations. Z = {..., 2, 1, 0, 1, 2,...}, set of all integers; N = {0, 1, 2,...}, set of natural

More information

Single Maths B: Introduction to Probability

Single Maths B: Introduction to Probability Single Maths B: Introduction to Probability Overview Lecturer Email Office Homework Webpage Dr Jonathan Cumming j.a.cumming@durham.ac.uk CM233 None! http://maths.dur.ac.uk/stats/people/jac/singleb/ 1 Introduction

More information

STAT/MATH 395 A - PROBABILITY II UW Winter Quarter Moment functions. x r p X (x) (1) E[X r ] = x r f X (x) dx (2) (x E[X]) r p X (x) (3)

STAT/MATH 395 A - PROBABILITY II UW Winter Quarter Moment functions. x r p X (x) (1) E[X r ] = x r f X (x) dx (2) (x E[X]) r p X (x) (3) STAT/MATH 395 A - PROBABILITY II UW Winter Quarter 07 Néhémy Lim Moment functions Moments of a random variable Definition.. Let X be a rrv on probability space (Ω, A, P). For a given r N, E[X r ], if it

More information

Recitation 2: Probability

Recitation 2: Probability Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions

More information

1 Exercises for lecture 1

1 Exercises for lecture 1 1 Exercises for lecture 1 Exercise 1 a) Show that if F is symmetric with respect to µ, and E( X )

More information

3 Continuous Random Variables

3 Continuous Random Variables Jinguo Lian Math437 Notes January 15, 016 3 Continuous Random Variables Remember that discrete random variables can take only a countable number of possible values. On the other hand, a continuous random

More information

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A )

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A ) 6. Brownian Motion. stochastic process can be thought of in one of many equivalent ways. We can begin with an underlying probability space (Ω, Σ, P) and a real valued stochastic process can be defined

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 17-27 Review Scott Sheffield MIT 1 Outline Continuous random variables Problems motivated by coin tossing Random variable properties 2 Outline Continuous random variables Problems

More information

Stat 426 : Homework 1.

Stat 426 : Homework 1. Stat 426 : Homework 1. Moulinath Banerjee University of Michigan Announcement: The homework carries 120 points and contributes 10 points to the total grade. (1) A geometric random variable W takes values

More information

DS-GA 1002 Lecture notes 2 Fall Random variables

DS-GA 1002 Lecture notes 2 Fall Random variables DS-GA 12 Lecture notes 2 Fall 216 1 Introduction Random variables Random variables are a fundamental tool in probabilistic modeling. They allow us to model numerical quantities that are uncertain: the

More information

Preliminaries. Probability space

Preliminaries. Probability space Preliminaries This section revises some parts of Core A Probability, which are essential for this course, and lists some other mathematical facts to be used (without proof) in the following. Probability

More information

18.175: Lecture 2 Extension theorems, random variables, distributions

18.175: Lecture 2 Extension theorems, random variables, distributions 18.175: Lecture 2 Extension theorems, random variables, distributions Scott Sheffield MIT Outline Extension theorems Characterizing measures on R d Random variables Outline Extension theorems Characterizing

More information