Lectures: Random Variables

Random Variables

Definition: A random variable (rv or RV) is a real-valued function defined on the sample space. The term random variable is a misnomer, in view of the normal usage of function and variable. Random variables are denoted by capital letters from the end of the alphabet, e.g. X, Y, Z, but other letters are used as well, e.g., B, M, etc. Hence X : S → R, and X(e) for e ∈ S is a particular value of X. Since the e are random outcomes of the experiment, the function values X(e) are random as a consequence. Hence the terminology random variable, although random function value might have been less confusing. Random variables are simply a bridge from the sample space to the realm of numbers, where we can perform arithmetic. A random variable is different from a random function, where the evaluation for each e is a function trajectory, e.g., against time as in a stock market index for a given day. Such random functions are also known as stochastic processes. We will only have limited exposure to them.

Example 1 (Roll of Two Dice): The sum X of the two numbers facing up is an rv.

Example 2 (Toss of Three Coins): The number X of heads in the toss of three coins is a random variable. Compute P(X = i) = P({X = i}) for i = 0, 1, 2, 3. The event {X = i} is short for {e ∈ S : X(e) = i}.

Example 3 (Urn Problem): Three balls are randomly selected (without replacement) from an urn containing 20 balls labeled 1, 2, ..., 20. We bet that we will get at least one label ≥ 17. What is the probability of winning the bet? This problem could be solved without involving the notion of a random variable. For the sake of working with the concept of a random variable, let X be the maximum number of the three balls drawn. Hence we are interested in P(X ≥ 17), which is computed as follows (C(n, k) denotes the binomial coefficient "n choose k"):

    P(X ≥ 17) = P(X = 17) + P(X = 18) + P(X = 19) + P(X = 20) = 1 − P(X ≤ 16),

with P(X = i) = C(i−1, 2)/C(20, 3) for i = 3, 4, ..., 20, so that

    P(X ≥ 17) = [C(16, 2) + C(17, 2) + C(18, 2) + C(19, 2)]/C(20, 3) = 1 − C(16, 3)/C(20, 3) = 0.5088.

Example 4 (Coin Toss with Stopping Rule): A coin (with probability p of heads) is tossed until either a head is obtained or until n tosses are made. Let X be the number of tosses made. Find P(X = i) for i = 1, ..., n. Solution: P(X = i) = (1 − p)^(i−1) p for i = 1, ..., n−1 and P(X = n) = (1 − p)^(n−1). Check that the probabilities add to 1.

Example 5 (Coupon Collector Problem): There are N types of coupons. Each time a coupon is obtained it is, independently of previous selections, equally likely to be one of the N types. We are interested in the random variable T = the number of coupons that need to be collected to get a full set of N coupons. Rather than get P(T = n) immediately, we obtain P(T > n).
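
Before deriving P(T > n), here is a quick numerical check of Example 3 in R (the software also used in the footnotes further below); the function name p.max is just a convenient label for this sketch.

    # Example 3: X = largest label among 3 balls drawn without replacement from 1, 2, ..., 20
    p.max <- function(i) choose(i - 1, 2) / choose(20, 3)   # P(X = i)
    sum(p.max(17:20))                                       # P(X >= 17), about 0.5088
    1 - choose(16, 3) / choose(20, 3)                       # same value, via the complement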

Let A_j be the event that coupon type j is not among the first n collected coupons. By the inclusion-exclusion formula we have

    P(T > n) = P(A_1 ∪ ... ∪ A_N) = Σ_i P(A_i) − Σ_{i1 < i2} P(A_{i1} A_{i2}) + ... + (−1)^(N+1) P(A_1 A_2 ... A_N),

with P(A_1 A_2 ... A_N) = 0, of course, and for i_1 < i_2 < ... < i_k we have

    P(A_i) = ((N−1)/N)^n,   P(A_{i1} A_{i2}) = ((N−2)/N)^n,  ...,   P(A_{i1} A_{i2} ... A_{ik}) = ((N−k)/N)^n,

and thus

    P(T > n) = N ((N−1)/N)^n − C(N, 2) ((N−2)/N)^n + C(N, 3) ((N−3)/N)^n − ... + (−1)^N C(N, N−1) (1/N)^n
             = Σ_{i=1}^{N−1} (−1)^(i+1) C(N, i) ((N−i)/N)^n,

and P(T > n−1) = P(T = n) + P(T > n), hence P(T = n) = P(T > n−1) − P(T > n).

Distribution Functions

Example 6 (First Heads): Toss a fair coin until the first head lands up. Let X be the number of tosses required. Then P(X ≤ k) = 1 − P(X ≥ k+1) = 1 − P(k tails in the first k tosses) = 1 − 2^(−k).

Definition: The cumulative distribution function (cdf or CDF), or more simply the distribution function, F of the random variable X is defined for all real numbers b as F(b) = P(X ≤ b) = P({e : X(e) ≤ b}).

(L12 ends)

Example 7 (Using a CDF): Suppose the cdf of the random variable X is given by

    F(x) = 0 for x < 0,   x/2 for 0 ≤ x < 1,   2/3 for 1 ≤ x < 2,   11/12 for 2 ≤ x < 3,   1 for 3 ≤ x.

[Figure: graph of this cdf F(x) against x.]

Compute P(X < 3), P(X = 1), P(X > 1/2) and P(2 < X ≤ 4):

    P(X < 3) = 11/12,   P(X = 1) = 2/3 − 1/2 = 1/6,   P(X > 1/2) = 1 − 1/4 = .75,   P(2 < X ≤ 4) = 1 − 11/12 = 1/12.
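
A small R sketch of the inclusion-exclusion formula above; the choice of N = 6 coupon types (think of collecting all six faces of a die) and n = 20 purchases is only an illustration.

    # Coupon collector: P(T > n) from the inclusion-exclusion formula, N coupon types
    PTgt <- function(n, N) {
      i <- 1:(N - 1)
      sum((-1)^(i + 1) * choose(N, i) * ((N - i) / N)^n)
    }
    PTgt(20, 6)                  # chance a full set is still missing after 20 coupons
    PTgt(19, 6) - PTgt(20, 6)    # P(T = 20) = P(T > 19) - P(T > 20)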

Discrete Random Variables

Definition: A random variable X which can take on at most a countable number of values is called a discrete random variable. For such a discrete random variable we define the probability mass function (pmf) p(a) of X by

    p(a) = P(X = a) = P({e : X(e) = a}) for all a ∈ R.

p(a) is positive for at most a countable number of values of a. If X assumes only the values x_1, x_2, x_3, ..., then p(x_i) ≥ 0 for i = 1, 2, 3, ... and p(x) = 0 for all other values of x.

[Figure: graphical representation of p(x) (one die, sum of two dice).]

Example 8 (Poisson): Suppose the discrete random variable X has pmf p(i) = c λ^i / i! for i = 0, 1, 2, ..., where λ is some positive value and c = exp(−λ) makes the probabilities add to one. Find P(X = 0) and P(X > 2).

    P(X = 0) = c = exp(−λ),   P(X > 2) = 1 − P(X = 0) − P(X = 1) − P(X = 2) = 1 − exp(−λ)(1 + λ + λ²/2).

The cdf F of a discrete random variable X can be expressed as

    F(a) = Σ_{x: x ≤ a} p(x).

The cdf of a discrete random variable X is a step function with a possible step at each of its possible values x_1, x_2, ... and is flat in between.

Example 9 (Discrete CDF): p(1) = .25, p(2) = .5, p(3) = .125 and p(4) = .125; construct the cdf and graph it. Interpret the step sizes.

Expected Value or Mean of X

A very important concept in probability theory is that of the expected value or mean of a random variable X. For a discrete RV it is defined as

    E[X] = E(X) = μ = μ_X = Σ_{x: p(x) > 0} x p(x) = Σ_x x p(x),

the probability-weighted average of all possible values of X. If X takes on the two values 0 and 1 with probabilities p(0) = .5 and p(1) = .5, then E[X] = .5·0 + .5·1 = .5, which is halfway between 0 and 1. When p(1) = 2/3 and p(0) = 1/3, then E[X] = 2/3, twice as close to 1 as to 0. That is because the probability of 1 is twice that of 0. The double weight 2/3 at 1 balances the weight 1/3 at 0 when the fulcrum of the balance is set at 2/3 = E[X]:

    weight_1 × moment arm_1 = weight_2 × moment arm_2, or 2/3 × 1/3 = 1/3 × 2/3,

where the moment arm is measured as the distance of the weight from the fulcrum, here at 2/3 = E[X]:

    moment arm_1 = 1 − 2/3 = 1/3 and moment arm_2 = 2/3 − 0 = 2/3.

This is a general property of E[X], not just limited to RVs with two values. If a is the location of the fulcrum, then we get balance when

    Σ_{x < a} (a − x) p(x) = Σ_{x > a} (x − a) p(x), or 0 = − Σ_{x < a} (a − x) p(x) + Σ_{x > a} (x − a) p(x) = Σ_x (x − a) p(x),

or 0 = Σ_x x p(x) − a Σ_x p(x), or 0 = E[X] − a, or a = E[X].

The term expectation can again be linked to our long-run frequency motivation for probabilities. If we play the same game repeatedly, say a large number N of times, with payoffs being one of the amounts x_1, x_2, x_3, ..., then we would roughly see these amounts with approximate relative frequencies p(x_1), p(x_2), p(x_3), ..., i.e., with approximate frequencies N p(x_1), N p(x_2), N p(x_3), ..., thus realizing in N such games the following total payoff:

(L13 ends)

    N p(x_1) x_1 + N p(x_2) x_2 + N p(x_3) x_3 + ...,

i.e., on a per-game basis

    [N p(x_1) x_1 + N p(x_2) x_2 + N p(x_3) x_3 + ...]/N = p(x_1) x_1 + p(x_2) x_2 + p(x_3) x_3 + ... = E[X].

On average we expect to win E[X] (or lose |E[X]|, if E[X] < 0).

Example 10 (Rolling a Fair Die): If X is the number showing face up on a fair die, we get

    E[X] = (1 + 2 + 3 + 4 + 5 + 6)/6 = 21/6 = 7/2.

Indicator Variable: For any event E we can define the indicator RV I = I_E(e) = 1 if e ∈ E and I_E(e) = 0 if e ∉ E. Then

    E[I] = P(E)·1 + P(E^c)·0 = P(E).

Example 11 (Quiz Show): You are asked two different types of questions, but the second one only when you answer the first correctly. When you answer a question of type i correctly you get a prize of V_i dollars. In which order should you attempt to answer the question types, when you know your chances of answering questions of type i are P_i, i = 1, 2, respectively? Or does the order even matter? Assume that the events of answering the questions correctly are independent.

If you choose to answer a question of type i = 1 first, your winnings are

    0 with probability 1 − P_1,
    V_1 with probability P_1 (1 − P_2),
    V_1 + V_2 with probability P_1 P_2,

with expected winnings W_1: E[W_1] = V_1 P_1 (1 − P_2) + (V_1 + V_2) P_1 P_2. When answering the type 2 question first, you get the same expression with indices exchanged, i.e., E[W_2] = V_2 P_2 (1 − P_1) + (V_1 + V_2) P_1 P_2. Thus

    E[W_1] > E[W_2]  ⟺  V_1 P_1 (1 − P_2) > V_2 P_2 (1 − P_1)  ⟺  V_1 P_1/(1 − P_1) > V_2 P_2/(1 − P_2):

the choice should be ordered by odds-weighted payoffs. Example: P_1 = .8, V_1 = 900, P_2 = .4, V_2 = 6000. Then 900 × .8/.2 = 3600 < 6000 × .4/.6 = 4000 and E[W_1] = 2640 < E[W_2] = 2688, so the type 2 question should be answered first.
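
Example 11 can be checked directly in R; the numbers below are the ones used in the example.

    # Example 11: expected winnings for the two question orders
    P1 <- .8; V1 <- 900; P2 <- .4; V2 <- 6000
    EW1 <- V1 * P1 * (1 - P2) + (V1 + V2) * P1 * P2   # type 1 question first
    EW2 <- V2 * P2 * (1 - P1) + (V1 + V2) * P1 * P2   # type 2 question first
    c(EW1, EW2)                                       # 2640 and 2688
    c(V1 * P1 / (1 - P1), V2 * P2 / (1 - P2))         # odds-weighted payoffs 3600 and 4000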

5 Expectation of g(x): For an RV X and a function g : R R, we can view Y = g(x) again as a random variable. Find its expectation. Two ways: Find the p X (x)-weighted average of all g(x) values, or given the pmf p X (x) of X, find the pmf p Y (y) of Y = g(x), then its expectation as the p Y (y) weighted average of all Y values. Example: Let X have values,, 4 with p( X ( ) =.5, p X () =.5, p X (4) =.5, respectively. = x p X (x) x x p X (x) = y = x p Y (y) yp Y (y) p Y (4) {}}{ x p X (x) = ( ) = 4 (.5 +.5) = x y yp Y (y) = 10 What we see in this special case holds in general for discrete RVs X and functions Y = g(x) E[Y ] = y yp Y (y) = x g(x)p X (x) = E[g(X)] The formal proof idea is already contained in the above example, so we skip it, but see book for notational formal proof or the graphic above. 5

Example 12 (Business Planning): A seasonal product (say skis), when sold in timely fashion, yields a net profit of b dollars for each unit sold, and a net loss of l dollars for each unit that needs to be sold at season's end in a fire sale. Assume that the customer demand for the number X of units is an RV with pmf p_X(x) = p(x), and assume s units are stocked. When X > s, the excess orders cannot be filled. Then the realized profit Q(s) is an RV, namely

    Q(s) = bX − (s − X)l if X ≤ s, and Q(s) = sb if X > s,

with expected profit

    E[Q(s)] = Σ_{i=0}^{s} (bi − (s − i)l) p(i) + Σ_{i=s+1}^{∞} sb p(i)
            = (b + l) Σ_{i=0}^{s} i p(i) − sl Σ_{i=0}^{s} p(i) + sb (1 − Σ_{i=0}^{s} p(i))
            = (b + l) Σ_{i=0}^{s} i p(i) − s(b + l) Σ_{i=0}^{s} p(i) + sb
            = sb + (b + l) Σ_{i=0}^{s} (i − s) p(i).

Find the value s that maximizes this expected value. We examine what happens to E[Q(s)] as we increase s to s + 1:

    E[Q(s+1)] = (s+1)b + (b + l) Σ_{i=0}^{s+1} (i − s − 1) p(i) = (s+1)b + (b + l) Σ_{i=0}^{s} (i − s − 1) p(i)

    ⟹ E[Q(s+1)] − E[Q(s)] = b − (b + l) Σ_{i=0}^{s} p(i) > 0  ⟺  Σ_{i=0}^{s} p(i) < b/(b + l).

Since Σ_{i=0}^{s} p(i) increases with s and since b/(b + l) is constant, there is a largest s, say s*, for which this inequality holds, and thus the maximum expected profit is E[Q(s* + 1)], achieved when stocking s* + 1 items. We need to know p(i), i = 0, 1, 2, ..., e.g., from past experience.

Examples of E[g(X)]:

1) Let g(x) = ax + b, with constants a, b. Then E[aX + b] = Σ_x (ax + b) p(x) = a Σ_x x p(x) + b Σ_x p(x) = a E[X] + b.

2) Let g(x) = x^n. Then E[X^n] = Σ_x x^n p(x) is called the n-th moment of X, and E[X] is also known as the first moment.
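
Returning to Example 12, the stocking rule can be sketched in R. The demand pmf used below (Poisson with mean 20) and the values b = 30, l = 10 are only stand-ins for illustration, since in practice p(i) would come from past experience.

    # Example 12: take the largest s with P(X <= s) < b/(b + l), then stock s + 1 units
    b <- 30; l <- 10
    Fdem <- ppois(0:200, lambda = 20)            # assumed demand cdf P(X <= s), s = 0, ..., 200
    s.star <- max(which(Fdem < b / (b + l))) - 1 # "- 1" because index 1 corresponds to s = 0
    s.star + 1                                   # number of units to stock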

The Variance of X

(L14 ends)

While E[X] is a measure of the center of a distribution given by a pmf p(x), we would also like to have some measure of the spread or variation of a distribution. While E[X] = 0 for X ≡ 0 with probability 1, for X = ±1 with probability 1/2 each, and for X = ±100 with probability 1/2 each, we would view the variabilities of these three situations quite differently. One plausible measure would be the expected absolute difference of X from its mean, i.e., E[|X − μ|], where μ = E[X]. For the above three situations we would get E[|X − μ|] = 0, 1, 100, respectively. While this was easy enough, it turns out that the absolute value function |X − μ| is not very conducive to manipulations. We introduce a different measure that can be exploited much more conveniently, as we will see later on.

Definition: The variance of a random variable X with mean μ = E[X] is defined as

    var(X) = E[(X − μ)²].

An alternate formula, and an example of the manipulative capability of the variance definition, is

    var(X) = E[X² − 2μX + μ²] = Σ_x (x² − 2μx + μ²) p(x) = Σ_x x² p(x) − 2μ Σ_x x p(x) + μ² Σ_x p(x)
           = E[X²] − 2μ E[X] + μ² = E[X²] − μ² = E[X²] − (E[X])².

Example 13 (Variance of a Fair Die): If X denotes the face up of a randomly rolled fair die, then

    E[X²] = (1² + 2² + 3² + 4² + 5² + 6²)/6 = 91/6 and var(X) = 91/6 − (7/2)² = 35/12.

Variance of aX + b: For constants a and b we have var(aX + b) = a² var(X), since

    var(aX + b) = E[{aX + b − (aμ + b)}²] = E[a²(X − μ)²] = a² E[(X − μ)²] = a² var(X).

In analogy to the center-of-gravity interpretation of E[X], we can view var(X) as the moment of inertia of the pmf p(x), when viewing p(x) as weight in mechanics. Squaring the deviation of X around μ in the definition of var(X) creates a distortion and changes any units of measurement to square units. To bring matters back to the original units we take the square root of the variance, i.e., the standard deviation SD(X), as the appropriate measure of spread:

    SD(X) = σ = σ_X = √var(X).

We now discuss several special discrete distributions.

Bernoulli and Binomial Random Variables

Aside from the constant random variable, which takes on only one value, the next level of simplicity is a random variable with only two values, most often 0 and 1 (the canonical choice).

Definition (Bernoulli Random Variable): A random variable X which can take on only the two values 0 and 1 is called a Bernoulli random variable. We indicate its distribution by X ~ B(p). In liberal notational usage we also write P(X ≤ x) = P(B(p) ≤ x). Such random variables are often employed when we focus on an event E in a particular random experiment. Let p = P(E). If E occurs we say the experiment results in a success and otherwise we call it a failure. The Bernoulli rv X is then defined as follows: X(e) = 1 if e ∈ E and X(e) = 0 if e ∉ E. Hence X counts the number of successes in one performance of the experiment. Often the following alternate notation is used: I_E(e) = 1 if e ∈ E and I_E(e) = 0 otherwise. I_E is then also called the indicator function of E.
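
Before writing down the pmf of a Bernoulli variable, here is a quick numerical check of Example 13 in R, straight from the definitions.

    # Example 13: mean and variance of a fair die
    x <- 1:6
    p <- rep(1/6, 6)
    mu <- sum(x * p); mu          # 3.5 = 7/2
    sum(x^2 * p) - mu^2           # E[X^2] - (E[X])^2 = 35/12
    sum((x - mu)^2 * p)           # the defining formula gives the same value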

The probability mass function of X or I_E is

    p(0) = P(X = 0) = P({e : X(e) = 0}) = P(E^c) = 1 − p,
    p(1) = P(X = 1) = P({e : X(e) = 1}) = P(E) = p,

where p is usually called the success probability. The mean and variance of X ~ B(p) are

    E[X] = (1 − p)·0 + p·1 = p and var(X) = E[X²] − (E[X])² = E[X] − p² = p − p² = p(1 − p),

where we exploited X² ≡ X.

If we perform n independent repetitions of this basic experiment, i.e. n independent trials, then we can talk of another random variable Y, namely the number of successes in these n trials. Y is called a binomial random variable and we indicate its distribution by Y ~ Bin(n, p), again liberally writing P(Y ≤ y) = P(Bin(n, p) ≤ y). For parameters n and p, the probability mass function of Y is (as derived previously)

    p(i) = P(Y = i) = C(n, i) p^i (1 − p)^(n−i) for i = 0, 1, 2, ..., n.

(Footnote 1: With appropriate values for i, n and p you get p(i) via the command dbinom(i,n,p) in R, while pbinom(i,n,p) returns P(Y ≤ i). In EXCEL get these via =BINOMDIST(i,n,p,FALSE) and =BINOMDIST(i,n,p,TRUE), respectively. You may also use the spreadsheet available within the free OpenOffice.)

Example 14 (Coin Flips): Flip 5 fair coins and denote by X the number of heads in these 5 flips. Get the probability mass function of X.

Example 15 (Quality Assurance): A company produces parts. The probability that any given part will be defective is .01. The parts are shipped in batches of 10 and the promise is made that any batch with two or more defectives will be replaced by two new batches of 10 each. What proportion of the batches will need to be replaced? Solution:

    1 − P(X = 0) − P(X = 1) = 1 − (1 − p)^10 − 10p(1 − p)^9 = .0043, where p = .01.

Hence about .4% of the batches will be affected. (Footnote 2: 1-pbinom(1,10,.01) in R and in EXCEL via =1-BINOMDIST(1,10,.01,TRUE).)

Example 16 (Chuck-a-Luck): A player bets on a particular number i = 1, 2, 3, 4, 5, 6 of a fair die. The die is rolled 3 times and if the chosen bet number appears k = 1, 2, 3 times the player wins k units, otherwise he loses 1 unit. If X denotes the payoff, what is the expected value E[X] of the game?

    P(X = −1) = C(3, 0)(1/6)^0(5/6)^3 = 125/216,   P(X = 1) = C(3, 1)(1/6)(5/6)^2 = 75/216,
    P(X = 2) = C(3, 2)(1/6)^2(5/6) = 15/216,   P(X = 3) = C(3, 3)(1/6)^3 = 1/216,

    ⟹ E[X] = (−125 + 75 + 2·15 + 3·1)/216 = −17/216,

with an expected loss of 17/216 ≈ .08 units per game in the long run.
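
Examples 15 and 16 can be verified with dbinom and pbinom (cf. footnotes 1 and 2); this is only a quick check.

    # Example 15: chance that a batch needs replacement
    1 - pbinom(1, 10, .01)                      # about 0.0043
    # Example 16: chuck-a-luck expected payoff
    k <- 0:3
    sum(c(-1, 1, 2, 3) * dbinom(k, 3, 1/6))     # -17/216, about -0.08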

Example 17 (Genetics): A particular trait (eye color or left-handedness) of a person is governed by a particular gene pair, which can be either {d, d}, {d, r} or {r, r}. The dominant gene d dominates over the recessive r, i.e., the trait shows whenever there is a d in the gene pair. An offspring from two parents inherits randomly one gene from each gene pair of its parents. If both parents are hybrids ({d, r}), what is the chance that of 4 offspring at least 3 show the outward appearance of the dominant gene? Solution: p = 3/4 is the probability that any given offspring will have gene pair {d, d} or {d, r}. Hence

    P = C(4, 3)(3/4)^3(1/4) + C(4, 4)(3/4)^4 = 189/256 = .74.

Example 18 (Reliability): On an aircraft we want to compare the reliability (probability of functioning) of a 3-out-of-5 system with a 2-out-of-3 system. A k-out-of-n system functions whenever a majority of the subsystems function properly; usually n is chosen to be odd. We assume that the probability of failure 1 − p is the same for all subsystems and that failures occur independently. A 3-out-of-5 system has a higher reliability than a 2-out-of-3 system whenever

    C(5, 3) p^3 (1 − p)^2 + C(5, 4) p^4 (1 − p) + p^5 > C(3, 2) p^2 (1 − p) + p^3
    ⟺ 3 p^2 (1 − p)^2 (2p − 1) > 0  ⟺  p > 1/2.

[Figure: reliability of the 3-out-of-5 and the 2-out-of-3 system plotted against p.]
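
A short R sketch comparing the two systems of Example 18 at a few illustrative values of p.

    # Example 18: P(at least k of n independent components work)
    rel <- function(k, n, p) 1 - pbinom(k - 1, n, p)
    p <- c(.3, .5, .7)
    rbind(three.of.five = rel(3, 5, p), two.of.three = rel(2, 3, p))
    # the 3-out-of-5 system comes out ahead exactly when p > 1/2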

Mean and Variance of X ~ Bin(n, p): Using the simple identities

    i C(n, i) = n C(n−1, i−1) and i(i−1) C(n, i) = n(n−1) C(n−2, i−2),

we get

    E[X] = Σ_{i=0}^{n} i C(n, i) p^i (1 − p)^(n−i) = np Σ_{i=1}^{n} C(n−1, i−1) p^(i−1) (1 − p)^((n−1)−(i−1))
         (substituting j = i − 1) = np Σ_{j=0}^{n−1} C(n−1, j) p^j (1 − p)^(n−1−j) = np.

Note the connection to Bernoulli RVs X_i, indicating success or failure in the i-th trial, and

    E[X] = E[X_1 + ... + X_n] = E[X_1] + ... + E[X_n] = np.

Expectation of a sum = sum of the individual (finite) expectations.

    E[X(X−1)] = Σ_{i=0}^{n} i(i−1) C(n, i) p^i (1 − p)^(n−i) = n(n−1)p² Σ_{i=2}^{n} C(n−2, i−2) p^(i−2) (1 − p)^((n−2)−(i−2))
              (substituting j = i − 2) = n(n−1)p² Σ_{j=0}^{n−2} C(n−2, j) p^j (1 − p)^(n−2−j) = n(n−1)p²

    ⟹ n(n−1)p² = E[X(X−1)] = E[X² − X] = Σ_x (x² − x) p(x) = Σ_x x² p(x) − Σ_x x p(x) = E[X²] − E[X] = E[X²] − np

    ⟹ E[X²] = np + n(n−1)p² = np(1 − p) + (np)²  ⟹  var(X) = E[X²] − (E[X])² = np(1 − p).

Note again

    var(X) = var(X_1 + ... + X_n) = var(X_1) + ... + var(X_n) = np(1 − p):

the variance of a sum of independent RVs = sum of the (finite) variances of those RVs.

Qualitative Behavior of the Binomial Probability Mass Function: If X is a binomial random variable with parameters (n, p), then the probability mass function p(x) of X first increases monotonically and then decreases monotonically, reaching its largest value when x is the largest integer ≤ (n + 1)p.

Proof: Look at p(x+1)/p(x) = [(n − x)/(x + 1)]·[p/(1 − p)], which is > 1 or < 1 according as (n + 1)p > x + 1 or < x + 1. Of course it is possible that p(x) is entirely monotone (when?). Illustrate with Pascal's triangle.

The Poisson Random Variable

(L16 ends)

Definition: A random variable X with possible values 0, 1, 2, ... is called a Poisson random variable, indicated by X ~ Pois(λ), if for some constant λ > 0 its pmf is given by

    p(i) = P(X = i) = P(Pois(λ) = i) = e^(−λ) λ^i / i! for i = 0, 1, 2, ....

Check summation to 1. (Footnote 3: In R get p(i) = P(X = i) via the command dpois(i,lambda), while P(X ≤ i) is obtained by ppois(i,lambda). In EXCEL you get the same by =POISSON(i,lambda,FALSE) and =POISSON(i,lambda,TRUE), respectively.)
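
Two quick checks in R: the mode claim above for an illustrative Bin(10, .3), and the Poisson pmf summing to 1 for an arbitrarily chosen λ.

    # Mode of Bin(n, p): the pmf peaks at the largest integer <= (n + 1) p
    n <- 10; p <- .3
    which.max(dbinom(0:n, n, p)) - 1     # 3
    floor((n + 1) * p)                   # 3, as claimed
    # Poisson pmf summing to 1 (the tail beyond 200 is negligible here)
    lambda <- 2.5
    sum(dpois(0:200, lambda))            # essentially 1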

Approximation to a binomial random variable for small p and large n: Let X be a binomial rv with parameters n and p. Let n get large and let p get small so that λ = np neither degenerates to 0 nor diverges to ∞. Then

    P(X = i) = [n!/(i!(n−i)!)] p^i (1 − p)^(n−i) = [n!/(i!(n−i)!)] (λ/n)^i (1 − λ/n)^(n−i)
             = [n(n−1)···(n−i+1)/n^i] (λ^i/i!) (1 − λ/n)^n / (1 − λ/n)^i  →  e^(−λ) λ^i / i!.

Since np represents the expected or average number of successes in the n trials represented by the binomial random variable, it should not be surprising that the Poisson parameter λ should be interpreted as the average or expected count for such a Poisson random variable.

Actually, for the approximation to work it can be shown that small p alone is sufficient. In fact, if for i = 1, 2, 3, ..., n the X_i are independent Bernoulli random variables with respective success probabilities p_i, if S = X_1 + ... + X_n, and if Y is a Poisson random variable with parameter λ = Σ_{i=1}^{n} p_i, then

    |P(S ≤ x) − P(Y ≤ x)| ≤ 3 (max(p_1, ..., p_n))^(1/3) for all x,

or one can show that

    |P(S ≤ x) − P(Y ≤ x)| ≤ Σ_{i=1}^{n} p_i² for all x;

see Poisson-Binomial Approximation on the class web page.

A Poisson random variable often serves as a good model for the count of rare events. Examples:

  Number of misprints on a page
  Number of telephone calls coming through an exchange
  Number of wrong numbers dialed
  Number of lightning strikes on commercial aircraft
  Number of bird ingestions into the engine of a jet
  Number of engine failures on a jet
  Number of customers coming into a post office on a given day
  Number of meteoroids striking an orbiting space station
  Number of α particles discharged from some radioactive source.

Example 19 (Typos): Let X be the number of typos on a single page of a given book. Assume that X is Poisson with parameter λ = .5, i.e. we expect about half an error per page, or about one error per every two pages. Find the probability of at least one error. Solution: P(X ≥ 1) = 1 − P(X = 0) = 1 − exp(−.5) = .393.

Example 20 (Defectives): A machine produces 10% defective items, i.e. an item coming off the machine has a chance of .1 of being defective. What is the chance that among the next 10 items coming off the machine we find at most one defective item? Solution: Let X be the number of defective items among the 10. Then

    P(X ≤ 1) = P(X = 0) + P(X = 1) = C(10, 0)(.1)^0(.9)^10 + C(10, 1)(.1)^1(.9)^9 = .7361,

whereas using a Poisson random variable Y with parameter λ = 10(.1) = 1 we get

    P(Y ≤ 1) = P(Y = 0) + P(Y = 1) = e^(−1) + e^(−1) = .7358.
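
Example 20 in R, together with the second error bound quoted above (here Σ p_i² = 10 × .1² = .1).

    # exact binomial vs Poisson approximation (cf. footnote 3)
    pbinom(1, 10, .1)                              # 0.7361
    ppois(1, 1)                                    # 0.7358
    abs(pbinom(1, 10, .1) - ppois(1, 1)) <= .1     # the bound easily covers the discrepancy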

Mean and Variance of the Poisson Distribution: Based on the approximation of the Binomial(n, p) by a Poisson(λ = np) distribution when p is small, we would expect that E[Y] ≈ np = λ and var(Y) ≈ np(1 − p) ≈ λ. We now show that these approximations are in fact exact.

    E[Y] = Σ_{i=0}^{∞} i e^(−λ) λ^i/i! = λ Σ_{i=1}^{∞} e^(−λ) λ^(i−1)/(i−1)! = λ Σ_{j=0}^{∞} e^(−λ) λ^j/j! = λ,

    E[Y²] = Σ_{i=0}^{∞} i² e^(−λ) λ^i/i! = λ Σ_{i=1}^{∞} i e^(−λ) λ^(i−1)/(i−1)! = λ Σ_{j=0}^{∞} (j+1) e^(−λ) λ^j/j!
          = λ [Σ_{j=0}^{∞} j e^(−λ) λ^j/j! + Σ_{j=0}^{∞} e^(−λ) λ^j/j!] = λ(λ + 1) = λ² + λ

    ⟹ var(Y) = E[Y²] − (E[Y])² = λ² + λ − λ² = λ.

Poisson Distribution for Events in Time (Another Justification): Sometimes we observe random incidents occurring in time, e.g. arrivals of customers, meteoroids, lightning, etc. Quite often these random phenomena appear to satisfy the following basic assumptions for some positive constant λ:

1. The probability that exactly one incident occurs during an interval of length h is λh + o(h), where o(h) is a function of h which goes to 0 faster than h, i.e. o(h)/h → 0 as h → 0 (e.g. o(h) = h²). The concept/notation of o(h) was introduced by Edmund Landau.

(L17 ends)

2. The probability that two or more incidents occur in an interval of length h is the same for all such intervals and equal to o(h). No clustering of incidents!

3. For any integers n, j_1, ..., j_n and any set of nonoverlapping intervals, the events E_1, ..., E_n, with E_i denoting the occurrence of exactly j_i incidents in the i-th interval, are independent.

If N(t) denotes the number of incidents in a given interval of length t, then it can be shown that N(t) is a Poisson random variable with parameter λt, i.e. P(N(t) = k) = e^(−λt)(λt)^k/k!.

Proof: Take as time interval [0, t] and divide it into n equal parts. P(N(t) = k) = P(k of the subintervals contain exactly one incident and n − k contain 0 incidents) + P(N(t) = k and at least one subinterval contains two or more incidents). The second probability can be bounded by

    Σ_{i=1}^{n} P(i-th subinterval contains at least two incidents) ≤ n o(t/n) = t·[o(t/n)/(t/n)] → 0.

The probability of 0 incidents in a particular subinterval of length t/n is 1 − [λ(t/n) + o(t/n)], so that the first probability above becomes (in cavalier fashion, not quite air tight; see Poisson-Binomial Approximation on the class web page for a clean argument)

    [n!/(k!(n−k)!)] [λt/n + o(t/n)]^k [1 − λt/n − o(t/n)]^(n−k),

which converges to exp(−λt)(λt)^k/k!.

Example 21 (Space Debris): It is estimated that the space station will be hit by space debris beyond a critical size and velocity on the average about once in 500 years. What is the chance that the station will survive the first 20 years without such a hit? Solution: With T = 500 we have λT = 1, or λ = 1/500. Now t = 20 and P(N(t) = 0) = exp(−λt) = exp(−20/500) = .9608.
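
Example 21 in R, as a quick check.

    # probability of no debris hit in the first 20 years at a rate of one hit per 500 years
    lambda <- 1/500
    exp(-lambda * 20)        # 0.9608
    dpois(0, lambda * 20)    # same value, via the Poisson pmf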

Geometric, Negative Binomial and Hypergeometric Random Variables

Definition: In independent trials with success probability p, the number X of trials required to get the first success is called a geometric random variable. We write X ~ Geo(p) to indicate its distribution. Its probability mass function is

    p(n) = P(X = n) = P(Geo(p) = n) = (1 − p)^(n−1) p for n = 1, 2, 3, ....

Check summation to 1. Some texts (and software, e.g., R and EXCEL as a special negative binomial) treat X_0 = X − 1 = number of failures before the first success as the geometric RV. Then P(X_0 = n) = P(X = n + 1) = (1 − p)^n p for n = 0, 1, 2, ....

Example 22 (Urn Problem): An urn contains N white and M black balls. Balls are drawn with replacement until the first black ball is obtained. Find P(X = n) and P(X ≥ k), the latter in two ways. The probability of success is p = M/(M + N), so P(X = n) = (1 − p)^(n−1) p and P(X ≥ k) = (1 − p)^(k−1), since

    P(X ≥ k) = Σ_{i=k}^{∞} (1 − p)^(i−1) p = (1 − p)^(k−1) Σ_{i=k}^{∞} (1 − p)^(i−k) p
             = p(1 − p)^(k−1) Σ_{j=0}^{∞} (1 − p)^j = p(1 − p)^(k−1)·1/(1 − (1 − p)) = (1 − p)^(k−1).

Mean and Variance of X ~ Geo(p):

    E[X] − 1 = E[X − 1] = Σ_{n=1}^{∞} (n − 1)(1 − p)^(n−1) p = (1 − p) Σ_{n=2}^{∞} (n − 1)(1 − p)^(n−2) p
             = (1 − p) Σ_{i=1}^{∞} i(1 − p)^(i−1) p = (1 − p) E[X]

    ⟹ E[X](1 − (1 − p)) = 1, or E[X] = 1/p.

This fits intuition: if p = 1/1000, then it takes on average 1/p = 1000 trials to see one success.

(L18 ends)

    E[X²] − 2E[X] + 1 = E[(X − 1)²] = Σ_{n=1}^{∞} (n − 1)²(1 − p)^(n−1) p = (1 − p) Σ_{n=2}^{∞} (n − 1)²(1 − p)^(n−2) p
                      = (1 − p) Σ_{i=1}^{∞} i²(1 − p)^(i−1) p = (1 − p) E[X²]

    ⟹ E[X²](1 − (1 − p)) = 2E[X] − 1 = 2/p − 1, or E[X²] = 2/p² − 1/p,

    or var(X) = E[X²] − (E[X])² = 2/p² − 1/p − 1/p² = (1 − p)/p².

Definition: In independent trials with success probability p, the number X of trials required to accumulate the first r successes is called a negative binomial random variable. We write X ~ NegBin(r, p) to indicate its distribution. Its probability mass function is

    p(n) = P(X = n) = P(NegBin(r, p) = n) = C(n−1, r−1)(1 − p)^(n−r) p^r for n = r, r+1, r+2, ....
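
Before continuing with the negative binomial, here is how the geometric formulas look in R; note that dgeom counts failures before the first success (the X_0 convention), and p = .2 is just an illustrative value.

    p <- .2
    n <- 1:5
    dgeom(n - 1, p)                          # P(X = n) for the trial-count version
    (1 - p)^(n - 1) * p                      # same values, from the formula
    c(mean = 1/p, variance = (1 - p)/p^2)    # E[X] and var(X)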

For r = 1 we get the geometric distribution as a special case. Exploiting the equivalence of the two statements "it takes at least m trials to get r successes" and "in the first m − 1 trials we have at most r − 1 successes", we have

    P(NegBin(r, p) ≥ m) = 1 − P(NegBin(r, p) ≤ m − 1) = P(Bin(m − 1, p) ≤ r − 1).   (1)

This facilitates the computation of negative binomial cumulative probabilities in terms of appropriate binomial cumulative probabilities.

We can view X as the sum of r independent geometric random variables Y_1, ..., Y_r, each with success probability p. Here Y_1 denotes the number of trials to the first success, Y_2 the number of additional trials to the next success thereafter, and so on. Clearly, for i_1, ..., i_r ∈ {1, 2, 3, ...} we have

    P(Y_1 = i_1, ..., Y_r = i_r) = P(Y_1 = i_1) ··· P(Y_r = i_r)   (2)

since the individual statements concern what specifically happens in the first i_1 + ... + i_r trials, all of which are independent, namely: we have i_1 − 1 failures, then a success, then i_2 − 1 failures, then a success, and so on. From (2) it follows that for E_1, ..., E_r ⊂ {1, 2, 3, ...} we have

    P(Y_1 ∈ E_1, ..., Y_r ∈ E_r) = Σ_{i_1 ∈ E_1} ··· Σ_{i_r ∈ E_r} P(Y_1 = i_1, ..., Y_r = i_r)
                                 = Σ_{i_1 ∈ E_1} ··· Σ_{i_r ∈ E_r} P(Y_1 = i_1) ··· P(Y_r = i_r)
            (distributive law of arithmetic) = [Σ_{i_1 ∈ E_1} P(Y_1 = i_1)] ··· [Σ_{i_r ∈ E_r} P(Y_r = i_r)]
                                 = P(Y_1 ∈ E_1) ··· P(Y_r ∈ E_r).

The same holds for any subset of the Y_1, ..., Y_r, since (2) also holds for any subset. For example, summing the left and right side over all i_1 = 1, 2, 3, ... yields

    Σ_{i_1=1}^{∞} P(Y_1 = i_1, ..., Y_r = i_r) = Σ_{i_1=1}^{∞} P(Y_1 = i_1) ··· P(Y_r = i_r)
    P(Y_1 < ∞, Y_2 = i_2, ..., Y_r = i_r) = P(Y_1 < ∞) P(Y_2 = i_2) ··· P(Y_r = i_r)
    P(Y_2 = i_2, ..., Y_r = i_r) = P(Y_2 = i_2) ··· P(Y_r = i_r),

and similarly by summing over any other and further indices. In particular we get

    1 = P(Y_1 < ∞) ··· P(Y_r < ∞) = P(Y_1 < ∞, ..., Y_r < ∞) ≤ P(Y_1 + ... + Y_r < ∞) = P(X < ∞).

This means that the negative binomial pmf sums to 1, i.e.,

    1 = P(X < ∞) = Σ_{n=r}^{∞} P(X = n) = Σ_{n=r}^{∞} C(n−1, r−1)(1 − p)^(n−r) p^r.

Some texts (and software such as R and EXCEL) treat X_0 = X − r = number of failures prior to the r-th success as the negative binomial RV. Then P(X_0 = n) = P(X = n + r) for n = 0, 1, 2, ....

(L19 ends)

(Footnote 4: P(X_0 = n) and P(X_0 ≤ n) can be obtained in R by the commands dnbinom(n,r,p) and pnbinom(n,r,p), respectively, while in EXCEL use =NEGBINOMDIST(n,r,p) and =1-BINOMDIST(r-1,n+r,p,TRUE) based on (1). E.g., pnbinom(4,5,.2) and =1-BINOMDIST(4,9,0.2,TRUE) return the same value.)
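
Identity (1) can be checked numerically; the values r = 5 and p = .2 echo footnote 4, and m = 10 is an arbitrary choice.

    # P(NegBin(r, p) >= m) = P(Bin(m - 1, p) <= r - 1), using X_0 = X - r as in pnbinom
    r <- 5; p <- .2; m <- 10
    1 - pnbinom(m - r - 1, r, p)    # P(X >= m) = P(X_0 >= m - r)
    pbinom(r - 1, m - 1, p)         # same value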

Example 23 (r Successes Before m Failures): If independent trials are performed with success probability p, what is the chance of getting r successes before m failures? Solution: Let X be the number of trials required to get the first r successes. Then we need to find P(X ≤ m + r − 1) = P(X_0 ≤ m − 1).

Mean and Variance of X ~ NegBin(r, p): Using n C(n−1, r−1) = r C(n, r) and V ~ NegBin(r+1, p),

    E[X^k] = Σ_{n=r}^{∞} n^k C(n−1, r−1) p^r (1 − p)^(n−r) = (r/p) Σ_{n=r}^{∞} n^(k−1) C(n, r) p^(r+1) (1 − p)^(n−r)
           (substituting m = n + 1) = (r/p) Σ_{m=r+1}^{∞} (m − 1)^(k−1) C(m−1, r) p^(r+1) (1 − p)^(m−(r+1))
           = (r/p) E[(V − 1)^(k−1)]

    ⟹ E[X] = r/p and E[X²] = (r/p) E[V − 1] = (r/p)((r+1)/p − 1)
    ⟹ var(X) = (r/p)((r+1)/p − 1) − (r/p)² = r(1 − p)/p².

If we write X again as X = Y_1 + ... + Y_r with independent Y_i ~ Geo(p), i = 1, ..., r, we note again

    E[X] = E[Y_1 + ... + Y_r] = E[Y_1] + ... + E[Y_r] = r/p,
    var(X) = var(Y_1 + ... + Y_r) = var(Y_1) + ... + var(Y_r) = r(1 − p)/p².

Definition: If a sample of size n is chosen randomly and without replacement from an urn containing N balls, of which M = Np are white and N − M = N − Np are black, then the number X of white balls in the sample is called a hypergeometric random variable. To indicate its distribution we write X ~ Hyper(n, M, N). Its possible values are x = 0, 1, ..., n with pmf

    p(k) = P(X = k) = C(M, k) C(N−M, n−k) / C(N, n),   (3)

which is positive only if 0 ≤ k ≤ M and 0 ≤ n − k ≤ N − M, i.e. if max(0, n − N + M) ≤ k ≤ min(n, M).

(Footnote 5: In R we can obtain P(X = k) and P(X ≤ k) by the commands dhyper(k,M,N-M,n) and phyper(k,M,N-M,n), respectively. EXCEL only gives P(X = k) directly, via =HYPGEOMDIST(k,n,M,N). For example, for M = 40, N = 100, n = 30 and k = 15, dhyper(15,40,60,30) and =HYPGEOMDIST(15,30,40,100) return P(X = 15), while phyper(15,40,60,30) returns P(X ≤ 15).)

Expression (3) also applies when drawing the n balls one by one without replacement, since then

    P(X = k) = C(n, k) [M(M−1)···(M−k+1)(N−M)(N−M−1)···(N−M−(n−k)+1)] / [N(N−1)···(N−n+1)] = C(M, k) C(N−M, n−k) / C(N, n).
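
The example from footnote 5, run in R.

    # hypergeometric probabilities: M = 40 white, N - M = 60 black, n = 30 drawn
    dhyper(15, 40, 60, 30)     # P(X = 15)
    phyper(15, 40, 60, 30)     # P(X <= 15)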

Example 24 (Animal Counts): r animals are caught, tagged and released. After a reasonable time interval n animals are captured and the number X of tagged ones among them is counted. The total number N of animals is unknown. Then

    p_N(i) = P(X = i) = C(r, i) C(N−r, n−i) / C(N, n).

Find the N which maximizes this p_N(i) for the observed value X = i. Since

    p_N(i)/p_{N−1}(i) = (N − r)(N − n) / [N(N − r − n + i)] ≥ 1 if and only if N ≤ rn/i,

our maximum likelihood estimate is N̂ = largest integer ≤ rn/i. Another way of motivating this estimate is to appeal to r/N ≈ i/n.

Example 25 (Quality Control): Shipments of 1000 items each are inspected by selecting 10 items without replacement. If the sample contains more than one defective, then the whole shipment is rejected. What is the chance of rejecting a shipment if at most 5% of the shipment is bad? In the worst case of 50 bad items, the probability of no rejection is

    P(X = 0) + P(X = 1) = [C(50, 0) C(950, 10) + C(50, 1) C(950, 9)] / C(1000, 10),

and one minus this value bounds the chance of rejecting a shipment.

Expectations and Variances of X_1 + ... + X_n: First we prove a basic alternate formula for the expectation of a single random variable X:

    E[X] = Σ_x x p_X(x) = Σ_{s ∈ S} X(s) p(s),

where the first expression involves the pmf p_X(x) of X and sums over all possible values x of X, and the second expression involves the probabilities p(s) = P({s}) of all elements s in the sample space S. The equivalence is seen as follows. For any of the possible values x of X let S_x = {s ∈ S : X(s) = x}. For different values x the events/sets S_x are disjoint and their union over all x is S. Thus

    Σ_x x p_X(x) = Σ_x x P({s : X(s) = x}) = Σ_x x Σ_{s ∈ S_x} p(s) = Σ_x Σ_{s ∈ S_x} x p(s) = Σ_x Σ_{s ∈ S_x} X(s) p(s) = Σ_{s ∈ S} X(s) p(s).

From this we get immediately

    E[X_1 + ... + X_n] = Σ_s (X_1(s) + ... + X_n(s)) p(s) = Σ_s [X_1(s) p(s) + ... + X_n(s) p(s)]
                       = Σ_s X_1(s) p(s) + ... + Σ_s X_n(s) p(s) = E[X_1] + ... + E[X_n],

provided the individual expectations are finite.
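
Returning to Example 24, the estimate N̂ can be checked against a brute-force maximization of the hypergeometric likelihood in R; the counts r = 100 tagged, n = 50 recaptured and i = 6 tagged among the recaptured are hypothetical.

    r <- 100; n <- 50; i <- 6
    floor(r * n / i)                          # Nhat = largest integer <= rn/i
    N <- 150:3000
    N[which.max(dhyper(i, r, N - r, n))]      # brute-force maximization agrees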

Next we address a corresponding formula for the variance of a sum of independent discrete random variables X_1, ..., X_n, namely

    var(X_1 + ... + X_n) = var(X_1) + ... + var(X_n),

provided the individual variances are finite. First we need to define the concept of independence for a pair of random variables X and Y, in concordance with the previously introduced independence of events. X and Y are independent whenever for all possible values x and y of X and Y we have

    P(X = x, Y = y) = P({s ∈ S : X(s) = x, Y(s) = y}) = P({s ∈ S : X(s) = x}) P({s ∈ S : Y(s) = y}) = P(X = x) P(Y = y).

As a consequence we have for independent X and Y with finite expectations the following property:

    E[XY] = E[X]E[Y], i.e., E[XY] − E[X]E[Y] = cov(X, Y) = 0,

where cov(X, Y) is the covariance of X and Y, equivalently defined as

    cov(X, Y) = E[(X − E[X])(Y − E[Y])] = E[XY − X E[Y] − Y E[X] + E[X]E[Y]]
              = E[XY] − E[X]E[Y] − E[X]E[Y] + E[X]E[Y] = E[XY] − E[X]E[Y].

Proof that independence ⟹ cov(X, Y) = 0: Let S_xy = {s ∈ S : X(s) = x, Y(s) = y}. Then

    E[XY] = Σ_s X(s) Y(s) p(s) = Σ_{x,y} Σ_{s ∈ S_xy} X(s) Y(s) p(s)   (by stepwise summation)
          = Σ_{x,y} Σ_{s ∈ S_xy} x y p(s) = Σ_{x,y} x y Σ_{s ∈ S_xy} p(s)   (by the distributive law)
          = Σ_{x,y} x y P(X = x, Y = y) = Σ_{x,y} x y P(X = x) P(Y = y)   (by independence)
          = [Σ_x x P(X = x)] [Σ_y y P(Y = y)] = E[X]E[Y]   (by the distributive law).

Now

    E[(X_1 + ... + X_n)²] = E[Σ_i X_i² + 2 Σ_{i<j} X_i X_j] = Σ_i E[X_i²] + 2 Σ_{i<j} E[X_i X_j],

    (E[X_1 + ... + X_n])² = (E[X_1] + ... + E[X_n])² = Σ_i (E[X_i])² + 2 Σ_{i<j} E[X_i]E[X_j],

    var(X_1 + ... + X_n) = E[(X_1 + ... + X_n)²] − (E[X_1 + ... + X_n])²
                         = Σ_i (E[X_i²] − (E[X_i])²) + 2 Σ_{i<j} (E[X_i X_j] − E[X_i]E[X_j])
                         = Σ_i var(X_i) + 2 Σ_{i<j} cov(X_i, X_j) = Σ_i var(X_i),

where the last equality holds under pairwise independence of the X_i and X_j for i < j.
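
A small simulation check of these rules for two independent (and arbitrarily chosen) RVs.

    # independent RVs: covariance near 0 and variances that add
    set.seed(2)
    X <- rbinom(1e5, 10, .3)
    Y <- rpois(1e5, 4)
    cov(X, Y)                        # close to 0
    c(var(X + Y), var(X) + var(Y))   # nearly equal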

The above rules for the mean and variance of Y = X_1 + ... + X_n are now illustrated for two situations. Let X_1, ..., X_n be indicator RVs indicating a success or failure in the i-th of n trials. In the first situation we assume these trials are independent and have success probability p each. Then, as observed previously, from the mean and variance results for Bernoulli RVs we get

    E[Y] = E(X_1 + ... + X_n) = Σ E[X_i] = np and var(Y) = var(X_1 + ... + X_n) = Σ var(X_i) = np(1 − p).

In the second situation we view the trials in the hypergeometric context, where X_i = 1 when the i-th ball drawn is white and X_i = 0 otherwise. We argued previously that

    P(X_i = 1) = M/N = p = proportion of white balls in the population ⟹ E[Y] = E(X_1 + ... + X_n) = Σ E[X_i] = nM/N.

For var(Y) we need to involve the covariance terms in our formula for var(X_1 + ... + X_n). We find

    E[X_i X_j] = P(X_i = 1, X_j = 1) = [M(M−1)(N−2)···(N−n+1)] / [N(N−1)(N−2)···(N−n+1)] = M(M−1)/[N(N−1)],

    cov(X_i, X_j) = E[X_i X_j] − E[X_i]E[X_j] = M(M−1)/[N(N−1)] − (M/N)² = −(M/N)[(N−M)/N]/(N−1) = −p(1 − p)/(N−1),

    var(Y) = var(X_1 + ... + X_n) = Σ var(X_i) + 2 Σ_{i<j} cov(X_i, X_j) = np(1 − p) − 2 C(n, 2) p(1 − p)/(N−1)
           = np(1 − p) [1 − (n−1)/(N−1)] = np(1 − p) (N − n)/(N − 1).

The factor 1 − (n−1)/(N−1) is called the finite population correction factor. For fixed n it gets close to 1 when N is large, in which case it does not matter much whether we draw with or without replacement. One easily shows (exercise, or see Text p. 16) that

    P(Hyper(n, M, N) = k) → P(Bin(n, p) = k) = C(n, k) p^k (1 − p)^(n−k) as N → ∞, where p = M/N.

We will now pull forward material from Ch. 8, namely the inequalities of Markov and Chebychev. (Footnote 6: Scholz - Lehmann - Neyman - Sierpinsky - Voronoy - Markov - Chebychev.)

Markov's Inequality: Let X be a nonnegative discrete RV with finite expectation E[X]. Then for any a > 0 we have

    P(X ≥ a) ≤ E[X]/a.

Proof: E[X] = Σ_{x ≥ a} x p_X(x) + Σ_{x < a} x p_X(x) ≥ Σ_{x ≥ a} x p_X(x) ≥ Σ_{x ≥ a} a p_X(x) = a P(X ≥ a).

Markov's inequality is only meaningful for a > E[X]. It limits the probability far beyond the mean or expectation of X, in concordance with our previous center-of-gravity interpretation of E[X].
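
As a numerical check of the hypergeometric mean and variance formulas just derived, here is a short R sketch reusing the numbers from footnote 5.

    # E[Y] = np and var(Y) = np(1-p)(N-n)/(N-1) for Hyper(n, M, N)
    N <- 100; M <- 40; n <- 30; p <- M/N
    k <- 0:n
    pk <- dhyper(k, M, N - M, n)
    sum(k * pk)                             # 12 = np
    sum((k - n*p)^2 * pk)                   # matches the formula below
    n * p * (1 - p) * (N - n)/(N - 1)       # about 5.09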

While Markov's inequality is usually quite crude, it can be sharp, i.e., result in equality. Namely, let X take the two values 0 and a with probabilities 1 − p and p. Then p = P(X ≥ a) = E[X]/a.

Chebychev's Inequality: Let X be a discrete RV with finite variance E[(X − μ)²] = σ². Then for any k > 0 we have

    P(|X − μ| ≥ k) ≤ σ²/k².

Proof: By Markov's inequality, using Y = (X − μ)² as our nonnegative RV,

    P(|X − μ| ≥ k) = P((X − μ)² ≥ k²) ≤ E[(X − μ)²]/k² = σ²/k².

These inequalities also hold for RVs that are not discrete, but why wait that long for the following.

We will now combine the above results into a theorem that proves the long-run frequency notion that we have alluded to repeatedly, in particular when introducing the axioms of probability. Let X̄ = (X_1 + ... + X_n)/n be the average of n independent and identically distributed random variables (telegraphically expressed as iid RVs), each with the same mean μ = E[X_i] and variance σ² = var(X_i). Such random variables can be the result of repeatedly observing a random variable X in independent repetitions of the same random experiment, like repeatedly tossing a coin or rolling a die, and denoting the resulting RVs by X_1, ..., X_n. Using the rules of expectation and variance (under independence) we have

    E[X̄] = (1/n) E[X_1 + ... + X_n] = (1/n)(μ + ... + μ) = μ,
    var(X̄) = (1/n²) var(X_1 + ... + X_n) = (1/n²)(σ² + ... + σ²) = σ²/n,

and by Chebychev's inequality applied to X̄ we get for any ε > 0

    P(|X̄ − μ| ≥ ε) ≤ σ²/(n ε²) → 0 as n → ∞,

i.e., the probability that X̄ will differ from μ by at least ε > 0 becomes vanishingly small. We say X̄ converges to μ in probability and write X̄ →P μ as n → ∞. This result is called the weak law of large numbers (WLLN, or LLN without emphasis on weak).

When our random experiment consists of observing whether a certain event E occurs or not, we observe an indicator variable X = I_E with values 1 and 0. If we repeatedly do this experiment (independently), we observe X_1, ..., X_n, each with mean μ = p = P(E) and variance σ² = p(1 − p). In that case X̄ is the proportion of 1's among the X_1, ..., X_n, i.e., the proportion of times we observe the event E. The above law of large numbers gives us X̄ →P μ = p = P(E) as n → ∞, i.e., in the long run the observed proportion or relative frequency of observing the event E converges to P(E).
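
A small simulation of the law of large numbers for an event with P(E) = .3 (an arbitrary illustrative value), together with the Chebychev bounds.

    # relative frequencies approach P(E) as n grows
    set.seed(1)
    p <- .3
    n <- c(10, 100, 1000, 10000)
    sapply(n, function(m) mean(rbinom(m, 1, p)))   # sample proportions settle near .3
    p * (1 - p)/(n * .05^2)    # Chebychev bounds on P(|Xbar - p| >= .05); not useful (> 1) for small n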

Properties of Distribution Functions F:

1. F is nondecreasing, i.e. F(a) ≤ F(b) for a, b with a ≤ b.
2. lim_{b → ∞} F(b) = 1.
3. lim_{b → −∞} F(b) = 0.
4. F is right continuous, i.e. if b_n ↓ b then F(b_n) ↓ F(b), or lim_{n → ∞} F(b_n) = F(b).

Proof: 1. For a ≤ b we have {e : X(e) ≤ a} ⊂ {e : X(e) ≤ b}. 2., 3. and 4. follow from P(lim_n E_n) = lim_n P(E_n) for properly chosen monotone sequences E_n. E.g., if b_n ↓ b then

    E_n = {e : X(e) ≤ b_n} ↓ E = {e : X(e) ≤ b} = ∩_{n=1}^{∞} E_n.

All probability questions about X can be answered in terms of the cdf F of X. For example,

    P(a < X ≤ b) = F(b) − F(a) for all a ≤ b,
    P(X < b) = lim_{n → ∞} F(b − 1/n) =: F(b−),
    F(b) = P(X ≤ b) = P(X < b) + P(X = b) = F(b−) + (F(b) − F(b−)).


More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

SDS 321: Introduction to Probability and Statistics

SDS 321: Introduction to Probability and Statistics SDS 321: Introduction to Probability and Statistics Lecture 14: Continuous random variables Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin www.cs.cmu.edu/

More information

Math/Stat 352 Lecture 8

Math/Stat 352 Lecture 8 Math/Stat 352 Lecture 8 Sections 4.3 and 4.4 Commonly Used Distributions: Poisson, hypergeometric, geometric, and negative binomial. 1 The Poisson Distribution Poisson random variable counts the number

More information

1 Variance of a Random Variable

1 Variance of a Random Variable Indian Institute of Technology Bombay Department of Electrical Engineering Handout 14 EE 325 Probability and Random Processes Lecture Notes 9 August 28, 2014 1 Variance of a Random Variable The expectation

More information

3 Multiple Discrete Random Variables

3 Multiple Discrete Random Variables 3 Multiple Discrete Random Variables 3.1 Joint densities Suppose we have a probability space (Ω, F,P) and now we have two discrete random variables X and Y on it. They have probability mass functions f

More information

It can be shown that if X 1 ;X 2 ;:::;X n are independent r.v. s with

It can be shown that if X 1 ;X 2 ;:::;X n are independent r.v. s with Example: Alternative calculation of mean and variance of binomial distribution A r.v. X has the Bernoulli distribution if it takes the values 1 ( success ) or 0 ( failure ) with probabilities p and (1

More information

Discrete Probability

Discrete Probability MAT 258 Discrete Mathematics Discrete Probability Kenneth H. Rosen and Kamala Krithivasan Discrete Mathematics 7E Global Edition Chapter 7 Reproduced without explicit consent Fall 2016 Week 11 Probability

More information

Debugging Intuition. How to calculate the probability of at least k successes in n trials?

Debugging Intuition. How to calculate the probability of at least k successes in n trials? How to calculate the probability of at least k successes in n trials? X is number of successes in n trials each with probability p # ways to choose slots for success Correct: Debugging Intuition P (X k)

More information

1 Probability and Random Variables

1 Probability and Random Variables 1 Probability and Random Variables The models that you have seen thus far are deterministic models. For any time t, there is a unique solution X(t). On the other hand, stochastic models will result in

More information

2. AXIOMATIC PROBABILITY

2. AXIOMATIC PROBABILITY IA Probability Lent Term 2. AXIOMATIC PROBABILITY 2. The axioms The formulation for classical probability in which all outcomes or points in the sample space are equally likely is too restrictive to develop

More information

Midterm Exam 1 Solution

Midterm Exam 1 Solution EECS 126 Probability and Random Processes University of California, Berkeley: Fall 2015 Kannan Ramchandran September 22, 2015 Midterm Exam 1 Solution Last name First name SID Name of student on your left:

More information

Random Variables Example:

Random Variables Example: Random Variables Example: We roll a fair die 6 times. Suppose we are interested in the number of 5 s in the 6 rolls. Let X = number of 5 s. Then X could be 0, 1, 2, 3, 4, 5, 6. X = 0 corresponds to the

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

Discrete random variables and probability distributions

Discrete random variables and probability distributions Discrete random variables and probability distributions random variable is a mapping from the sample space to real numbers. notation: X, Y, Z,... Example: Ask a student whether she/he works part time or

More information

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3)

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3) 3 Probability Distributions (Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3) Probability Distribution Functions Probability distribution function (pdf): Function for mapping random variables to real numbers. Discrete

More information

Lecture 1: August 28

Lecture 1: August 28 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random

More information

Random Variables and Their Distributions

Random Variables and Their Distributions Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital

More information

CSE 312: Foundations of Computing II Random Variables, Linearity of Expectation 4 Solutions

CSE 312: Foundations of Computing II Random Variables, Linearity of Expectation 4 Solutions CSE 31: Foundations of Computing II Random Variables, Linearity of Expectation Solutions Review of Main Concepts (a Random Variable (rv: A numeric function X : Ω R of the outcome. (b Range/Support: The

More information

Lecture 1: Review on Probability and Statistics

Lecture 1: Review on Probability and Statistics STAT 516: Stochastic Modeling of Scientific Data Autumn 2018 Instructor: Yen-Chi Chen Lecture 1: Review on Probability and Statistics These notes are partially based on those of Mathias Drton. 1.1 Motivating

More information

Name: 180A MIDTERM 2. (x + n)/2

Name: 180A MIDTERM 2. (x + n)/2 1. Recall the (somewhat strange) person from the first midterm who repeatedly flips a fair coin, taking a step forward when it lands head up and taking a step back when it lands tail up. Suppose this person

More information

Notes for Math 324, Part 17

Notes for Math 324, Part 17 126 Notes for Math 324, Part 17 Chapter 17 Common discrete distributions 17.1 Binomial Consider an experiment consisting by a series of trials. The only possible outcomes of the trials are success and

More information

REPEATED TRIALS. p(e 1 ) p(e 2 )... p(e k )

REPEATED TRIALS. p(e 1 ) p(e 2 )... p(e k ) REPEATED TRIALS We first note a basic fact about probability and counting. Suppose E 1 and E 2 are independent events. For example, you could think of E 1 as the event of tossing two dice and getting a

More information

Lecture 4: Random Variables and Distributions

Lecture 4: Random Variables and Distributions Lecture 4: Random Variables and Distributions Goals Random Variables Overview of discrete and continuous distributions important in genetics/genomics Working with distributions in R Random Variables A

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

Random variables and expectation

Random variables and expectation Department of Mathematics Ma 3/103 KC Border Introduction to Probability and Statistics Winter 2017 Lecture 5: Random variables and expectation Relevant textbook passages: Pitman [5]: Sections 3.1 3.2

More information

Lecture 10. Variance and standard deviation

Lecture 10. Variance and standard deviation 18.440: Lecture 10 Variance and standard deviation Scott Sheffield MIT 1 Outline Defining variance Examples Properties Decomposition trick 2 Outline Defining variance Examples Properties Decomposition

More information

Probability. Paul Schrimpf. January 23, Definitions 2. 2 Properties 3

Probability. Paul Schrimpf. January 23, Definitions 2. 2 Properties 3 Probability Paul Schrimpf January 23, 2018 Contents 1 Definitions 2 2 Properties 3 3 Random variables 4 3.1 Discrete........................................... 4 3.2 Continuous.........................................

More information

CSE 312, 2011 Winter, W.L.Ruzzo. 6. random variables

CSE 312, 2011 Winter, W.L.Ruzzo. 6. random variables CSE 312, 2011 Winter, W.L.Ruzzo 6. random variables random variables 23 numbered balls Ross 4.1 ex 1b 24 first head 25 probability mass functions 26 head count Let X be the number of heads observed in

More information

3. DISCRETE RANDOM VARIABLES

3. DISCRETE RANDOM VARIABLES IA Probability Lent Term 3 DISCRETE RANDOM VARIABLES 31 Introduction When an experiment is conducted there may be a number of quantities associated with the outcome ω Ω that may be of interest Suppose

More information

Conditional Probability

Conditional Probability Conditional Probability Idea have performed a chance experiment but don t know the outcome (ω), but have some partial information (event A) about ω. Question: given this partial information what s the

More information

Math Bootcamp 2012 Miscellaneous

Math Bootcamp 2012 Miscellaneous Math Bootcamp 202 Miscellaneous Factorial, combination and permutation The factorial of a positive integer n denoted by n!, is the product of all positive integers less than or equal to n. Define 0! =.

More information