Lecture 10

10.1 Gambler's Ruin Problem

Let X be the payoff of a coin-toss game such that P(X = 1) = P(X = −1) = 1/2. Suppose you start with x dollars and play the game n times. Let X_1, X_2, ..., X_n be the payoffs in each of the games. Your accumulated wealth after n games is

    S_n = x + X_1 + X_2 + ... + X_n.

Each S_n is a discrete random variable. The sequence S_n, n = 1, 2, ..., is called a random walk. If you compute your expected accumulated wealth after n turns, E[S_n] = x, as you would expect: on average you gain nothing from this game.

Suppose that the game ends for you once you have no money left and cannot borrow. This is expressed by the condition S_n = 0, which we refer to as going bankrupt.

Problem. Show that

    P(S_n = 0, S_{n−1} > 0, ..., S_1 > 0 for some n > 0) = 1.

Suppose you have the option to stop or continue playing at any time, but you must stop if you go bankrupt (S_n = 0). Consider the following strategy: set a goal of N dollars, and stop the game when you reach the goal, provided, of course, that you didn't go bankrupt before. Find the probability that you reach the goal before going bankrupt.

Let p_x be the probability of reaching the goal before going bankrupt when you start with x dollars. Use the Law of Total Probability to show that p_x satisfies a system of difference equations:

    p_x = p_{x+1}/2 + p_{x−1}/2,   1 < x < N − 1,
    p_1 = p_2/2,
    p_{N−1} = 1/2 + p_{N−2}/2,
    p_0 = 0,   p_N = 1.

Solve the system of equations assuming p_x = Ax + B, for some A, B. Answer: p_x = x/N.

Let Y be your total winnings (the return) in the above game. Find E[Y]. Answer: E[Y] = (N − x)·x/N − x·(1 − x/N) = 0.
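The answer p_x = x/N can be sanity-checked with a short Monte Carlo simulation. This is a minimal sketch (the function name and trial count are my own choices, not from the notes):

```python
import random

def ruin_probability(x, N, trials=100_000, seed=0):
    """Estimate p_x = P(reach N dollars before going bankrupt),
    starting from x dollars, by simulating the +-1 random walk."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        s = x
        while 0 < s < N:          # play until bankruptcy or goal
            s += rng.choice((-1, 1))
        wins += (s == N)
    return wins / trials

# Theory predicts p_x = x/N, e.g. x = 3, N = 10 gives 0.3.
print(ruin_probability(3, 10))
```

With 100,000 trials the estimate should land within a percent or so of 0.3.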
Find the probability q_x of going bankrupt before reaching the goal, if you start with x dollars. Answer: q_x = 1 − x/N. What happens as N → +∞?

10.2 Continuous Random Variables

Let X be a random variable, i.e., a function from the sample space S into R. The cumulative distribution function (CDF) is defined as F_X(x) = P(X ≤ x).

Definition 1. X is a continuous random variable if F_X(x) is a differentiable function. The derivative f(x) = dF_X/dx is called the probability density function (PDF) of X.

Properties of continuous random variables:
1. The PDF f(x) is a non-negative function.
2. ∫_{−∞}^{+∞} f(x) dx = 1.
3. For any a ≤ b, P(a < X ≤ b) = F_X(b) − F_X(a) = ∫_a^b f(x) dx.
4. For any x, P(X = x) = 0.
5. For any a ≤ b, P(a < X < b) = P(a ≤ X ≤ b) = P(a ≤ X < b) = ∫_a^b f(x) dx.

Definition 2. A random variable X which is neither discrete nor continuous is called a mixed random variable.
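Properties 2 and 3 can be illustrated numerically. A sketch using a made-up density f(x) = 2x on [0, 1] (whose CDF is F(x) = x², chosen here purely for illustration) and a midpoint Riemann sum:

```python
def f(x):
    """Illustrative PDF: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint Riemann sum of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 0, 1)      # property 2: should be 1
p = integrate(f, 0.2, 0.5)      # property 3: F(0.5) - F(0.2) = 0.25 - 0.04 = 0.21
print(total, p)
```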
10.3 Expectation and Variance

1. The expectation of X is defined as the integral

    E[X] = ∫_{−∞}^{+∞} x f(x) dx.

2. The variance and the standard deviation are defined as

    V(X) = ∫_{−∞}^{+∞} (x − E[X])² f(x) dx = ∫_{−∞}^{+∞} x² f(x) dx − (E[X])²,   σ = √V(X).

3. Chebyshev's inequality: for any k > 0,

    P(|X − µ| > k) ≤ σ²/k².

4. Properties of the expectation and variance:

    E[aX + b] = aE[X] + b,   V(aX + b) = a²V(X).

Problem. Prove both formulas. Hint: if the PDF of X is f(x), what is the PDF of Y = aX + b? Find the PDF of Y and its mean E[Y].

Problem (Compound Distribution). Let X_1 be a random variable with CDF F(x), and X_2 a random variable with CDF G(x). Let X be a random variable defined as follows: you flip a biased coin with probability of Tails p. If Tails, you observe X_1 and set X = X_1; if Heads, set X = X_2. Find the CDF of X. X is called a compound random variable. Answer: pF(x) + (1 − p)G(x).

10.4 Uniform Distribution

X is a uniform random variable on the interval [a, b] if its PDF f(x) takes a constant value on [a, b] and is zero otherwise:

    f(x) = { 1/(b − a),   a ≤ x ≤ b,
           { 0,           x < a or x > b.

Properties of the Uniform Distribution
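The compound-distribution answer pF(x) + (1 − p)G(x) lends itself to a quick simulation check. A sketch, with X_1 uniform on [0, 1] and X_2 uniform on [0, 2] chosen purely as an example:

```python
import random

def sample_compound(p, sample1, sample2, rng):
    """Flip a biased coin: Tails (prob p) -> draw from X1, Heads -> X2."""
    return sample1(rng) if rng.random() < p else sample2(rng)

rng = random.Random(1)
p = 0.3
draws = [sample_compound(p,
                         lambda r: r.uniform(0, 1),   # X1: F(x) = x on [0,1]
                         lambda r: r.uniform(0, 2),   # X2: G(x) = x/2 on [0,2]
                         rng)
         for _ in range(200_000)]
est = sum(d <= 0.5 for d in draws) / len(draws)
# Theory: p*F(0.5) + (1-p)*G(0.5) = 0.3*0.5 + 0.7*0.25 = 0.325
print(est)
```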
Find the mean E[X] and the variance V(X) of a uniform random variable. Answer: E[X] = (a + b)/2, V(X) = (b − a)²/12.

Problems

1. Let X be the time (in minutes) at which a bus arrives at the bus stop. Suppose that X is a uniform random variable over the interval from 1:00pm to 2:00pm. You arrive at the bus stop at 1:15pm and find out that the bus has not yet passed. Find the probability that the bus arrives in the next 20 minutes. Hint: you need to find P(15 < X ≤ 35 | X > 15).

2. Let N be a Poisson Pois(λ) random variable that counts the number of customers in a store in the first hour after the store opens. Let X denote the time of arrival of the first customer. Show that conditioned on the event {N = 1},

    P(0 < X ≤ x | N = 1) = x,   x ∈ (0, 1).

This means that conditioned on the fact that there is only one customer in the first hour, the time of her arrival is a uniform random variable. Hint: think of a Poisson random variable as a binomial random variable B(n, λ/n), with n large.

Problem. Suppose a man standing at the point (0, 1) shoots an arrow at a target located at (0, 0) by choosing an angle uniformly at random from [−π/2, π/2]. Let X denote the distance along the x-axis from the origin to the point where the arrow lands. Find the PDF of this random variable. Compute E[X]. A random variable with such a PDF is called a Cauchy random variable. A random variable that has either infinite expectation or infinite variance is sometimes referred to as a heavy-tailed random variable (or distribution).

10.5 Exponential Random Variable

Let X be a Poisson random variable Pois(λ), which counts the number of occurrences of some event in a unit of time, starting at time zero. Let T be the time at which the first event occurs. For time t > 0, what is the probability P(T > t)?

One way to find this probability is the following. Let Y count the same events in the interval [0, t]. Y is Poisson Pois(λt). (Why? This can be shown using a binomial approximation of the Poisson distribution.) Then

    P(T > t) = P(Y = 0).
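Problem 1 can be estimated empirically before solving it analytically. A sketch (time is measured in minutes after 1:00pm, an assumption of this example):

```python
import random

rng = random.Random(2)
# Bus arrival uniform over the hour: 0 to 60 minutes after 1:00pm.
arrivals = [rng.uniform(0, 60) for _ in range(500_000)]
# Condition on the bus not having passed by 1:15pm (X > 15).
late = [t for t in arrivals if t > 15]
# Among those, estimate P(bus within the next 20 minutes), i.e. X <= 35.
est = sum(t <= 35 for t in late) / len(late)
# The conditional distribution is uniform on (15, 60], so theory gives 20/45 = 4/9.
print(est)
```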
Since

    P(Y = 0) = e^{−λt},

we conclude that P(T > t) = e^{−λt}.
The CDF of T is

    F_T(t) = 1 − e^{−λt},   t > 0,

and zero otherwise. The PDF of T is

    f(t) = { λe^{−λt},   t > 0,          (10.1)
           { 0,          t ≤ 0.

Definition 3. T with PDF (10.1) is called an exponential random variable, Exp(1/λ). The textbook uses the parameter θ = 1/λ in the formula for f(t). θ has the meaning of an average waiting time, since E[T] = θ.

Problem. Show that the time between the i-th and (i + 1)-th events is exponential with PDF (10.1).

10.5.1 Properties of the Exponential Distribution

Problem. Find the mean E[X] and the variance V(X) of an exponential random variable. Answer: E[X] = 1/λ, V(X) = 1/λ².

Problem. The exponential distribution is the unique continuous distribution that has the property of being memoryless: for any t, s ≥ 0,

    P(T > t + s | T > s) = P(T > t).

Verify this property.

Problem. Suppose that the number of miles a car can run before its battery wears out is an exponential random variable with an average value of 10,000 miles. Find the probability that the car is still operational after the first 5,000 miles. Find the probability that the battery will still work after 25,000 miles, given that it didn't fail during the first 20,000 miles. Answer: both probabilities are the same and equal e^{−1/2} ≈ 0.6065.

10.6 Normal Distribution

X is called a normal random variable with mean µ and standard deviation σ if its PDF is

    f(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}.

Notation: X ∼ N(µ, σ).
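The battery problem and the memoryless property can be checked in a few lines, using the survival function P(T > t) = e^{−λt}:

```python
import math

theta = 10_000          # average miles before failure (theta = 1/lambda)
lam = 1 / theta

def survival(t):
    """P(T > t) for an exponential random variable with rate lam."""
    return math.exp(-lam * t)

p1 = survival(5_000)                        # P(T > 5000)
p2 = survival(25_000) / survival(20_000)    # P(T > 25000 | T > 20000)
# Memorylessness: both equal e^{-1/2} ≈ 0.6065.
print(p1, p2)
```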
Properties

If X is N(µ, σ), then Y = aX + b is N(aµ + b, |a|σ).

Standard Normal Random Variable

Z ∼ N(0, 1) is called the standard normal distribution. If X is N(µ, σ), then Z = (X − µ)/σ is a standard normal random variable. It is common to find the values of the probability

    P(0 ≤ Z ≤ z) = ∫_0^z (1/√(2π)) e^{−u²/2} du = Φ(z)

given in a table of the standard normal variable, or programmed in a calculator.

Problems

1. Let X be a normal random variable N(10, 2). Using the table values of Φ(z), determine P(8 ≤ X ≤ 13). Answer: P = 0.7745.

2. (Prob. 5.87 from the textbook.) A machine produces steel shafts whose diameter has a normal distribution with mean 1.005 inches and standard deviation 0.01 inch. Quality control requires diameters to fit into the interval 1.00 ± 0.02 inches. What percentage of the output will fail quality control? Answer: 7.3%.

Three Percentiles

The following rules are often used to estimate the probabilities of a normal random variable X:

    P(µ − σ ≤ X ≤ µ + σ) ≈ 68%,
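Instead of a printed table, the standard normal CDF can be evaluated with the error function; the two problems above then reduce to standardizing and subtracting. A sketch:

```python
import math

def norm_cdf(z):
    """Standard normal CDF P(Z <= z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Problem 1: X ~ N(10, 2); standardizing 8 and 13 gives z in [-1, 1.5].
p = norm_cdf(1.5) - norm_cdf(-1.0)

# Problem 2: X ~ N(1.005, 0.01); spec interval [0.98, 1.02] gives z in [-2.5, 1.5].
fail = 1 - (norm_cdf(1.5) - norm_cdf(-2.5))

print(round(p, 4), round(fail, 3))   # ≈ 0.7745 and ≈ 0.073
```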
    P(µ − 2σ ≤ X ≤ µ + 2σ) ≈ 95%,
    P(µ − 3σ ≤ X ≤ µ + 3σ) ≈ 99.7%.

Normal Approximation of the Binomial

Let X be a binomial B(n, p) random variable, so E[X] = np and σ = √(np(1 − p)). The random variable

    (X − np)/√(np(1 − p))

has zero mean and standard deviation 1. Let Z be a standard normal variable.

Theorem (De Moivre–Laplace theorem). For any a ≤ b, as n → +∞,

    P(a ≤ (X − np)/√(np(1 − p)) ≤ b) → P(a ≤ Z ≤ b).

The approximation is good when np(1 − p) ≥ 10.

Problem. An unknown fraction p of a population are smokers. A random sample with replacement of size n is taken. How large should n be so that the error in estimating p is less than 0.5%?

Let X be the number of smokers in the sample. X is a binomial B(n, p) random variable. The estimated fraction of smokers is X/n. We would like to have

    |X/n − p| ≤ 0.005

with high probability. For that, we select a confidence level, say 0.95, and require that

    P(|X/n − p| ≤ 0.005) ≥ 0.95.
Rewrite the probability as

    P(|X/n − p| ≤ 0.005) = P(−0.005n ≤ X − np ≤ 0.005n)
        = P(−0.005√n/√(p(1 − p)) ≤ (X − np)/√(np(1 − p)) ≤ 0.005√n/√(p(1 − p))).

By the De Moivre–Laplace theorem the last probability is well approximated by

    P(−0.005√n/√(p(1 − p)) ≤ Z ≤ 0.005√n/√(p(1 − p))) = 2Φ(0.005√n/√(p(1 − p))),

where Z is a standard normal variable. From the table we find that we need

    0.005√n/√(p(1 − p)) ≥ 1.96,   i.e.,   √n ≥ 392√(p(1 − p)).

But for any p ∈ [0, 1], p(1 − p) ≤ 1/4, so √(p(1 − p)) ≤ 1/2. Thus it is enough to take √n ≥ 392/2 = 196, i.e., n ≥ 38,416.

References

[1] M. Bean, Probability: The Science of Uncertainty, AMS, 2009.
[2] Sh. Ross, A First Course in Probability, 6th edition, Prentice Hall, 2002.
[3] W. Feller, An Introduction to Probability Theory and Its Applications, vol. 1, 3rd edition, Wiley, 1968.
[4] E.T. Jaynes, Probability Theory: The Logic of Science, Cambridge University Press, 2003.
[5] R. Scheaffer and L. Young, Introduction to Probability and Its Applications, 3rd edition, Cengage Learning, 2009.
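The worst-case sample-size bound above is a one-line computation:

```python
import math

z = 1.96        # two-sided 95% critical value for the standard normal
eps = 0.005     # desired error in estimating p
# sqrt(n) >= (z/eps) * sqrt(p(1-p)); the worst case is p(1-p) = 1/4,
# giving n = (392)^2 / 4 = 38,416.
n = (z / eps) ** 2 * 0.25
print(round(n))
```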