Lecture 8: Continuous random variables, expectation and variance
Lejla Batina
Institute for Computing and Information Sciences, Digital Security
Version: autumn 2013
Wiskunde 1
Outline
Continuous Random Variables and Probability Distributions
Random variable
Definition
Let S be a sample space. A random variable is a real-valued function defined on S, f : S → ℝ. A random variable that takes on a finite or countably infinite number of values is called a discrete random variable; otherwise we have a non-discrete or continuous random variable.
Probability distribution
Definition (Discrete probability distributions)
Let X be a discrete random variable taking values x_1, x_2, ..., to which we assign probabilities P(X = x_k) = f(x_k), k = 1, 2, .... The probability function (or probability distribution) is given by P(X = x) = f(x): for x = x_k we get f(x_k), and for x ≠ x_k we set f(x) = 0.
f(x) is a probability function if:
1. f(x) ≥ 0 for all x.
2. Σ_x f(x) = 1.
Distribution functions
Definition
For a random variable X, the (cumulative) distribution function is defined by F(x) = P(X ≤ x) for x ∈ ℝ.
The distribution function F(x) has the following properties:
F(x) ≤ F(y) if x ≤ y.
lim_{x→−∞} F(x) = 0 and lim_{x→+∞} F(x) = 1.
lim_{h→0⁺} F(x + h) = F(x) for all x, i.e. F(x) is continuous from the right.
Distribution functions for discrete random variables
Let X be a discrete random variable which takes on the values x_1, x_2, ..., x_n. Then:
F(x) = P(X ≤ x) = Σ_{u ≤ x} f(u) = Σ_{x_i ≤ x} p_i, or in more detail:
F(x) = 0 for −∞ < x < x_1,
F(x) = f(x_1) for x_1 ≤ x < x_2,
F(x) = f(x_1) + f(x_2) for x_2 ≤ x < x_3,
...
F(x) = f(x_1) + f(x_2) + ... + f(x_n) for x_n ≤ x < ∞.
Example
A coin is tossed twice, so the sample space is S = {HH, HT, TH, TT}; let X be the number of heads.
x      0    1    2
f(x)  1/4  1/2  1/4
The distribution function F(x):
F(x) = 0 for −∞ < x < 0,
F(x) = 1/4 for 0 ≤ x < 1,
F(x) = 3/4 for 1 ≤ x < 2,
F(x) = 1 for 2 ≤ x < ∞.
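The coin-toss example can be reproduced mechanically: enumerate the sample space, build the probability function, and sum it to get the step-shaped distribution function. This is an illustrative sketch (the names `pmf` and `F` are ours, not from the slides):

```python
from fractions import Fraction
from itertools import product

# Sample space of two coin tosses; X counts the heads.
outcomes = list(product("HT", repeat=2))  # HH, HT, TH, TT
pmf = {}
for outcome in outcomes:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, len(outcomes))

def F(x):
    """Cumulative distribution function F(x) = P(X <= x)."""
    return sum(p for k, p in pmf.items() if k <= x)

print(pmf)                   # {2: 1/4, 1: 1/2, 0: 1/4}
print(F(0.5), F(1.5), F(2))  # 1/4, 3/4, 1
```

Evaluating `F` between the jump points shows the piecewise-constant behaviour of the distribution function of a discrete random variable.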
Distribution functions for non-discrete random variables
Definition (Distribution function for continuous random variables)
A non-discrete random variable is continuous if its distribution function F can be represented as
F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(u) du for −∞ < x < ∞,
where the function f has the following properties:
f(x) ≥ 0,
∫_{−∞}^{∞} f(x) dx = 1.
f is called the probability density function of the random variable X.
It is evident that f(x) = dF(x)/dx and F(a) := P(X ≤ a) = ∫_{−∞}^{a} f(x) dx.
P(x_1 < X ≤ x_2) = F(x_2) − F(x_1) = ∫_{x_1}^{x_2} f(x) dx, so the area under f(x) over the interval (x_1, x_2) represents the probability that the random variable X lies in that interval.
Some properties
Theorem
1. F(+∞) = 1 and F(−∞) = 0. Proof: F(+∞) = P(X ≤ +∞) = P(S) = 1.
2. F is a non-decreasing function of x, so x_1 < x_2 ⇒ F(x_1) ≤ F(x_2).
3. If F(x_0) = 0, then F(x) = 0 for all x ≤ x_0.
4. P(X > x) = 1 − F(x).
5. F(x) is continuous from the right, so F(x⁺) = F(x).
6. P(x_1 < X ≤ x_2) = F(x_2) − F(x_1).
Uniform distribution
X is said to be uniformly distributed on (a, b), −∞ < a < b < ∞, if its density function is
f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise.
The distribution function of X is given by
F(x) = 0 for x < a,
F(x) = (x − a)/(b − a) for a ≤ x ≤ b,
F(x) = 1 for x > b,
since F(x) = ∫_{a}^{x} f(u) du = ∫_{a}^{x} 1/(b − a) du = (x − a)/(b − a).
Also, ∫_{−∞}^{∞} f(x) dx = ∫_{a}^{b} 1/(b − a) dx = (b − a)/(b − a) = 1.
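The closed-form distribution function derived above can be checked empirically: the fraction of uniform samples falling below x should approach (x − a)/(b − a). A small sketch (function name `uniform_cdf` is ours):

```python
import random

# Closed-form CDF of the uniform distribution on (a, b), as derived above.
def uniform_cdf(x, a, b):
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

# Monte Carlo check: the empirical CDF of uniform samples
# should agree with (x - a)/(b - a).
random.seed(0)
a, b = 2.0, 5.0
samples = [random.uniform(a, b) for _ in range(100_000)]
x = 3.5
empirical = sum(s <= x for s in samples) / len(samples)
print(uniform_cdf(x, a, b), empirical)  # both close to 0.5
```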
Normal (Gaussian) distribution
It is one of the most commonly used probability distributions in applications. When an experiment is repeated many times, the random variable representing the average (mean) of the outcomes tends to a normal distribution as the number of experiments becomes large. This fact is known as the central limit theorem, and it underlies many statistical techniques. Many physical quantities follow this distribution, e.g. heights and weights. It is also often used in the social sciences and for grades, measurement errors, etc.
Normal distribution: definition
Definition
We say that X is a normal or Gaussian random variable with parameters µ and σ (and we write X ~ N(µ, σ)) if its density function is given by
f(x) = (1/(σ√(2π))) e^{−(x − µ)²/(2σ²)},
where µ and σ are the mean and standard deviation. The distribution function is then given by
F(x) = P(X ≤ x) = (1/(σ√(2π))) ∫_{−∞}^{x} e^{−(v − µ)²/(2σ²)} dv.
The integral cannot be computed in closed form, so we use tables of cumulative probabilities for a special normal distribution to calculate the probabilities.
The normal distribution (figure source: Wikipedia)
The parameter µ determines the location of (the axis of symmetry of) the distribution, while σ determines the width of the curve.
Standardizing the normal distribution
N(0, 1) is often called the standard normal distribution.
Theorem
If X is a normal random variable with mean µ and standard deviation σ, then Z = (X − µ)/σ is a standard normal random variable.
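The theorem can be illustrated numerically: drawing samples from N(µ, σ) and applying Z = (X − µ)/σ should give samples with mean close to 0 and standard deviation close to 1. A quick sketch with arbitrarily chosen µ and σ:

```python
import random
import statistics

# Standardizing normal samples: Z = (X - mu) / sigma should have
# mean approximately 0 and standard deviation approximately 1.
random.seed(1)
mu, sigma = 10.0, 3.0
xs = [random.gauss(mu, sigma) for _ in range(100_000)]
zs = [(x - mu) / sigma for x in xs]
print(statistics.mean(zs))   # close to 0
print(statistics.stdev(zs))  # close to 1
```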
Mathematical Expectation
Definition
For a discrete random variable X with possible values x_1, x_2, ..., x_n, the expectation of X is defined as
E(X) = x_1 P(X = x_1) + x_2 P(X = x_2) + ... + x_n P(X = x_n) = Σ_{j=1}^{n} x_j P(X = x_j).
As a special case, if all the probabilities are equal, we get
E(X) = (x_1 + x_2 + ... + x_n)/n,
which is called the arithmetic mean of x_1, x_2, ..., x_n.
For a continuous random variable X with density function f(x), the expectation of X is defined as
E(X) = ∫_{−∞}^{∞} x f(x) dx,
provided that the integral converges.
History 1/2
The problem of points, also called the problem of division of the stakes:
Consider a game of chance with two players who have equal chances of winning each round. The players contribute equally to a prize pot, and agree in advance that the first player to have won a certain number of rounds will collect the entire prize. What happens if the game is interrupted by external circumstances before either player has won? How does one then divide the pot fairly?
History 2/2
It is expected that the division should depend somehow on the number of rounds won by each player, such that a player who is close to winning gets a larger part of the pot. But the problem is not merely one of calculation; it also requires explaining what a fair division is.
Examples
Example
In a lottery there are 200 prizes of 5 euros, 20 of 25 euros, and 5 of 100 euros. Assuming that 10000 tickets will be issued and sold, what is a fair price to pay for a ticket?
Let X be the random variable denoting the amount of money won by a ticket.
x          5      25     100     0
P(X = x)  0.02  0.002  0.0005  0.9775
P(X = 5) = 200/10000 = 0.02, P(X = 25) = 20/10000 = 0.002, P(X = 100) = 5/10000 = 0.0005.
E(X) = 5 · 0.02 + 25 · 0.002 + 100 · 0.0005 + 0 = 0.2, so a ticket should cost 20 cents.
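The lottery computation is a direct application of the definition of expectation; exact fractions avoid any rounding. A sketch of the calculation above:

```python
from fractions import Fraction

# Lottery from the example: 10000 tickets, prizes as given.
tickets = 10_000
prize_counts = {5: 200, 25: 20, 100: 5}
pmf = {amount: Fraction(count, tickets) for amount, count in prize_counts.items()}
pmf[0] = 1 - sum(pmf.values())  # losing tickets: 9775/10000

# E(X) = sum of x * P(X = x) over all values x.
expectation = sum(x * p for x, p in pmf.items())
print(expectation)  # 1/5, i.e. 20 cents
```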
Example (Cauchy distribution)
Let X be a random variable with density function f(x) = C/(x² + 1), −∞ < x < ∞.
Find the value of C, and find the probability that X² lies between 1/3 and 1.
The condition is ∫_{−∞}^{∞} f(x) dx = 1. Then we have
∫_{−∞}^{∞} f(x) dx = ∫_{−∞}^{∞} C dx/(x² + 1) = C lim_{B→∞} ∫_{−B}^{B} dx/(x² + 1) = C lim_{B→∞} [arctan B − arctan(−B)] = C[π/2 − (−π/2)] = Cπ, so C = 1/π.
Now 1/3 ≤ x² ≤ 1 means −1 ≤ x ≤ −√3/3 or √3/3 ≤ x ≤ 1. Then
P(1/3 ≤ X² ≤ 1) = P(−1 ≤ X ≤ −√3/3) + P(√3/3 ≤ X ≤ 1) = (1/π) ∫_{−1}^{−√3/3} dx/(x² + 1) + (1/π) ∫_{√3/3}^{1} dx/(x² + 1) = (2/π)(arctan 1 − arctan(√3/3)) = (2/π)(π/4 − π/6) = 1/6.
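Since the antiderivative of 1/(x² + 1) is arctan x, the probability in the example is a difference of arctans, which is easy to verify numerically. A sketch (names `f` and `prob` are ours):

```python
import math

# Cauchy density with C = 1/pi, as derived above.
def f(x):
    return 1.0 / (math.pi * (x * x + 1.0))

# The antiderivative of f is arctan(x)/pi, so interval
# probabilities are differences of arctans.
def prob(lo, hi):
    return (math.atan(hi) - math.atan(lo)) / math.pi

s = math.sqrt(3) / 3  # = 1/sqrt(3)
p = prob(-1.0, -s) + prob(s, 1.0)
print(p, 1 / 6)  # both 0.1666...
```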
Example
A continuous random variable X has probability density function
f(x) = 2e^{−2x} for x > 0, and f(x) = 0 for x ≤ 0.
Find E(X).
E(X) = ∫_{−∞}^{∞} x f(x) dx = ∫_{0}^{∞} 2x e^{−2x} dx = 2 ∫_{0}^{∞} x e^{−2x} dx = 2I.
Integration by parts:
I = ∫ x e^{−2x} dx = [u = x, du = dx, dv = e^{−2x} dx, v = −(1/2) e^{−2x}] = −(x/2) e^{−2x} + (1/2) ∫ e^{−2x} dx = −(x/2) e^{−2x} − (1/4) e^{−2x}.
E(X) = 2[−(x/2) e^{−2x} − (1/4) e^{−2x}]_{0}^{∞} = 2[0 − 0 + 0 + 1/4] = 1/2.
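The result E(X) = 1/2 can be cross-checked by integrating x·f(x) numerically; the density decays so fast that truncating the integral at x = 50 loses essentially nothing. A sketch using a simple midpoint rule:

```python
import math

# Density f(x) = 2 e^{-2x} for x > 0, zero otherwise.
def f(x):
    return 2.0 * math.exp(-2.0 * x) if x > 0 else 0.0

# Numerical E(X) = integral of x * f(x) dx via the midpoint rule;
# the tail beyond x = 50 is negligible.
n, hi = 200_000, 50.0
h = hi / n
expectation = sum((i + 0.5) * h * f((i + 0.5) * h) * h for i in range(n))
print(expectation)  # close to 0.5
```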
Some properties
For random variables X and Y, we have:
E(X + Y) = E(X) + E(Y)
E(αX) = αE(X), α ∈ ℝ
If X and Y are independent random variables, then E(XY) = E(X) · E(Y).
Variance and standard deviation
Definition (Variance)
Let X be a random variable with mean µ. Then the value
σ² = Var(X) = E[(X − µ)²] = Σ_{j=1}^{n} (x_j − µ)² f(x_j)
represents the average squared deviation of X from its mean. This value is called the variance of the random variable X.
In the special case where all the probabilities are equal, we have
σ² = [(x_1 − µ)² + (x_2 − µ)² + ... + (x_n − µ)²]/n.
For a continuous variable X with density function f(x):
σ² = ∫_{−∞}^{∞} (x − µ)² f(x) dx.
The value σ = √(E[(X − µ)²]) is called the standard deviation of X.
The variance is a measure of the dispersion or scatter of the values of the random variable around the mean µ.
Theorems on Variance
Theorem
1. Var(X) = 0 ⟺ X = C, C ∈ ℝ (X is constant).
2. Var(αX) = α² Var(X)
3. Var(X + C) = Var(X)
4. Var(X) = E(X²) − E(X)²
5. If X and Y are independent random variables, then Var(X + Y) = Var(X) + Var(Y).
Proof of 4:
Var(X) = E[(X − µ)²] = E[X² − 2µX + µ²] = E[X²] − 2µE[X] + µ² = E[X²] − 2µ² + µ² = E[X²] − E[X]².
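Theorem 4 is easy to verify on a concrete discrete distribution, e.g. the coin-toss variable X (number of heads in two tosses) from the earlier example; both ways of computing the variance must agree. A sketch:

```python
from fractions import Fraction

# Check Var(X) = E(X^2) - E(X)^2 on the coin-toss distribution
# (X = number of heads in two tosses).
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mu = sum(x * p for x, p in pmf.items())
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())    # E[(X - mu)^2]
var_alt = sum(x * x * p for x, p in pmf.items()) - mu ** 2  # E[X^2] - E[X]^2
print(mu, var_def, var_alt)  # 1, 1/2, 1/2
```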
Expectation and variance of the uniform distribution
Example
µ = ∫_{−∞}^{∞} x f(x) dx = ∫_{a}^{b} x dx/(b − a) = (1/(b − a)) [x²/2]_{a}^{b} = (a + b)/2.
E(X²) = ∫_{−∞}^{∞} x² f(x) dx = ∫_{a}^{b} x² dx/(b − a) = (1/(3(b − a))) [x³]_{a}^{b} = (b³ − a³)/(3(b − a)) = (a² + ab + b²)/3.
Then it follows:
Var(X) = E(X²) − E(X)² = (a² + ab + b²)/3 − (1/4)(a² + 2ab + b²) = (a − b)²/12.
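The formulas E(X) = (a + b)/2 and Var(X) = (b − a)²/12 can be checked against sample statistics of uniform draws. A Monte Carlo sketch with arbitrarily chosen a and b:

```python
import random
import statistics

# Monte Carlo check of E(X) = (a + b)/2 and Var(X) = (b - a)^2 / 12
# for X uniform on (a, b).
random.seed(2)
a, b = 1.0, 7.0
xs = [random.uniform(a, b) for _ in range(200_000)]
print(statistics.mean(xs), (a + b) / 2)            # both close to 4
print(statistics.variance(xs), (b - a) ** 2 / 12)  # both close to 3
```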
Example
Let Z = (X − µ)/σ. Find the expectation and variance of Z.
E(Z) = E((X − µ)/σ) = (1/σ) E(X − µ) = (1/σ)[E(X) − µ] = (1/σ)[E(X) − E(X)] = 0, since E(X) = µ.
Var(Z) = Var((X − µ)/σ) = (1/σ²) E[(X − µ)²] = 1, since E[(X − µ)²] = σ².