Maura
Department of Economics and Finance, Università Tor Vergata
Outline
1. Distribution Functions, Quantiles and Modes of a Distribution
2. Example
3. Example
4. Distributions
Distribution Functions, Quantiles and Modes of a Distribution
A random variable assumes numerical values that are determined by the outcome of an experiment.
A random variable is a measurable function from Ω to R, X : Ω → R.
If X is a measurable function, then for every B ∈ B, X⁻¹(B) ∈ F, where B is the collection of Borel measurable sets.
Discrete random variable: its possible values can be counted or listed (e.g. the number of defective units in a batch of 20).
Continuous random variable: may assume any numerical value in one or more intervals (e.g. the waiting time for a credit card authorization, the interest rate charged on a business loan).
Examples
Discrete:
W = number of beers a randomly selected student drank last night, w = 0, 1, 2, ...
X = number of aspirin a randomly selected student took this morning, x = 0, 1, 2, ...
Y = number of children in a family, y = 0, 1, 2, 3, ...
Y = number of children in a family with children, y = 1, 2, 3, ...
Examples
Continuous:
The income in a year for a family.
The amount of oil imported into the U.S. in a particular month.
The change in the price of a share of IBM common stock in a month.
The time that elapses between the installation of a new computer and its failure.
The percentage of impurity in a batch of chemicals.
Discrete Probability Distribution
The probability distribution function (pdf), p(x), of a discrete random variable X gives the probability that X takes the value x, as a function of x. That is,
p(x) = P(X = x), for all x.
Let X be a discrete random variable with probability distribution function p(x). Then:
i. p(x) ≥ 0 for all x;
ii. the individual probabilities sum to 1, that is, Σᵢ p(xᵢ) = 1;
iii. p(x) is nonzero for at most countably many x.
Cumulative Distribution Function
The cumulative probability function, F(x), of a random variable X expresses the probability that X does not exceed the value x, as a function of x. That is,
F(x) = P(X ≤ x) = Σ_{i : xᵢ ≤ x} p(xᵢ),
where the sum runs over all values xᵢ not exceeding x.
Cumulative Distribution Functions
Definition: The cumulative distribution function, or cdf (or just the distribution function), of a random variable X, denoted by F_X(x), is defined by
F_X(x) = P(X ≤ x), for all x.
Theorem: The function F_X(x) is a cdf if and only if the following conditions hold:
i. F_X is nondecreasing: if x₁ < x₂ then F_X(x₁) ≤ F_X(x₂);
ii. lim_{x → −∞} F_X(x) = 0;
iii. lim_{x → +∞} F_X(x) = 1;
iv. F_X(x) is right-continuous.
Most famous discrete distributions
Several problems boil down to calculating the distribution p(x) of some random variable. A few distributions tend to arise in many different contexts and applications, and they are building blocks for more complex problems.
Common distributions: Bernoulli, Geometric, Discrete Uniform, Poisson, Binomial, Pascal, Hypergeometric.
Continuous Distributions
It is not meaningful to talk about the probability of a continuous random variable assuming a particular value, since P(X = a) = 0. Instead, we talk about the probability that a continuous random variable assumes a value within a given interval. The probability that the random variable takes a value in the interval from x₁ to x₂ is defined to be the area under the graph of the probability density function between x₁ and x₂.
Probability Density Function
The probability density function, or pdf, f_X(x), of a continuous random variable X is a function such that:
1. f(x) ≥ 0;
2. ∫_{−∞}^{+∞} f(x) dx = 1;
3. ∫_a^b f(x) dx = P(a < X < b).
Remark: P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b).
Cumulative Distribution Function
Remark: A random variable X is continuous if F_X(x) is a continuous function of x. A random variable X is discrete if F_X(x) is a step function of x.
Theorem: The random variables X and Y are identically distributed if and only if the corresponding cdfs are equal at every x, i.e. F_X(x) = F_Y(x) for all x.
Most famous continuous distributions
As in the discrete case, some continuous random variables are very common and deserve to be looked at carefully.
Common distributions: Uniform, Exponential, Gamma, Normal, Cauchy and Pareto.
Quantiles of a Distribution
Let X be an r.v. with d.f. F and consider a number p with 0 < p < 1. A p-th quantile of the r.v. X, or of its d.f. F, is a number, denoted by x_p, with the following property:
P(X ≤ x_p) ≥ p and P(X ≥ x_p) ≥ 1 − p.
For p = 0.25 we get a quartile of X (or of its d.f.), and for p = 0.5 we get a median of X (or of its d.f.).
Quantiles of a Distribution: Example
Example: Let X be an r.v. distributed as U(0, 2) and let p = 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80 and 0.90. Determine the respective quantiles x_0.10, x_0.20, x_0.30, x_0.40, x_0.50, x_0.60, x_0.70, x_0.80 and x_0.90.
Quantiles of a Distribution: Example
F(x) = x/2, for 0 ≤ x ≤ 2.
F(x_p) = p iff x_p/2 = p, i.e. x_p = 2p.
Hence x_0.1 = 0.2, x_0.2 = 0.4, x_0.5 = 1, x_0.7 = 1.4, x_0.8 = 1.6.
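The closed-form quantiles x_p = 2p can be cross-checked numerically. A minimal sketch in Python; the helper name `uniform_quantile` is our own, not from the slides:

```python
import numpy as np

# Hypothetical helper: inverse CDF of the uniform distribution on [a, b]
def uniform_quantile(p, a=0.0, b=2.0):
    """F(x) = (x - a)/(b - a), so x_p = a + p*(b - a)."""
    return a + p * (b - a)

# For U(0, 2) this reduces to x_p = 2p, matching the slide
quantiles = {p: uniform_quantile(p) for p in (0.1, 0.2, 0.5, 0.7, 0.8)}

# Empirical cross-check: the sample median of many U(0, 2) draws
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 2.0, size=100_000)
empirical_median = np.quantile(samples, 0.5)
```

The analytic quantiles agree with the sample quantiles up to Monte Carlo error.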
Expected value
What to expect from a random variable? The probability distribution function, or the density, of a random variable X provides the most comprehensive information about its possible values. Nonetheless, sometimes we want to summarize a random variable with a single reference number. Several reasonable candidates:
The number which is most likely, i.e. which maximizes the density. Is it unique? This is the mode.
The number x for which P(X < x) = P(X > x) = 0.5. Does it exist? Is it unique? This is the median.
The weighted average of the values of X. Does it exist? This is the expectation.
Expected value for a Discrete Random Variable
An r.v. X on a discrete Ω has finite expectation if:
Σ_{ω ∈ Ω} |X(ω)| P(ω) = Σ_x |x| p_X(x) < ∞.
In this case, the expectation of X is defined as:
E(X) = Σ_{ω ∈ Ω} X(ω) P(ω) = Σ_x x p_X(x).
Not all random variables have finite expectation.
Expected value for a Continuous Random Variable
An r.v. X with density f_X has finite expectation if:
∫ |t| f_X(t) dt < ∞.
In this case, the expectation of X is defined as:
E(X) = ∫ t f_X(t) dt.
Variance
If X² has finite expectation, we define the second moment of X as E(X²) and the variance of X as:
Var(X) = E[(X − E(X))²] = E(X²) − (E(X))².
Expectation is a single reference point, but it does not account for the dispersion of values. On the contrary, variance is insensitive to the absolute position of the random variable, but it measures its dispersion.
Properties of Expected Value and Variance
Linearity: if X has finite expectation and a, b ∈ R, then aX + b also has finite expectation and:
E(aX + b) = aE(X) + b.
Variance of an affine transformation: if X has finite variance and a, b ∈ R, then aX + b also has finite variance and:
Var(aX + b) = a² Var(X).
Monotonicity: if X(ω) ≤ Y(ω) for all ω and both have finite expectation, then:
E(X) ≤ E(Y).
More on the Expected Value
What is the constant c which minimizes the mean squared distance from a given random variable X? We have:
E[(X − c)²] = Var(X − c) + (E(X − c))² = Var(X) + (E(X) − c)²,
hence the minimum is obtained by choosing c = E(X). Expectation is the best constant predictor of a random variable.
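This minimization can be illustrated empirically: over a grid of candidate constants c, the sample mean squared distance is smallest at the sample mean. A sketch (the sample and grid are arbitrary choices of ours):

```python
import numpy as np

# Draw an arbitrary sample and scan candidate constants c
rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=50_000)

def mean_sq_dist(c):
    """Sample analogue of E[(X - c)^2]."""
    return np.mean((x - c) ** 2)

# Grid of candidates centered on the sample mean, step 0.01
cs = np.linspace(x.mean() - 1.0, x.mean() + 1.0, 201)
best_c = cs[np.argmin([mean_sq_dist(c) for c in cs])]
```

Up to the grid resolution, `best_c` coincides with the sample mean, as the identity predicts.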
Moments
Definition: The moment of order r of a random variable X is defined as:
E(X^r) = Σ_x x^r p(x)   (discrete case),
E(X^r) = ∫ x^r f(x) dx   (continuous case).
Definition: The moment generating function of a random variable X with cumulative distribution function F_X(x), denoted by M_X(t), is
M_X(t) = E(e^{tX}),
provided that the expectation exists for t in some neighborhood of 0.
Moments
Theorem: If X has moment generating function M_X(t), then
E(X^n) = M_X^{(n)}(0),
where we define
M_X^{(n)}(0) = (d^n/dt^n) M_X(t) |_{t=0}.
That is, the n-th moment equals the n-th derivative of M_X(t) evaluated at t = 0.
Moments
Proof: Assuming that we can differentiate under the integral sign, we have:
(d/dt) M_X(t) = (d/dt) ∫ e^{tx} f_X(x) dx
= ∫ (d/dt) e^{tx} f_X(x) dx
= ∫ x e^{tx} f_X(x) dx
= E(X e^{tX}).
Evaluating at t = 0 gives E(X); higher derivatives follow in the same way.
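The theorem can be checked numerically on a distribution whose MGF we know in closed form. A sketch using the Bernoulli MGF M(t) = πe^t + (1 − π), whose derivatives at 0 should recover E(X) = π and E(X²) = π (the finite-difference step h is our own choice):

```python
import math

# MGF of Bernoulli(pi): M(t) = pi*e^t + (1 - pi)
pi_ = 0.3
def M(t):
    return pi_ * math.exp(t) + (1 - pi_)

h = 1e-5
# Central differences approximating M'(0) and M''(0)
first_moment = (M(h) - M(-h)) / (2 * h)
second_moment = (M(h) - 2 * M(0.0) + M(-h)) / h**2
```

Both numerical derivatives land on π = 0.3, as E(X^n) = M^(n)(0) predicts (for a 0/1 variable every moment equals π).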
Example: Pareto Distribution
A random variable X has a Pareto distribution with parameters α and x₀ if
f(x; α, x₀) = α x₀^α / x^{α+1}, for x ≥ x₀.
The cumulative distribution function is
F(x) = 1 − (x₀/x)^α, for x ≥ x₀.
Example: Pareto Distribution
A random variable X has a Pareto distribution with parameters α and x₀:
f(x) = α x₀^α / x^{α+1}, x ≥ x₀.
Verify the following:
E(X) = α x₀ / (α − 1), for α > 1;
Var(X) = α x₀² / ((α − 1)² (α − 2)), for α > 2.
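These moment formulas can be cross-checked by Monte Carlo, sampling from the Pareto by inverting its cdf F(x) = 1 − (x₀/x)^α. A sketch under our own parameter choice α = 5, x₀ = 1 (α > 2 so both moments exist):

```python
import numpy as np

# Inverse-transform sampling: u = F(x)  =>  x = x0 * (1 - u)^(-1/alpha)
alpha, x0 = 5.0, 1.0
rng = np.random.default_rng(2)
u = rng.uniform(size=200_000)
x = x0 * (1.0 - u) ** (-1.0 / alpha)

# Theoretical moments from the slide
mean_theory = alpha * x0 / (alpha - 1)                         # 1.25
var_theory = alpha * x0**2 / ((alpha - 1) ** 2 * (alpha - 2))  # ~0.104
```

The sample mean and sample variance agree with the formulas up to Monte Carlo error.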
Example: Pareto Distribution Example Pareto originally used this distribution to describe the allocation of wealth among individuals since it seemed to show rather well the way that a larger portion of the wealth of any society is owned by a smaller percentage of the people in that society. This idea is sometimes expressed more simply as the Pareto principle or the 80-20 rule which says that 20% of the population owns 80% of the wealth.
Example: Pareto Distribution
It can be seen from the probability density function that the probability, or fraction of the population, f(x) owning a small amount of wealth per person x is rather high, and then decreases steadily as wealth increases. This distribution is not limited to describing wealth or income, but applies to many situations in which an equilibrium is found in the distribution of the small to the large.
[Figure: Pareto distribution, probability density function]
[Figure: Pareto distribution, cumulative distribution function F(x) for a = 1, 2, 3, plotted for x from 0 to 10]
Distribution of a function of an r.v.
If X is a random variable with cdf F_X(x), then any function of X, say g(X), is also a random variable.
Theorem: Let X have cdf F_X(x), let Y = g(X), where g is a monotone function, and define X = {x : f_X(x) > 0} and Y = {y : y = g(x) for some x ∈ X}. Then:
If g is an increasing function on X, F_Y(y) = F_X(g⁻¹(y)) for y ∈ Y.
If g is a decreasing function on X and X is a continuous random variable, F_Y(y) = 1 − F_X(g⁻¹(y)) for y ∈ Y.
Distribution of a function of an r.v.
Theorem: Let X have pdf f_X(x), let Y = g(X), where g is a monotone function. Then the probability density function of Y is given by:
f_Y(y) = f_X(g⁻¹(y)) |d/dy g⁻¹(y)|, for y ∈ Y,
f_Y(y) = 0 otherwise.
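The change-of-variable formula can be illustrated with g(x) = e^x applied to a standard normal: g⁻¹(y) = log y and |d/dy g⁻¹(y)| = 1/y, so f_Y(y) = f_X(log y)/y (the lognormal density). A sketch comparing this formula with a histogram estimate, under our own choice of example:

```python
import numpy as np

# Y = exp(X) with X ~ N(0, 1)
rng = np.random.default_rng(5)
xs = rng.standard_normal(500_000)
ys = np.exp(xs)

def f_Y(y):
    """Density of Y from the theorem: f_X(log y) * |1/y|."""
    return np.exp(-np.log(y) ** 2 / 2) / (np.sqrt(2 * np.pi) * y)

# Histogram estimate of the density of Y near y = 1
lo, hi = 0.95, 1.05
hist_est = np.mean((ys > lo) & (ys < hi)) / (hi - lo)
```

The histogram estimate matches f_Y(1) = 1/√(2π) up to sampling and binning error.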
Exercise
Exercise: Let X have a continuous, strictly increasing cumulative distribution function F_X(x) and define the random variable Y = F_X(X). Show that Y is uniformly distributed on (0, 1).
Proof: For 0 < y < 1,
P(Y ≤ y) = P(F_X(X) ≤ y) = P(F_X⁻¹(F_X(X)) ≤ F_X⁻¹(y)) = P(X ≤ F_X⁻¹(y)) = F_X(F_X⁻¹(y)) = y.
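This probability integral transform is easy to see numerically: apply the exponential cdf F(x) = 1 − e^{−λx} to exponential draws and check that the result looks uniform on (0, 1). A sketch with our own choice λ = 2:

```python
import numpy as np

# X ~ Exp(lambda), then Y = F_X(X) should be U(0, 1)
lam = 2.0
rng = np.random.default_rng(3)
x = rng.exponential(scale=1.0 / lam, size=100_000)
y = 1.0 - np.exp(-lam * x)   # F_X evaluated at the sample

# If Y ~ U(0,1): mean 1/2, variance 1/12, and each decile holds ~10% of the mass
decile_counts, _ = np.histogram(y, bins=10, range=(0.0, 1.0))
```

The sample mean, variance, and decile counts all agree with the uniform distribution up to Monte Carlo error.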
The Bernoulli distribution
Definition: A random variable X is Bernoulli with parameter π ∈ [0, 1] if P(X = 0) = 1 − π and P(X = 1) = π.
A Bernoulli r.v. takes only the values 0 and 1. It models essentially all phenomena with only two possible outcomes (e.g. any biased coin).
Verify E(X) = π and Var(X) = π(1 − π).
Binomial distribution
Definition: A random variable X is Binomial with parameters n and π ∈ [0, 1] (shortly, X ~ Binomial(n, π)) if
P(X = x) = C(n, x) π^x (1 − π)^{n−x}, for x = 0, 1, ..., n,
where C(n, x) is the binomial coefficient. It represents the probability of x successes in n independent and identical experiments. Later: the relation with the Bernoulli distribution.
Verify E(X) = nπ and Var(X) = nπ(1 − π).
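The mean and variance can be verified by summing directly over the pmf. A sketch for our own choice n = 10, π = 0.3:

```python
import math

# Binomial(n, pi) pmf over all outcomes x = 0, ..., n
n, pi_ = 10, 0.3
pmf = [math.comb(n, x) * pi_**x * (1 - pi_)**(n - x) for x in range(n + 1)]

total = sum(pmf)                                      # should be 1
mean = sum(x * p for x, p in enumerate(pmf))          # n*pi = 3.0
second = sum(x**2 * p for x, p in enumerate(pmf))
var = second - mean**2                                # n*pi*(1-pi) = 2.1
```

The sums reproduce nπ and nπ(1 − π) exactly (up to floating-point rounding).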
Discrete Uniform distribution
Definition: A random variable X is discrete uniform with parameter n if:
P(X = k) = 1/n, for k = 1, ..., n.
Example: You roll a die. The distribution of the face shown is discrete uniform with parameter 6.
A discrete uniform with parameter 2 is, up to shifting its values to {0, 1}, a Bernoulli with parameter 1/2.
Verify E(X) = (n + 1)/2 and Var(X) = (n² − 1)/12.
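Both formulas can be checked by direct enumeration, e.g. for the die (n = 6):

```python
# Discrete uniform on {1, ..., n}: E(X) = (n+1)/2 and Var(X) = (n^2 - 1)/12
n = 6
values = range(1, n + 1)
mean = sum(values) / n                            # (n+1)/2 = 3.5
var = sum((k - mean) ** 2 for k in values) / n    # (n^2-1)/12 = 35/12
```

For the die this gives mean 3.5 and variance 35/12, matching the closed forms.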
Poisson Distribution
Definition: An r.v. X is Poisson with parameter λ > 0 (X ~ Poisson(λ)) if:
P(X = k) = e^{−λ} λ^k / k!, for k = 0, 1, ...
Where does this formula come from? Plug p = λ/n into the binomial density, then take the limit as n → ∞. The Poisson distribution is useful to model the number of occurrences of a rare event over many trials.
Example: the number of car accidents on a particular street in Rome in a given day. The probability that a given car has an accident is very small, but there are a lot of cars.
Verify E(X) = λ and Var(X) = λ.
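The limit argument can be made concrete: with p = λ/n, the Binomial(n, p) pmf approaches the Poisson(λ) pmf as n grows. A sketch for our own choice λ = 3, k = 2:

```python
import math

# Binomial(n, lambda/n) pmf at k, compared with the Poisson(lambda) pmf at k
lam, k = 3.0, 2

def binom_pmf(n):
    p = lam / n
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

poisson_pmf = math.exp(-lam) * lam**k / math.factorial(k)
approx = binom_pmf(1_000_000)   # large n, rare event
```

Already at n = 10^6 the two probabilities agree to several decimal places.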
Uniform Distribution
Definition: A random variable X is uniform on [a, b] if:
f_X(x) = 1/(b − a) if x ∈ [a, b], and 0 otherwise.
A uniform r.v. takes values only in [a, b]. The probability of an interval [c, d] (with a ≤ c ≤ d ≤ b) is proportional to its length (justifying the name "uniform"). When you ask your computer to generate random numbers, it usually generates uniform r.v.'s on [0, 1].
E(X) = (a + b)/2 and Var(X) = (b − a)²/12.
Exponential Distribution
Definition: A random variable X is exponential with parameter λ (shortly, X ~ Exp(λ)) if:
f_X(x) = λ exp(−λx) if x ≥ 0, and 0 otherwise.
The exponential distribution models an arrival time which can take any positive real value. The higher λ, the more the distribution is concentrated near zero.
Verify E(X) = 1/λ and Var(X) = 1/λ².
Gamma Distribution
Definition: A random variable X has the Gamma distribution with parameters (α, β), α > 0, β > 0 (shortly X ~ Γ(α, β)) if:
f_X(x) = x^{α−1} exp(−x/β) / (Γ(α) β^α) if x ≥ 0, and 0 otherwise.
Γ(α) is a normalizing constant defined by:
Γ(α) = ∫₀^∞ λ^α t^{α−1} exp(−λt) dt (for any λ > 0),
and it satisfies Γ(n) = (n − 1)! for n = 1, 2, ...
The parameter α is known as the shape parameter, since it most influences the peakedness of the distribution, while β is called the scale parameter, since most of its influence is on the spread of the distribution.
Verify E(X) = αβ and Var(X) = αβ².
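Both the factorial identity and the mean formula can be checked numerically. A sketch using Python's standard library, with our own parameter choice α = 2, β = 3 (`random.gammavariate` takes shape and scale in that order):

```python
import math
import random

# Gamma(n) = (n-1)! for positive integers; math.gamma is the Gamma function
gamma_5 = math.gamma(5)   # 4! = 24

# Monte Carlo check of E(X) = alpha*beta for X ~ Gamma(alpha, beta)
random.seed(4)
alpha, beta = 2.0, 3.0
samples = [random.gammavariate(alpha, beta) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)   # theory: alpha*beta = 6.0
```

The sample mean agrees with αβ up to Monte Carlo error.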
The Normal distribution
Definition: A random variable X has the Normal or Gaussian distribution with parameters (µ, σ²) (shortly X ~ N(µ, σ²)) if:
f_X(x) = (1/(√(2π) σ)) exp(−(x − µ)²/(2σ²)).
If X ~ N(0, 1), X is a standard normal.
If X ~ N(µ, σ²), then aX + b ~ N(aµ + b, a²σ²); in particular, (X − µ)/σ is a standard normal.
Use tables! Other properties later...
Exponential Family
A family of probability distribution functions is called an exponential family if it can be expressed as:
f(x|θ) = h(x) c(θ) exp( Σ_{i=1}^k ω_i(θ) t_i(x) ).
Here h(x) ≥ 0 and t_1(x), ..., t_k(x) are real-valued functions of the observation x (they cannot depend on θ), and c(θ) ≥ 0 and ω_1(θ), ..., ω_k(θ) are real-valued functions of the possibly vector-valued parameter θ (they cannot depend on x).
Many common distributions belong to this family: Gaussian, Poisson, Binomial, Exponential, Gamma and Beta. (PROOF TO BE DONE!)
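As a concrete instance, the Poisson pmf can be rewritten in exponential-family form with h(x) = 1/x!, c(λ) = e^{−λ}, ω(λ) = log λ and t(x) = x; a sketch verifying that the two factorizations coincide (the specific λ, x are our own choice):

```python
import math

# Poisson(lambda) pmf two ways:
# standard form:            e^{-lam} * lam^x / x!
# exponential-family form:  h(x) * c(lam) * exp(omega(lam) * t(x))
lam, x = 3.0, 4
standard = math.exp(-lam) * lam**x / math.factorial(x)
exp_family = (1 / math.factorial(x)) * math.exp(-lam) * math.exp(x * math.log(lam))
```

The two expressions agree up to floating-point rounding, as they must.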
Moment Generating Function: Bernoulli
Let X ~ Bernoulli(π):
M_X(t) = e^{t·0}(1 − π) + e^{t·1}π
M_X(t) = πe^t + (1 − π).
Moment Generating Function: Binomial
Let X ~ Binomial(n, π):
M_X(t) = Σ_{x=0}^n e^{tx} C(n, x) π^x (1 − π)^{n−x}
= Σ_{x=0}^n C(n, x) (e^t π)^x (1 − π)^{n−x}
M_X(t) = (πe^t + (1 − π))^n.
Remember the binomial theorem: Σ_{x=0}^n C(n, x) u^x v^{n−x} = (u + v)^n.
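The closed form can be checked against the defining sum E(e^{tX}) for particular values (n, π, t below are our own choice):

```python
import math

# Binomial MGF: closed form vs. the defining sum over the pmf
n, pi_, t = 8, 0.4, 0.7
closed_form = (pi_ * math.exp(t) + (1 - pi_)) ** n
direct_sum = sum(
    math.exp(t * x) * math.comb(n, x) * pi_**x * (1 - pi_)**(n - x)
    for x in range(n + 1)
)
```

The two agree up to floating-point rounding, which is exactly the binomial-theorem step of the derivation.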
Moment Generating Function: Gamma
Let X ~ Γ(α, β):
M_X(t) = ∫₀^∞ e^{tx} x^{α−1} exp(−x/β) / (Γ(α) β^α) dx
= ∫₀^∞ x^{α−1} exp(−x(1/β − t)) / (Γ(α) β^α) dx
= (1/(1 − βt))^α ∫₀^∞ x^{α−1} exp(−x (1 − βt)/β) / (Γ(α) (β/(1 − βt))^α) dx
M_X(t) = (1/(1 − βt))^α,
since the remaining integrand is the Γ(α, β/(1 − βt)) density and integrates to 1.
The moment generating function of the Gamma distribution exists only if t < 1/β.
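The Gamma MGF can also be checked by Monte Carlo, estimating E(e^{tX}) directly (α, β, t below are our own choice, with t < 1/β so the MGF exists):

```python
import math
import random

# Monte Carlo estimate of E[e^{tX}] for X ~ Gamma(alpha, beta)
random.seed(6)
alpha, beta, t = 2.0, 1.0, 0.25
samples = [random.gammavariate(alpha, beta) for _ in range(200_000)]
mc_mgf = sum(math.exp(t * x) for x in samples) / len(samples)

closed_form = (1.0 / (1.0 - beta * t)) ** alpha   # (1/0.75)^2
```

The Monte Carlo average agrees with (1/(1 − βt))^α up to sampling error.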
Moment Generating Function: Gaussian
Let X ~ N(µ, σ²):
M_X(t) = E(e^{tX}) = ∫ (1/(√(2π)σ)) exp(tx) exp(−(x − µ)²/(2σ²)) dx
= ∫ (1/(√(2π)σ)) exp(−(x² + µ² − 2xµ − 2σ²tx)/(2σ²)) dx
= exp((σ⁴t² + 2µσ²t)/(2σ²)) ∫ (1/(√(2π)σ)) exp(−(x − (µ + σ²t))²/(2σ²)) dx
= exp(σ²t²/2 + µt).
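The completed-square result can be verified by numerical integration of e^{tx} f_X(x) on a fine grid (µ, σ, t and the grid are our own choices; a plain Riemann sum suffices since the integrand decays like a Gaussian):

```python
import numpy as np

# Numerical check of M_X(t) = exp(mu*t + sigma^2 t^2 / 2)
mu, sigma, t = 1.0, 2.0, 0.5
center = mu + sigma**2 * t   # the integrand peaks here after completing the square
x = np.linspace(center - 12 * sigma, center + 12 * sigma, 200_001)
dx = x[1] - x[0]

pdf = np.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
numeric = np.sum(np.exp(t * x) * pdf) * dx
closed_form = np.exp(mu * t + sigma**2 * t**2 / 2)
```

The numerical integral matches the closed form to high accuracy, confirming the derivation above.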