A Journey Beyond Normality

Department of Mathematics & Statistics Indian Institute of Technology Kanpur November 17, 2014

Outline: 1 Few Famous Quotations 2 3 4 5 6 7

Quotation 1: "Normality is a myth; there never was, and there never will be a normal distribution." Roy C. Geary (1947; Biometrika, vol. 34, 248)

Quotation 2: "Everybody believes in the exponential law of errors [i.e. the normal distribution], the experimenters, because they think it can be proved by mathematicians; and the mathematicians, because they believe that it has been established by observations." E. T. Whittaker and G. Robinson (1967)

Quotation 3: "... the statistician knows ... that in nature there never was a normal distribution, there never was a straight line, yet with normal and linear assumptions, known to be false, he can often derive results which match, to a useful approximation, those found in the real world." George E. P. Box (1976, Journal of the American Statistical Association, vol. 71, 791-799)

Definition: A random variable $X$ is said to be normally distributed with mean $\mu$ and variance $\sigma^2$ if the probability density function of $X$ is, for $-\infty < \mu < \infty$ and $\sigma > 0$,

$$f(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty.$$

Definition: In other words, for any $a < b$,

$$P(a < X < b) = \int_a^b f(x;\mu,\sigma)\,dx.$$

[Figure: The probability density function of the normal distribution]

How did it start? It all started with gambling.

Gambling Question: A 17th-century gambler, the Chevalier de Méré, asked Pascal for an explanation of his unexpected losses in gambling.

Two Famous Questions:
1. What is the probability of having at least one 1 in four rolls of a die? $(6^4 - 5^4)/6^4 > 0.5$.
2. What is the probability of having at least one double 1 in 24 rolls of two dice? $(36^{24} - 35^{24})/36^{24} < 0.5$.
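Both answers follow from the complement rule: the chance of at least one success in n independent trials, each succeeding with probability q, is 1 − (1 − q)^n. A minimal sketch in Python, using exact fractions; the function name `at_least_once` is illustrative and not part of the original slides:

```python
from fractions import Fraction

def at_least_once(q, trials):
    """Probability of at least one success in `trials` independent attempts,
    each succeeding with probability q."""
    return 1 - (1 - q) ** trials

# At least one 1 in four rolls of a single die: (6^4 - 5^4) / 6^4.
p_single = at_least_once(Fraction(1, 6), 4)
# At least one double 1 in 24 rolls of two dice: (36^24 - 35^24) / 36^24.
p_double = at_least_once(Fraction(1, 36), 24)

print(float(p_single))   # slightly above 0.5
print(float(p_double))   # slightly below 0.5
```

The first bet is favourable and the second is not, which is consistent with the losses the gambler reported.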

Binomial Sum: The famous correspondence between Pascal and Fermat began in 1654, and they were mainly interested in calculating the following binomial sum:

$$\sum_{k=i}^{j} \binom{n}{k} p^k (1-p)^{n-k}.$$

The problem is not difficult when $n$ is small.

Binomial Sum: Within a few years the same sum arose in a sociological study, where the following computation was necessary, with $n = 11{,}429$, $i = 5745$, $j = 6128$:

$$\sum_{k=i}^{j} \binom{n}{k} p^k (1-p)^{n-k}.$$

Original Problem: The problem is to test the hypothesis that male and female births are equally likely against the actual births in London over the 82 years 1629-1710. The relative number of male births varies from a low of 7765/15,448 = 0.5027 in 1703 to a high of 4748/8855 = 0.5362 in 1661. Here 11,429 is the average annual number of births in London over those 82 years, and 5745 and 6128 are the two limits obtained by applying these ratios to 11,429.

Solution: Using the recurrence relation

$$\binom{n}{x+1} = \binom{n}{x}\,\frac{n-x}{x+1}$$

and some involved rational approximation, it was obtained that

$$P(5745 \le X \le 6128 \mid p = 1/2) = \sum_{i=5745}^{6128} \binom{11{,}429}{i}\left(\frac{1}{2}\right)^{11{,}429} \approx 0.292.$$
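With arbitrary-precision integers the same recurrence can be carried out exactly, without any rational approximation. The sketch below is only an illustration (the function name `binomial_interval_prob` is an assumption, not from the slides). The printed value should be close to the 0.292 quoted above, though it need not agree in every digit, since that figure was obtained by hand approximation.

```python
def binomial_interval_prob(n, lo, hi):
    """P(lo <= X <= hi) for X ~ Binomial(n, 1/2), using exact integers.

    Coefficients are built with the recurrence
    C(n, x + 1) = C(n, x) * (n - x) / (x + 1), starting from C(n, 0) = 1.
    """
    coeff = 1          # C(n, 0)
    total = 0
    for x in range(hi + 1):
        if x >= lo:
            total += coeff
        # C(n, x) * (n - x) is always divisible by (x + 1), so // is exact.
        coeff = coeff * (n - x) // (x + 1)
    # True division of Python integers returns a correctly rounded float.
    return total / 2 ** n

prob = binomial_interval_prob(11_429, 5_745, 6_128)
print(round(prob, 4))
```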

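Breakthrough: De Moivre began the search for this approximation in 1721, and in 1733 he proved that

$$\binom{n}{\frac{n}{2}+x}\left(\frac{1}{2}\right)^{n} \approx \frac{2}{\sqrt{2\pi n}}\, e^{-2x^2/n}$$

and

$$\sum_{|x - n/2| \le a} \binom{n}{x}\left(\frac{1}{2}\right)^{n} \approx \frac{4}{\sqrt{2\pi}} \int_0^{a/\sqrt{n}} e^{-2y^2}\,dy.$$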

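Normal Approximation: Eventually, using the second approximation, one gets

$$\sum_{k=i}^{j} \binom{n}{k} p^k (1-p)^{n-k} \approx \Phi\!\left(\frac{j - np}{\sqrt{np(1-p)}}\right) - \Phi\!\left(\frac{i - np}{\sqrt{np(1-p)}}\right),$$

where

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-x^2/2}\,dx.$$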
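To see how close the approximation is, the two sides can be compared numerically for the London birth figures. This is a hedged sketch: the helper names are illustrative, the binomial side is evaluated in floating point through log-gamma rather than exactly, and no continuity correction is applied because the formula above has none.

```python
import math
from statistics import NormalDist

def binomial_sum(n, p, i, j):
    """Sum of C(n, k) p^k (1 - p)^(n - k) for k = i..j, evaluated via logs."""
    log_p, log_q = math.log(p), math.log(1 - p)
    total = 0.0
    for k in range(i, j + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1)
                    - math.lgamma(n - k + 1) + k * log_p + (n - k) * log_q)
        total += math.exp(log_term)
    return total

def normal_approx(n, p, i, j):
    """Phi((j - np)/s) - Phi((i - np)/s) with s = sqrt(np(1 - p))."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    Phi = NormalDist().cdf
    return Phi((j - mean) / sd) - Phi((i - mean) / sd)

n, p, i, j = 11_429, 0.5, 5_745, 6_128
print(binomial_sum(n, p, i, j), normal_approx(n, p, i, j))
```

The two numbers should agree to roughly two decimal places; the remaining gap comes mostly from the discreteness of the binomial.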

Error Modeling: Gauss (1809) made the following assumptions and deduced the normal distribution as an error distribution:
1. Small errors are more likely than large errors.
2. For any real number $\epsilon$, the likelihoods of errors of magnitudes $\epsilon$ and $-\epsilon$ are equal.
3. In the presence of several measurements of the same quantity, the most likely value of the quantity being measured is their average.

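Central Limit Theorem: Lindeberg-Lévy CLT: Suppose $\{X_1, X_2, \ldots\}$ is a sequence of independent identically distributed random variables with mean $\mu$ and variance $\sigma^2 < \infty$. Then, as $n \to \infty$,

$$\frac{\sqrt{n}}{\sigma}\left(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\right) \to N(0,1) \quad \text{in distribution.}$$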
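The statement can be checked by simulation. The sketch below assumes Exponential(1) summands (mean 1, standard deviation 1), a choice made only for illustration because that distribution is clearly non-normal; the seed, sample sizes, and function name are likewise arbitrary.

```python
import math
import random

def standardized_mean(rng, n, mu=1.0, sigma=1.0):
    """sqrt(n) * (sample mean - mu) / sigma for n Exponential(1) draws."""
    sample_mean = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return math.sqrt(n) * (sample_mean - mu) / sigma

rng = random.Random(2014)           # fixed seed for reproducibility
z = [standardized_mean(rng, n=200) for _ in range(5000)]

# Under the N(0, 1) limit about 95% of values fall inside (-1.96, 1.96).
inside = sum(abs(v) < 1.96 for v in z) / len(z)
print(inside)
```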

Drawbacks: What will happen if the data indicate that the parent distribution
1. is not symmetric?
2. is heavy-tailed?
3. is not unimodal?

Not Symmetric: The support of the corresponding random variable is the whole real line and the probability density function does not satisfy $f(\mu - x) = f(\mu + x)$.

Heavy Tail: For large $x$,

$$P(X > x) > O(1 - \Phi(x)) \approx O\!\left(e^{-x^2/2}/x\right).$$

Multimodal: [Figure: example of a multimodal probability density function]

Goal:
1. Generate a non-symmetric class of distributions which have support on the whole real line.
2. The normal distribution is a special member.
3. It should not have too many parameters.

Construction: Suppose $X$ and $Y$ are two independent standard normal random variables, and $\lambda$ is any real number. Then

$$P(X < \lambda Y) = P(X - \lambda Y < 0) = \frac{1}{2},$$

as $X - \lambda Y$ is a normal random variable with mean 0 and variance $1 + \lambda^2$.

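Construction: On the other hand,

$$P(X < \lambda Y) = \int_{-\infty}^{\infty} \Phi(\lambda y)\,\phi(y)\,dy,$$

where

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \quad \text{and} \quad \Phi(x) = \int_{-\infty}^{x} \phi(u)\,du.$$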

Construction: Therefore,

$$\frac{1}{2} = \int_{-\infty}^{\infty} \Phi(\lambda y)\,\phi(y)\,dy.$$

Since $\Phi(\lambda y)\,\phi(y) \ge 0$, the function

$$f(x;\lambda) = 2\,\phi(x)\,\Phi(\lambda x)$$

is a proper probability density function. It is called the skew-normal probability density function with parameter $\lambda$, and we denote it by SN($\lambda$).
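The claim that f(x; λ) integrates to one for every λ can be checked numerically. A minimal sketch, assuming a plain trapezoidal rule on [−10, 10] is accurate enough (the tails beyond that range are negligible); the helper names are illustrative:

```python
from statistics import NormalDist

_STD = NormalDist()   # standard normal: pdf is phi, cdf is Phi

def skew_normal_pdf(x, lam):
    """f(x; lambda) = 2 * phi(x) * Phi(lambda * x)."""
    return 2.0 * _STD.pdf(x) * _STD.cdf(lam * x)

def integrate(f, a, b, steps=20_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    inner = sum(f(a + k * h) for k in range(1, steps))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))

for lam in (-5.0, 0.0, 2.0, 7.0):
    area = integrate(lambda x: skew_normal_pdf(x, lam), -10.0, 10.0)
    print(lam, round(area, 6))
```

Each printed area should be 1 up to numerical error, whatever the sign or size of λ.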

[Figure: Shapes of the SN($\lambda$) PDF, with curves labeled λ = 5 and λ = 7]

Some Properties:
1. The SN(0) density is the N(0,1) density.
2. As $\lambda \to \infty$, $f(x;\lambda) \to \sqrt{2/\pi}\; e^{-x^2/2}$ for $x > 0$.
3. If $Z$ is a SN($\lambda$) random variable, then $-Z$ is a SN($-\lambda$) random variable.
4. The PDF of a SN($\lambda$) random variable is unimodal.
5. If $Z$ is SN($\lambda$), then $Z^2$ is $\chi^2_1$.

Moment Generating Function and Moments: Based on the following well-known result: if $U$ is a N(0,1) random variable, then for any $h$ and $k$,

$$E(\Phi(hU + k)) = \Phi\!\left(\frac{k}{\sqrt{1+h^2}}\right),$$

the moment generating function of a SN($\lambda$) random variable $Z$ can be obtained as

$$M(t) = E(e^{tZ}) = 2\,e^{t^2/2}\,\Phi(\delta t),$$

where $\delta = \lambda/\sqrt{1+\lambda^2}$.

Moments: From the moment generating function, different moments can be easily obtained. For example,

$$E(Z) = b\delta \quad \text{and} \quad V(Z) = 1 - (b\delta)^2,$$

where $b = \sqrt{2/\pi}$.
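These moments can be checked by simulation. The construction of the skew-normal density suggests a simple sampler: draw independent standard normal pairs (X, Y) and keep Y whenever X < λY, since the conditional density of Y given that event is φ(y)Φ(λy)/(1/2). The sketch below uses this rejection step; the function name, seed, and sample size are illustrative assumptions.

```python
import math
import random

def sample_skew_normal(rng, lam):
    """Draw from SN(lam): generate independent N(0, 1) pairs (X, Y) and
    return Y from the first pair with X < lam * Y."""
    while True:
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if x < lam * y:
            return y

lam = 3.0
rng = random.Random(1985)
draws = [sample_skew_normal(rng, lam) for _ in range(50_000)]

mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
second_moment = sum(d * d for d in draws) / len(draws)

b = math.sqrt(2.0 / math.pi)
delta = lam / math.sqrt(1.0 + lam ** 2)
print(mean, b * delta)               # sample mean vs. b * delta
print(var, 1.0 - (b * delta) ** 2)   # sample variance vs. 1 - (b * delta)^2
print(second_moment)                 # near 1, consistent with Z^2 being chi-square(1)
```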

Data Analysis: For data analysis purposes, the three-parameter skew-normal distribution can be easily defined with the following probability density function:

$$f(x;\mu,\sigma,\lambda) = \frac{2}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\Phi\!\left(\frac{\lambda(x-\mu)}{\sigma}\right).$$

Maximum Likelihood Estimators: Based on a random sample $X_1, X_2, \ldots, X_n$, the problem is to estimate the unknown parameters $\mu$, $\sigma$ and $\lambda$. We can obtain those estimates by maximizing the likelihood function

$$L(\mu,\sigma,\lambda) = \prod_{i=1}^{n} f(x_i;\mu,\sigma,\lambda).$$
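In practice one maximizes the logarithm of L. The sketch below is a deliberately simplified illustration, not the estimation procedure of the talk: it simulates data from SN(3) with µ = 0 and σ = 1, treats µ and σ as known, and maximizes the log-likelihood over a coarse grid of λ values. A real analysis would optimize over all three parameters with a numerical optimizer. All function names are hypothetical.

```python
import math
import random

def log_phi(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def log_Phi(z):
    """Log of the standard normal cdf; the floor avoids log(0) far in the tail."""
    return math.log(max(0.5 * math.erfc(-z / math.sqrt(2.0)), 1e-300))

def loglik(data, mu, sigma, lam):
    """log L(mu, sigma, lam) for the three-parameter skew-normal density."""
    total = 0.0
    for x in data:
        z = (x - mu) / sigma
        total += math.log(2.0 / sigma) + log_phi(z) + log_Phi(lam * z)
    return total

def sample_sn(rng, lam):
    """Rejection step from the construction: keep Y when X < lam * Y."""
    while True:
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if x < lam * y:
            return y

rng = random.Random(7)
data = [sample_sn(rng, 3.0) for _ in range(3000)]

# Simplification: mu = 0 and sigma = 1 are held at their true values.
grid = [k * 0.25 for k in range(-20, 41)]          # lambda from -5 to 10
lam_hat = max(grid, key=lambda lam: loglik(data, 0.0, 1.0, lam))
print(lam_hat)   # expected to land near the true value 3
```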

Definition: The random variable $X$ is said to have a power normal distribution with parameter $\alpha > 0$ if the cumulative distribution function of $X$ can be written as

$$F(x) = P(X \le x) = [\Phi(x)]^{\alpha}, \qquad -\infty < x < \infty.$$

Definition: Observe that $F(x)$ is a proper distribution function:
1. $F(x)$ is a non-decreasing function on $(-\infty, \infty)$.
2. $F(x)$ is a right-continuous function.
3. $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.

Probability Density Function: The corresponding probability density function can be written as

$$f(x;\alpha) = \alpha\,\phi(x)\,[\Phi(x)]^{\alpha-1}, \qquad -\infty < x < \infty.$$
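Because the distribution function has the closed form [Φ(x)]^α, it can be inverted directly: if U is uniform on (0, 1), then Φ⁻¹(U^{1/α}) has the power normal distribution. The sketch below illustrates the density and this inverse-cdf sampler; the function names, seed, and parameter value are assumptions made for the example.

```python
import random
from statistics import NormalDist

_STD = NormalDist()   # standard normal: pdf is phi, cdf is Phi

def power_normal_cdf(x, alpha):
    """F(x) = Phi(x) ** alpha."""
    return _STD.cdf(x) ** alpha

def power_normal_pdf(x, alpha):
    """f(x; alpha) = alpha * phi(x) * Phi(x) ** (alpha - 1).
    Not guarded against Phi(x) underflowing to 0 in the extreme lower tail."""
    return alpha * _STD.pdf(x) * _STD.cdf(x) ** (alpha - 1.0)

def sample_power_normal(rng, alpha):
    """Inverse-cdf sampling: solve Phi(x) ** alpha = u for x."""
    while True:
        p = rng.random() ** (1.0 / alpha)
        if 0.0 < p < 1.0:            # inv_cdf requires a value strictly inside (0, 1)
            return _STD.inv_cdf(p)

alpha = 2.5
rng = random.Random(42)
draws = [sample_power_normal(rng, alpha) for _ in range(20_000)]

# Empirical P(X <= 0) against the theoretical value Phi(0) ** alpha.
below_zero = sum(d <= 0.0 for d in draws) / len(draws)
print(below_zero, power_normal_cdf(0.0, alpha))
```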

[Figure: Shapes of the power normal PDF]

Some Important Observations:
1. The normal distribution is a special case.
2. The probability density function is always unimodal.
3. It can be both positively and negatively skewed.
4. When $\alpha$ is an integer, different moments can be expressed in explicit forms.

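Data Analysis: For data analysis purposes, the three-parameter power normal density function can be easily defined as follows, for $-\infty < \mu < \infty$, $\sigma > 0$ and $\alpha > 0$:

$$f(x;\mu,\sigma,\alpha) = \frac{\alpha}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\left[\Phi\!\left(\frac{x-\mu}{\sigma}\right)\right]^{\alpha-1}.$$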

Maximum Likelihood Estimator: The maximum likelihood estimators of the unknown parameters can be obtained by maximizing

$$L(\mu,\sigma,\alpha) = \prod_{i=1}^{n} f(x_i;\mu,\sigma,\alpha).$$

Definition: Suppose $N$ is a geometric random variable with the following probability mass function for $0 < p \le 1$ (with the convention $0^0 = 1$):

$$P(N = k) = p(1-p)^{k-1}, \qquad k = 1, 2, \ldots$$

Definition: $X_1, X_2, \ldots$ is a sequence of independent identically distributed normal random variables with mean $\mu$ and variance $\sigma^2$, independent of $N$. Consider the following random variable:

$$X = \sum_{i=1}^{N} X_i.$$

Therefore, $X$ is a random sum of independent identically distributed normal random variables. We say that $X$ has the geometric skew normal distribution with parameters $p$, $\mu$ and $\sigma$. Notation: GSN($\mu$, $\sigma$, $p$).

Joint Cumulative Distribution Function: The joint cumulative distribution function of $X$ and $N$ becomes

$$P(X \le x, N \le n) = \sum_{k=1}^{n} P(X \le x, N = k) = \sum_{k=1}^{n} P(X \le x \mid N = k)\,P(N = k) = p \sum_{k=1}^{n} \Phi\!\left(\frac{x - k\mu}{\sigma\sqrt{k}}\right)(1-p)^{k-1}.$$

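Cumulative Distribution Function and Probability Density Function: The cumulative distribution function of $X$ becomes

$$F(x) = P(X \le x) = P(X \le x, N < \infty) = p \sum_{k=1}^{\infty} \Phi\!\left(\frac{x - k\mu}{\sigma\sqrt{k}}\right)(1-p)^{k-1}.$$

Hence the PDF of $X$ becomes

$$f(x) = \frac{d}{dx}F(x) = p \sum_{k=1}^{\infty} \frac{1}{\sigma\sqrt{k}}\,\phi\!\left(\frac{x - k\mu}{\sigma\sqrt{k}}\right)(1-p)^{k-1}.$$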
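Since the weights p(1 − p)^{k−1} decay geometrically, the infinite series can be evaluated by truncation. A minimal sketch, assuming terms may be dropped once their weight falls below a tolerance `tol`; the function names, tolerance, and parameter values are illustrative and not part of the slides.

```python
import math
from statistics import NormalDist

_STD = NormalDist()   # standard normal, used for phi

def gsn_pdf(x, mu, sigma, p, tol=1e-12):
    """Truncated series for the GSN(mu, sigma, p) density:
    sum over k of p (1 - p)^(k - 1) * phi((x - k mu) / (sigma sqrt k)) / (sigma sqrt k)."""
    total = 0.0
    weight = p               # p * (1 - p)^(k - 1), starting at k = 1
    k = 1
    while weight > tol:
        scale = sigma * math.sqrt(k)
        total += weight * _STD.pdf((x - k * mu) / scale) / scale
        weight *= 1.0 - p
        k += 1
    return total

def integrate(f, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    inner = sum(f(a + i * h) for i in range(1, steps))
    return h * (0.5 * f(a) + inner + 0.5 * f(b))

area = integrate(lambda x: gsn_pdf(x, 1.0, 1.0, 0.5), -30.0, 80.0, 4400)
print(round(area, 6))   # should be 1 up to truncation and quadrature error
```

With p = 1 the loop keeps only the k = 1 term, so the function returns the N(µ, σ²) density, matching the special case noted in the talk.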

Some Basic Properties:
1. The normal distribution can be obtained as a special case.
2. It can be both positively and negatively skewed.
3. It can have heavy tails.
4. It can also be multimodal.

[Figure: Shapes of the GSN PDF for several parameter choices (four plots), including curves labeled p = 0.1, p = 0.2 and p = 0.5]

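Moment Generating Function:

$$M_X(t) = E\,e^{tX} = E\left[E\left(e^{tX} \mid N\right)\right] = E\left[e^{N\mu t + \sigma^2 N t^2/2}\right] = \frac{p\,e^{\mu t + \sigma^2 t^2/2}}{1 - (1-p)\,e^{\mu t + \sigma^2 t^2/2}}.$$

Immediately,

$$E(X) = \frac{\mu}{p} \quad \text{and} \quad V(X) = \frac{\sigma^2 p + \mu^2 (1-p)}{p^2}.$$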
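The mean and variance can be checked by simulating the random sum directly. The sketch below draws N by counting Bernoulli trials up to the first success and then uses the fact that, given N = n, the sum is N(nµ, nσ²). The function name, seed, sample size, and parameter values are illustrative assumptions.

```python
import math
import random

def sample_gsn(rng, mu, sigma, p):
    """One draw from GSN(mu, sigma, p)."""
    # N ~ Geometric(p) on {1, 2, ...}: number of trials up to the first success.
    n = 1
    while rng.random() >= p:
        n += 1
    # Given N = n, the sum of n iid N(mu, sigma^2) terms is N(n mu, n sigma^2).
    return rng.gauss(n * mu, sigma * math.sqrt(n))

mu, sigma, p = 1.0, 1.0, 0.5
rng = random.Random(2014)
draws = [sample_gsn(rng, mu, sigma, p) for _ in range(100_000)]

mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)

print(mean, mu / p)                                      # sample vs. mu / p
print(var, (sigma**2 * p + mu**2 * (1 - p)) / p**2)      # sample vs. formula
```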

Interesting Theoretical Properties:
1. It is infinitely divisible.
2. It is geometric stable.

Definition: The natural multivariate generalization becomes

$$X = \sum_{i=1}^{N} X_i.$$

Here the $X_i$'s are independent identically distributed multivariate normal random vectors.

[Figure: Shapes of the joint PDF (three surface plots)]

Estimation: Even in the multivariate case, the maximum likelihood estimators of the unknown parameters can be obtained quite efficiently by using the EM algorithm.

References:
1. A. Azzalini (1985), A class of distributions which includes the normal ones, Scandinavian Journal of Statistics, vol. 12, 171-178.
2. Saul Stahl (2006), The evolution of the normal distribution, Mathematics Magazine, vol. 79, no. 2, 96-113.

Thank You