Dr. Kempthorne, Spring 2016
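Definition

Let $X$ be a random variable/vector with sample space $\mathcal{X} \subset \mathbb{R}^q$ and probability model $P_\theta$. The class of probability models $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$ is a one-parameter exponential family if the density/pmf function $p(x \mid \theta)$ can be written:
\[ p(x \mid \theta) = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\} \]
where $h : \mathcal{X} \to \mathbb{R}$, $\eta : \Theta \to \mathbb{R}$, and $B : \Theta \to \mathbb{R}$.

Note: By the Factorization Theorem, $T(X)$ is sufficient for $\theta$: write $p(x \mid \theta) = h(x)\, g(T(x), \theta)$ and set $g(T(x), \theta) = \exp\{\eta(\theta) T(x) - B(\theta)\}$. $T(X)$ is the Natural Sufficient Statistic.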
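Examples

Poisson Distribution (1.6.1): $X \sim \text{Poisson}(\theta)$, where $E[X] = \theta$.
\[ p(x \mid \theta) = \frac{\theta^x e^{-\theta}}{x!}, \quad x = 0, 1, \ldots \]
\[ = \frac{1}{x!} \exp\{\log(\theta)\, x - \theta\} = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\} \]
where:
$h(x) = \frac{1}{x!}$
$\eta(\theta) = \log(\theta)$
$T(x) = x$
$B(\theta) = \theta$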
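The Poisson factorization can be checked numerically. The sketch below is illustrative only: the function names and the value $\theta = 2.5$ are assumptions, not part of the slides. It evaluates the pmf directly and through $h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$ with $B(\theta) = \theta$.

```python
import math

def poisson_pmf(x, theta):
    """Poisson pmf theta^x e^{-theta} / x!, computed directly."""
    return theta ** x * math.exp(-theta) / math.factorial(x)

def poisson_expfam(x, theta):
    """Same pmf written as h(x) exp{eta(theta) T(x) - B(theta)}."""
    h = 1.0 / math.factorial(x)   # h(x) = 1/x!
    eta = math.log(theta)         # eta(theta) = log(theta)
    T = x                         # T(x) = x
    B = theta                     # B(theta) = theta
    return h * math.exp(eta * T - B)

# theta = 2.5 is a hypothetical value chosen for illustration.
max_gap = max(abs(poisson_pmf(x, 2.5) - poisson_expfam(x, 2.5)) for x in range(20))
```

The two forms should agree up to floating-point rounding, so `max_gap` is expected to be negligible.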
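Examples

Binomial Distribution (1.6.2): $X \sim \text{Binomial}(\theta, n)$
\[ p(x \mid \theta) = \binom{n}{x} \theta^x (1 - \theta)^{n - x}, \quad x = 0, 1, \ldots, n \]
\[ = \binom{n}{x} \exp\left\{\log\left(\frac{\theta}{1 - \theta}\right) x + n \log(1 - \theta)\right\} = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\} \]
where:
$h(x) = \binom{n}{x}$
$\eta(\theta) = \log\left(\frac{\theta}{1 - \theta}\right)$
$T(x) = x$
$B(\theta) = -n \log(1 - \theta)$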
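Examples

Normal Distribution: $X \sim N(\mu, \sigma_0^2)$. (Known variance)
\[ p(x \mid \theta) = \frac{1}{\sqrt{2\pi\sigma_0^2}} \exp\left\{-\frac{1}{2\sigma_0^2}(x - \mu)^2\right\} \]
\[ = \left[\frac{1}{\sqrt{2\pi\sigma_0^2}} \exp\left\{-\frac{x^2}{2\sigma_0^2}\right\}\right] \exp\left\{\frac{\mu}{\sigma_0^2} x - \frac{\mu^2}{2\sigma_0^2}\right\} = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\} \]
where:
$h(x) = \frac{1}{\sqrt{2\pi\sigma_0^2}} \exp\left\{-\frac{x^2}{2\sigma_0^2}\right\}$
$\eta(\theta) = \frac{\mu}{\sigma_0^2}$
$T(x) = x$
$B(\theta) = \frac{\mu^2}{2\sigma_0^2}$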
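Examples

Normal Distribution: $X \sim N(\mu_0, \sigma^2)$. (Known mean)
\[ p(x \mid \theta) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\frac{1}{2\sigma^2}(x - \mu_0)^2\right\} \]
\[ = \left[\frac{1}{\sqrt{2\pi}}\right] \exp\left\{-\frac{1}{2\sigma^2}(x - \mu_0)^2 - \frac{1}{2}\log(\sigma^2)\right\} = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\} \]
where:
$h(x) = \frac{1}{\sqrt{2\pi}}$
$\eta(\theta) = -\frac{1}{2\sigma^2}$
$T(x) = (x - \mu_0)^2$
$B(\theta) = \frac{1}{2}\log(\sigma^2)$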
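Samples from One-Parameter Exponential Family Distribution

Consider a sample $X_1, \ldots, X_n$, where the $X_i$ are iid $P$ and $P \in \mathcal{P} = \{P_\theta, \theta \in \Theta\}$ is a one-parameter exponential family distribution with density function
\[ p(x \mid \theta) = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\}. \]
The sample $\mathbf{X} = (X_1, \ldots, X_n)$ is a random vector with density/pmf:
\[ p(\mathbf{x} \mid \theta) = \prod_{i=1}^n \left( h(x_i) \exp[\eta(\theta) T(x_i) - B(\theta)] \right) \]
\[ = \left[\prod_{i=1}^n h(x_i)\right] \exp\left[\eta(\theta) \sum_{i=1}^n T(x_i) - n B(\theta)\right] = h^*(\mathbf{x}) \exp\{\eta^*(\theta) T^*(\mathbf{x}) - B^*(\theta)\} \]
where:
$h^*(\mathbf{x}) = \prod_{i=1}^n h(x_i)$
$\eta^*(\theta) = \eta(\theta)$
$T^*(\mathbf{x}) = \sum_{i=1}^n T(x_i)$
$B^*(\theta) = n B(\theta)$

Note: The Sufficient Statistic $T^*$ is one-dimensional for all $n$.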
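One consequence of the factored sample density is that likelihood ratios depend on the data only through $\sum_i T(x_i)$. The sketch below illustrates this for the Poisson family; the two data sets and parameter values are hypothetical, chosen so that both samples share the same sum.

```python
import math

def poisson_loglik(sample, theta):
    """Log-likelihood of an iid Poisson(theta) sample, computed term by term."""
    return sum(x * math.log(theta) - theta - math.lgamma(x + 1) for x in sample)

def loglik_from_T(sample, theta):
    """Same quantity from log h*(x) + eta(theta) T*(x) - n B(theta)."""
    n = len(sample)
    log_h_star = -sum(math.lgamma(x + 1) for x in sample)  # log of prod 1/x_i!
    T_star = sum(sample)                                   # sum of T(x_i) = x_i
    return log_h_star + math.log(theta) * T_star - n * theta

a = [0, 3, 1, 4]   # hypothetical data, sum = 8
b = [2, 2, 2, 2]   # different data with the same T* = 8

# Log-likelihood ratio between theta = 3.0 and theta = 1.5 for each sample.
ratio_a = poisson_loglik(a, 3.0) - poisson_loglik(a, 1.5)
ratio_b = poisson_loglik(b, 3.0) - poisson_loglik(b, 1.5)
```

Because $h^*(\mathbf{x})$ cancels in the ratio, `ratio_a` and `ratio_b` should coincide even though the samples differ.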
Samples from One-Parameter Exponential Family Distribution

Theorem 1.6.1 Let $\{P_\theta\}$ be a one-parameter exponential family of discrete distributions with pmf function:
\[ p(x \mid \theta) = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\} \]
Then the family of distributions of the statistic $T(X)$ is a one-parameter exponential family of discrete distributions whose frequency functions are
\[ P_\theta(T(X) = t) = p(t \mid \theta) = h^*(t) \exp\{\eta(\theta) t - B(\theta)\} \]
where
\[ h^*(t) = \sum_{\{x : T(x) = t\}} h(x). \]
Proof: Immediate.
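As an illustration of Theorem 1.6.1, consider $n$ Bernoulli($\theta$) trials, for which $h(\mathbf{x}) = 1$, $T(\mathbf{x}) = \sum_i x_i$, and $B(\theta) = -n\log(1 - \theta)$. This example and the function names are assumptions added here, not taken from the slides. Enumerating the sample space gives $h^*(t)$ directly.

```python
import math
from itertools import product

def h_star_by_enumeration(n):
    """h*(t) = sum of h(x) over {x : T(x) = t}, for n Bernoulli trials with h(x) = 1."""
    counts = [0] * (n + 1)
    for x in product([0, 1], repeat=n):
        counts[sum(x)] += 1      # T(x) = x_1 + ... + x_n
    return counts

def pmf_of_T(t, n, theta):
    """P(T = t) = h*(t) exp{eta(theta) t - B(theta)} with B(theta) = -n log(1 - theta)."""
    eta = math.log(theta / (1 - theta))
    B = -n * math.log(1 - theta)
    return h_star_by_enumeration(n)[t] * math.exp(eta * t - B)
```

The enumerated $h^*(t)$ should reproduce the binomial coefficients, so `pmf_of_T` should match the Binomial pmf.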
Canonical Exponential Family

Re-parametrize, setting $\eta = \eta(\theta)$, the Natural Parameter.
The density has the form
\[ p(x, \eta) = h(x) \exp\{\eta T(x) - A(\eta)\} \]
The function $A(\eta)$ replaces $B(\theta)$ and is defined as the log of the normalization constant:
\[ A(\eta) = \log \int \cdots \int h(x) \exp\{\eta T(x)\}\, dx \quad \text{if } X \text{ is continuous} \]
or
\[ A(\eta) = \log \sum_{x \in \mathcal{X}} h(x) \exp\{\eta T(x)\} \quad \text{if } X \text{ is discrete} \]
The Natural Parameter Space: $\{\eta : \eta = \eta(\theta), \theta \in \Theta\} = \mathcal{E}$ (Later, Theorem 1.6.3 gives properties of $\mathcal{E}$).
$T(x)$ is the Natural Sufficient Statistic.
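Canonical Representation of Poisson Family

Poisson Distribution (1.6.1): $X \sim \text{Poisson}(\theta)$, where $E[X] = \theta$.
\[ p(x \mid \theta) = \frac{\theta^x e^{-\theta}}{x!}, \quad x = 0, 1, \ldots \]
\[ = \frac{1}{x!} \exp\{\log(\theta)\, x - \theta\} = h(x) \exp\{\eta(\theta) T(x) - B(\theta)\} \]
where:
$h(x) = \frac{1}{x!}$
$\eta(\theta) = \log(\theta)$
$T(x) = x$

Canonical Representation:
$\eta = \log(\theta)$
$A(\eta) = B(\theta) = \theta = e^\eta$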
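MGFs of Canonical Exponential Family Models

Theorem 1.6.2 Suppose $X$ is distributed according to a canonical exponential family, i.e., the density/pmf function is given by
\[ p(x \mid \eta) = h(x) \exp[\eta T(x) - A(\eta)], \quad \text{for } x \in \mathcal{X} \subset \mathbb{R}^q. \]
If $\eta$ is an interior point of $\mathcal{E}$, the natural parameter space, then:
The moment generating function of $T(X)$ exists and is given by
\[ M_T(s) = E[e^{sT(X)} \mid \eta] = \exp\{A(s + \eta) - A(\eta)\} \]
for $s$ in some neighborhood of 0.
$E[T(X) \mid \eta] = A'(\eta)$.
$Var[T(X) \mid \eta] = A''(\eta)$.

Proof:
\[ M_T(s) = E[e^{sT(X)} \mid \eta] = \int \cdots \int h(x)\, e^{(s + \eta) T(x) - A(\eta)}\, dx \]
\[ = \left[e^{A(s + \eta) - A(\eta)}\right] \int \cdots \int h(x)\, e^{(s + \eta) T(x) - A(s + \eta)}\, dx = \left[e^{A(s + \eta) - A(\eta)}\right] \times 1 \]
The last integral equals 1 because the integrand is the canonical density at natural parameter $s + \eta$, which lies in $\mathcal{E}$ for $s$ near 0. The remainder follows from properties of MGFs.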
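The MGF formula of Theorem 1.6.2 can be checked numerically for the canonical Poisson family, where $A(\eta) = e^\eta$. This is a minimal sketch: the truncation at 200 terms and the function names are assumptions made for illustration.

```python
import math

def A(eta):
    """Log-normalizer of the canonical Poisson family: A(eta) = e^eta."""
    return math.exp(eta)

def mgf_by_summation(s, eta, terms=200):
    """E[e^{sT(X)}] by summing e^{sx} p(x | eta) over x = 0, ..., terms - 1."""
    total = 0.0
    for x in range(terms):
        # log of h(x) exp{eta x - A(eta)} with h(x) = 1/x!
        log_p = eta * x - A(eta) - math.lgamma(x + 1)
        total += math.exp(s * x + log_p)
    return total

def mgf_by_theorem(s, eta):
    """Closed form exp{A(s + eta) - A(eta)} from Theorem 1.6.2."""
    return math.exp(A(s + eta) - A(eta))
```

For moderate $\theta = e^\eta$ and small $s$, the truncated sum and the closed form should agree to many decimal places; a larger `terms` would be needed for large $\theta$.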
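Moments of Canonical Exponential Family Distributions

Poisson Distribution: $A(\eta) = B(\theta) = \theta = e^\eta$.
$E(X \mid \theta) = A'(\eta) = e^\eta = \theta$.
$Var(X \mid \theta) = A''(\eta) = e^\eta = \theta$.

Binomial Distribution:
\[ p(x \mid \theta) = \binom{n}{x} \theta^x (1 - \theta)^{n - x} = h(x) \exp\left\{\log\left(\frac{\theta}{1 - \theta}\right) x + n \log(1 - \theta)\right\} = h(x) \exp\{\eta x - n \log(e^\eta + 1)\} \]
So $A(\eta) = n \log(e^\eta + 1)$, with $\eta = \log\left(\frac{\theta}{1 - \theta}\right)$.
\[ A'(\eta) = n \frac{e^\eta}{e^\eta + 1} = n\theta \]
\[ A''(\eta) = n \frac{e^\eta}{e^\eta + 1} - n \frac{e^\eta e^\eta}{(e^\eta + 1)^2} = n \left[\frac{e^\eta}{e^\eta + 1}\right]\left(1 - \frac{e^\eta}{e^\eta + 1}\right) = n\theta(1 - \theta) \]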
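The Binomial derivatives can be confirmed with finite differences of $A(\eta) = n\log(e^\eta + 1)$. The step size, helper names, and the values $n = 10$, $\theta = 0.3$ below are illustrative assumptions.

```python
import math

def A(eta, n):
    """Binomial log-normalizer A(eta) = n log(1 + e^eta)."""
    return n * math.log1p(math.exp(eta))

def central_diff(f, x, step=1e-4):
    """Central-difference estimate of f'(x)."""
    return (f(x + step) - f(x - step)) / (2 * step)

def second_diff(f, x, step=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + step) - 2 * f(x) + f(x - step)) / step ** 2

n, theta = 10, 0.3                      # hypothetical values
eta = math.log(theta / (1 - theta))     # natural parameter
mean_est = central_diff(lambda e: A(e, n), eta)   # compare with n * theta
var_est = second_diff(lambda e: A(e, n), eta)     # compare with n * theta * (1 - theta)
```

Up to discretization error, `mean_est` should be close to $n\theta$ and `var_est` close to $n\theta(1 - \theta)$.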
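Moments of the Gamma Distribution

$X_1, \ldots, X_n$ i.i.d. Gamma$(p, \lambda)$ distribution with density
\[ p(x \mid \lambda, p) = \frac{\lambda^p x^{p-1} e^{-\lambda x}}{\Gamma(p)}, \quad 0 < x < \infty \]
where
\[ \Gamma(p) = \int_0^\infty \lambda^p x^{p-1} e^{-\lambda x}\, dx. \]
\[ p(x \mid \lambda, p) = \left[\frac{x^{p-1}}{\Gamma(p)}\right] \exp\{-\lambda x + p \log(\lambda)\} = h(x) \exp\{\eta T(x) - A(\eta)\} \]
where
$\eta = -\lambda$
$T(x) = x$
$A(\eta) = -p \log(\lambda) = -p \log(-\eta)$
Thus
$E(X) = A'(\eta) = -p/\eta = p/\lambda$
$Var(X) = A''(\eta) = p/\eta^2 = p/\lambda^2$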
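A quick simulation can be compared against $E(X) = p/\lambda$ and $Var(X) = p/\lambda^2$. This is a sketch under stated assumptions: the sample size, seed, and parameter values are hypothetical, and Python's `random.gammavariate` is parametrized by shape and scale, so the rate $\lambda$ is passed as $1/\lambda$.

```python
import random

def gamma_sample_moments(p, lam, size=100_000, seed=0):
    """Sample mean and variance of Gamma(p, lam) draws, lam being a rate parameter."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so the rate lam enters as 1 / lam.
    draws = [rng.gammavariate(p, 1.0 / lam) for _ in range(size)]
    mean = sum(draws) / size
    var = sum((d - mean) ** 2 for d in draws) / size
    return mean, var

p, lam = 3.0, 2.0                 # hypothetical shape and rate
mean, var = gamma_sample_moments(p, lam)
```

With these values the sample moments should land near $p/\lambda = 1.5$ and $p/\lambda^2 = 0.75$, up to Monte Carlo error.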
Notes on Gamma Distribution

Gamma$(p = n/2, \lambda = 1/2)$ corresponds to the Chi-Squared distribution with $n$ degrees of freedom.
$p = 1$ corresponds to the Exponential Distribution.
For $p = 1/2$, $\Gamma(1/2) = \sqrt{\pi}$.
$\Gamma(p + 1) = p\,\Gamma(p)$ for $p > 0$; in particular $\Gamma(p + 1) = p!$ for positive integer $p$.
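Rayleigh Distribution

Sample $X_1, \ldots, X_n$ iid with density function
\[ p(x \mid \theta) = \frac{x}{\theta^2} \exp(-x^2 / 2\theta^2) = [x] \exp\left\{-\frac{1}{2\theta^2} x^2 - \log(\theta^2)\right\} = h(x) \exp\{\eta T(x) - A(\eta)\} \]
where
$\eta = -\frac{1}{2\theta^2}$
$T(X) = X^2$
$A(\eta) = \log(\theta^2) = \log\left(-\frac{1}{2\eta}\right) = -\log(-2\eta)$

By the mgf:
$E(X^2) = A'(\eta) = -\frac{2}{2\eta} = -\frac{1}{\eta} = 2\theta^2$
$Var(X^2) = A''(\eta) = +\frac{1}{\eta^2} = 4\theta^4$

For the $n$-sample $\mathbf{X} = (X_1, \ldots, X_n)$:
$T(\mathbf{X}) = \sum_{i=1}^n X_i^2$
$E[T(\mathbf{X})] = -n/\eta = 2n\theta^2$
$Var[T(\mathbf{X})] = n \frac{1}{\eta^2} = 4n\theta^4$

Note: $P(X \le x) = 1 - \exp\left\{-\frac{x^2}{2\theta^2}\right\}$ (Failure time model)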
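The failure-time CDF $P(X \le x) = 1 - \exp\{-x^2/(2\theta^2)\}$ can be inverted to simulate Rayleigh draws, which gives a rough check on $E(X^2) = 2\theta^2$ and $Var(X^2) = 4\theta^4$. The seed, sample size, and $\theta = 1.5$ are hypothetical choices for this sketch.

```python
import math
import random

def rayleigh_draw(theta, rng):
    """Invert P(X <= x) = 1 - exp{-x^2 / (2 theta^2)} at a uniform draw."""
    u = rng.random()
    return theta * math.sqrt(-2.0 * math.log(1.0 - u))

def squared_moments(theta, size=100_000, seed=1):
    """Sample mean and variance of T(X) = X^2."""
    rng = random.Random(seed)
    t = [rayleigh_draw(theta, rng) ** 2 for _ in range(size)]
    mean = sum(t) / size
    var = sum((v - mean) ** 2 for v in t) / size
    return mean, var

theta = 1.5                       # hypothetical scale
mean_T, var_T = squared_moments(theta)
```

Here the targets are $2\theta^2 = 4.5$ and $4\theta^4 = 20.25$; the simulated values should be close, within Monte Carlo error.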
Definition

$\{P_\theta, \theta \in \Theta\}$, $\Theta \subset \mathbb{R}^k$, is a k-parameter exponential family if the density/pmf function of $X \sim P_\theta$ is
\[ p(x \mid \theta) = h(x) \exp\left[\sum_{j=1}^k \eta_j(\theta) T_j(x) - B(\theta)\right], \]
where $x \in \mathcal{X} \subset \mathbb{R}^q$, and
$\eta_1, \ldots, \eta_k$ and $B$ are real-valued functions mapping $\Theta \to \mathbb{R}$.
$T_1, \ldots, T_k$ and $h$ are real-valued functions mapping $\mathbb{R}^q \to \mathbb{R}$.

Note: By the Factorization Theorem (Theorem 1.5.1): $\mathbf{T}(X) = (T_1(X), \ldots, T_k(X))^T$ is sufficient.
For a sample $X_1, \ldots, X_n$ iid $P_\theta$, the sample $\mathbf{X} = (X_1, \ldots, X_n)$ has a distribution in the k-parameter exponential family with natural sufficient statistic
\[ \mathbf{T}^{(n)} = \left(\sum_{i=1}^n T_1(X_i), \ldots, \sum_{i=1}^n T_k(X_i)\right) \]
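Examples

Example 1.6.5. Normal Family
$P_\theta = N(\mu, \sigma^2)$, with $\Theta = \mathbb{R} \times \mathbb{R}_+ = \{(\mu, \sigma^2)\}$ and density
\[ p(x \mid \theta) = \exp\left\{\frac{\mu}{\sigma^2} x - \frac{1}{2\sigma^2} x^2 - \frac{1}{2}\left(\frac{\mu^2}{\sigma^2} + \log(2\pi\sigma^2)\right)\right\} \]
a $k = 2$ multiparameter exponential family ($\mathcal{X} = \mathbb{R}^1$, $q = 1$) and
$\eta_1(\theta) = \frac{\mu}{\sigma^2}$ and $T_1(X) = X$
$\eta_2(\theta) = -\frac{1}{2\sigma^2}$ and $T_2(X) = X^2$
$B(\theta) = \frac{1}{2}\left(\frac{\mu^2}{\sigma^2} + \log(2\pi\sigma^2)\right)$
$h(x) = 1$

Note: For an $n$-sample $\mathbf{X} = (X_1, \ldots, X_n)$ the natural sufficient statistic is $\mathbf{T}(\mathbf{X}) = \left(\sum_{i=1}^n X_i, \sum_{i=1}^n X_i^2\right)$.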
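Canonical k-parameter Exponential Family

Corresponding to
\[ p(x \mid \theta) = h(x) \exp\left[\sum_{j=1}^k \eta_j(\theta) T_j(x) - B(\theta)\right], \]
consider
Natural Parameter: $\eta = (\eta_1, \ldots, \eta_k)^T$
Natural Sufficient Statistic: $\mathbf{T}(X) = (T_1(X), \ldots, T_k(X))^T$
Density function
\[ q(x \mid \eta) = h(x) \exp\{\mathbf{T}^T(x)\eta - A(\eta)\} \]
where
\[ A(\eta) = \log \int \cdots \int h(x) \exp\{\mathbf{T}^T(x)\eta\}\, dx \]
or
\[ A(\eta) = \log\left[\sum_{x \in \mathcal{X}} h(x) \exp\{\mathbf{T}^T(x)\eta\}\right] \]
Natural Parameter space: $\mathcal{E} = \{\eta \in \mathbb{R}^k : -\infty < A(\eta) < \infty\}$.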
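Canonical Exponential Family Examples (k > 1)

Example 1.6.5. Normal Family (continued)
$P_\theta = N(\mu, \sigma^2)$, with $\Theta = \mathbb{R} \times \mathbb{R}_+ = \{(\mu, \sigma^2)\}$ and density
\[ p(x \mid \theta) = \exp\left\{\frac{\mu}{\sigma^2} x - \frac{1}{2\sigma^2} x^2 - \frac{1}{2}\left(\frac{\mu^2}{\sigma^2} + \log(2\pi\sigma^2)\right)\right\} \]
$\eta_1(\theta) = \frac{\mu}{\sigma^2}$ and $T_1(X) = X$
$\eta_2(\theta) = -\frac{1}{2\sigma^2}$ and $T_2(X) = X^2$
$B(\theta) = \frac{1}{2}\left(\frac{\mu^2}{\sigma^2} + \log(2\pi\sigma^2)\right)$ and $h(x) = 1$

Canonical Exponential Density:
\[ q(x \mid \eta) = h(x) \exp\{\mathbf{T}^T(x)\eta - A(\eta)\} \]
$\mathbf{T}(x) = (x, x^2)^T = (T_1(x), T_2(x))^T$
$\eta = (\eta_1, \eta_2)^T = \left(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}\right)^T$
$A(\eta) = \frac{1}{2}\left[-\frac{\eta_1^2}{2\eta_2} + \log\left(\pi \frac{1}{|\eta_2|}\right)\right]$
$\mathcal{E} = \mathbb{R} \times \mathbb{R}_- = \{(\eta_1, \eta_2) : A(\eta) \text{ exists}\}$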
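The canonical log-normalizer for the Normal family can be sanity-checked numerically. The sketch below relies on the standard multiparameter analogue of Theorem 1.6.2, namely that the gradient of $A(\eta)$ gives $E[\mathbf{T}(X)]$; the slides shown here do not state that result, so treat it as an assumption of the example. The values $\mu = 1$, $\sigma^2 = 4$ are hypothetical.

```python
import math

def A(eta1, eta2):
    """Log-normalizer of the canonical N(mu, sigma^2) family; requires eta2 < 0."""
    return 0.5 * (-eta1 ** 2 / (2 * eta2) + math.log(math.pi / abs(eta2)))

def gradient(eta1, eta2, step=1e-6):
    """Central-difference gradient of A at (eta1, eta2)."""
    d1 = (A(eta1 + step, eta2) - A(eta1 - step, eta2)) / (2 * step)
    d2 = (A(eta1, eta2 + step) - A(eta1, eta2 - step)) / (2 * step)
    return d1, d2

mu, sigma2 = 1.0, 4.0                       # hypothetical parameter values
eta1, eta2 = mu / sigma2, -1.0 / (2 * sigma2)
mean_T1, mean_T2 = gradient(eta1, eta2)     # compare with E[X] and E[X^2]

# B(theta) in the original parametrization, for comparison with A(eta(theta)).
B_theta = 0.5 * (mu ** 2 / sigma2 + math.log(2 * math.pi * sigma2))
```

Under that assumption, `mean_T1` should be close to $\mu$ and `mean_T2` close to $\mu^2 + \sigma^2$, while `A(eta1, eta2)` should equal `B_theta`.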
Canonical Exponential Family Examples (k > 1)

Multinomial Distribution
$\mathbf{X} = (X_1, X_2, \ldots, X_q) \sim \text{Multinomial}(n, \theta = (\theta_1, \theta_2, \ldots, \theta_q))$
\[ p(x \mid \theta) = \frac{n!}{x_1! \cdots x_q!}\, \theta_1^{x_1} \theta_2^{x_2} \cdots \theta_q^{x_q} \]
where
$q$ is a given positive integer,
$\theta = (\theta_1, \ldots, \theta_q)$ : $\sum_{j=1}^q \theta_j = 1$,
$n$ is a given positive integer,
$\sum_{i=1}^q X_i = n$.

Notes:
What is $\Theta$?
What is the dimensionality of $\Theta$?
What is the Multinomial distribution when $q = 2$?
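Example: Multinomial Distribution

\[ p(x \mid \theta) = \frac{n!}{x_1! \cdots x_q!}\, \theta_1^{x_1} \theta_2^{x_2} \cdots \theta_q^{x_q} \]
\[ = \frac{n!}{x_1! \cdots x_q!} \exp\left\{\log(\theta_1) x_1 + \cdots + \log(\theta_{q-1}) x_{q-1} + \log\left(1 - \sum_{j=1}^{q-1} \theta_j\right)\left[n - \sum_{j=1}^{q-1} x_j\right]\right\} \]
\[ = h(x) \exp\left\{\sum_{j=1}^{q-1} \eta_j(\theta) T_j(x) - B(\theta)\right\} \]
where:
$h(x) = \frac{n!}{x_1! \cdots x_q!}$
$\eta(\theta) = (\eta_1(\theta), \eta_2(\theta), \ldots, \eta_{q-1}(\theta))$
$\eta_j(\theta) = \log\left(\theta_j / \left(1 - \sum_{j=1}^{q-1} \theta_j\right)\right)$, $j = 1, \ldots, q - 1$
$\mathbf{T}(x) = (X_1, X_2, \ldots, X_{q-1}) = (T_1(x), T_2(x), \ldots, T_{q-1}(x))$
$B(\theta) = -n \log\left(1 - \sum_{j=1}^{q-1} \theta_j\right)$

For the canonical exponential density:
$A(\eta) = +n \log\left(1 + \sum_{j=1}^{q-1} e^{\eta_j}\right)$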
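The Multinomial canonical form can be compared with the direct pmf. In this sketch the helper names and the test point are hypothetical; the code uses $\theta_q = 1 - \sum_{j<q}\theta_j$, so that $\eta_j = \log(\theta_j/\theta_q)$.

```python
import math

def multinomial_coef(x):
    """h(x) = n! / (x_1! ... x_q!) with n = sum of the counts."""
    coef = math.factorial(sum(x))
    for xj in x:
        coef //= math.factorial(xj)   # each partial quotient is an integer
    return coef

def multinomial_pmf(x, theta):
    """Direct pmf h(x) * prod theta_j^{x_j}."""
    return multinomial_coef(x) * math.prod(t ** xj for t, xj in zip(theta, x))

def multinomial_canonical(x, theta):
    """Same pmf as h(x) exp{sum_{j<q} eta_j x_j - A(eta)}."""
    n, q = sum(x), len(x)
    # eta_j = log(theta_j / theta_q), where theta_q = 1 - sum_{j<q} theta_j
    eta = [math.log(theta[j] / theta[q - 1]) for j in range(q - 1)]
    A = n * math.log(1.0 + sum(math.exp(e) for e in eta))
    # zip stops after the first q - 1 counts, matching T(x) = (x_1, ..., x_{q-1})
    linear = sum(e * xj for e, xj in zip(eta, x))
    return multinomial_coef(x) * math.exp(linear - A)
```

For example, `multinomial_pmf((2, 1, 3), (0.2, 0.3, 0.5))` and `multinomial_canonical((2, 1, 3), (0.2, 0.3, 0.5))` should agree up to rounding.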
Definition: Submodels

Consider a k-parameter exponential family $\{q(x \mid \eta);\ \eta \in \mathcal{E} \subset \mathbb{R}^k\}$. A Submodel is an exponential family defined by
\[ p(x \mid \theta) = q(x \mid \eta(\theta)) \]
where $\theta \in \Theta \subset \mathbb{R}^{k^*}$, $k^* \le k$, and $\eta : \Theta \to \mathbb{R}^k$.

Note: The submodel is specified by $\Theta$. The natural parameters corresponding to $\Theta$ are a subset of the natural parameter space:
\[ \mathcal{E}^* = \{\eta \in \mathcal{E} : \eta = \eta(\theta), \theta \in \Theta\}. \]

Example: $X$ is a discrete r.v. with sample space $\mathcal{X} = \{1, 2, \ldots, k\}$, and $X_1, X_2, \ldots, X_n$ are iid as $X$. Let $\mathcal{P}$ be the set of distributions for $\mathbf{X} = (X_1, \ldots, X_n)$, where the distribution of the $X_i$ is a member of any fixed collection of discrete distributions on $\mathcal{X}$. Then $\mathcal{P}$ is an exponential family (a subset of the Multinomial Distributions).
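Models from Affine Transformations: Case I

Consider $\mathcal{P}$, the class of distributions for a r.v. $X$ which is a canonical family generated by the natural sufficient statistic $\mathbf{T}(X)$, a $(k \times 1)$ vector-statistic, and $h(\cdot) : \mathcal{X} \to \mathbb{R}$.
A distribution in $\mathcal{P}$ has density/pmf function:
\[ p(x \mid \eta) = h(x) \exp\{\mathbf{T}^T(x)\eta - A(\eta)\} \]
where
\[ A(\eta) = \log\left[\sum_{x \in \mathcal{X}} h(x) \exp\{\mathbf{T}^T(x)\eta\}\right] \quad \text{or} \quad A(\eta) = \log\left[\int_{\mathcal{X}} h(x) \exp\{\mathbf{T}^T(x)\eta\}\, dx\right] \]
$M$: an affine transformation from $\mathbb{R}^k$ to $\mathbb{R}^{k^*}$ defined by
\[ M(\mathbf{T}) = M\mathbf{T} + b, \]
where $M$ ($k^* \times k$) and $b$ ($k^* \times 1$) are known constants.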
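Models from Affine Transformations (continued)

Consider $\mathcal{P}^*$, the class of distributions for a r.v. $X$ generated by the natural sufficient statistic
\[ M(\mathbf{T}(X)) = M\mathbf{T}(X) + b \]
Since the distribution of $X$ has density/pmf function
\[ p(x \mid \eta) = h(x) \exp\{\mathbf{T}^T(x)\eta - A(\eta)\} \]
we can write
\[ p(x \mid \eta^*) = h(x) \exp\{[M(\mathbf{T}(x))]^T \eta^* - A^*(\eta^*)\} \]
\[ = h(x) \exp\{[M\mathbf{T}(x) + b]^T \eta^* - A^*(\eta^*)\} \]
\[ = h(x) \exp\{\mathbf{T}^T(x)[M^T \eta^*] + b^T \eta^* - A^*(\eta^*)\} \]
\[ = h(x) \exp\{\mathbf{T}^T(x)[M^T \eta^*] - [A^*(\eta^*) - b^T \eta^*]\} \]
a subfamily of $\mathcal{P}$ corresponding to
\[ \Theta^* = \{\eta^* : \exists\, \eta \in \mathcal{E} : \eta = M^T \eta^*\} \]
The density is constant on level sets of $M(\mathbf{T}(x))$.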
Models from Affine Transformations: Case II

Consider $\mathcal{P}$, the class of distributions for a r.v. $X$ which is a canonical family generated by the natural sufficient statistic $\mathbf{T}(X)$, a $(k \times 1)$ vector-statistic, and $h(\cdot) : \mathcal{X} \to \mathbb{R}$.
A distribution in $\mathcal{P}$ has density/pmf function:
\[ p(x \mid \eta) = h(x) \exp\{\mathbf{T}^T(x)\eta - A(\eta)\} \]
For $\Theta \subset \mathbb{R}^{k^*}$, define
\[ \eta(\theta) = B\theta \in \mathcal{E} \subset \mathbb{R}^k, \]
where $B$ is a constant $k \times k^*$ matrix.
The submodel of $\mathcal{P}$ is a submodel of the exponential family generated by $B^T \mathbf{T}(X)$ and $h(\cdot)$.
Models from Affine Transformations

Logistic Regression. $Y_1, \ldots, Y_n$ are independent Binomial$(n_i, \lambda_i)$, $i = 1, 2, \ldots, n$.

Case 1: Unrestricted $\lambda_i$: $0 < \lambda_i < 1$, $i = 1, \ldots, n$
n-parameter canonical exponential family
$\mathcal{Y}_i = \{0, 1, \ldots, n_i\}$
Natural sufficient statistic: $\mathbf{T}(Y_1, \ldots, Y_n) = \mathbf{Y}$.
$h(\mathbf{y}) = \prod_{i=1}^n \binom{n_i}{y_i} 1(\{0 \le y_i \le n_i\})$
$\eta_i = \log\left(\frac{\lambda_i}{1 - \lambda_i}\right)$
$A(\eta) = \sum_{i=1}^n n_i \log(1 + e^{\eta_i})$
$p(\mathbf{y} \mid \eta) = h(\mathbf{y}) \exp\{\mathbf{Y}^T \eta - A(\eta)\}$
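Logistic Regression (continued)

Case 2: For specified levels $x_1 < x_2 < \cdots < x_n$ assume
\[ \eta_i(\theta) = \theta_1 + \theta_2 x_i, \quad i = 1, \ldots, n, \quad \text{and } \theta = (\theta_1, \theta_2)^T \in \mathbb{R}^2. \]
$\eta(\theta) = B\theta$, where $B$ is the $n \times 2$ matrix
\[ B = [\mathbf{1}, \mathbf{x}] = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix} \]
Set $M = B^T$. This is the 2-parameter canonical exponential family generated by
\[ M\mathbf{Y} = \left(\sum_{i=1}^n Y_i, \sum_{i=1}^n x_i Y_i\right)^T \]
and $h(\mathbf{y})$, with
\[ A(\theta_1, \theta_2) = \sum_{i=1}^n n_i \log(1 + \exp(\theta_1 + \theta_2 x_i)). \]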
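The two-parameter logistic regression family can be checked against the product of Binomial pmfs. In this sketch the dose levels, group sizes, counts, and function names are all hypothetical; it computes $M\mathbf{Y}$, $A(\theta_1, \theta_2)$, and the log-likelihood in both forms.

```python
import math

def sufficient_stat(x, y):
    """M Y = B^T Y = (sum y_i, sum x_i y_i) for B = [1, x]."""
    return sum(y), sum(xi * yi for xi, yi in zip(x, y))

def A(theta1, theta2, x, n_trials):
    """A(theta) = sum_i n_i log(1 + exp(theta1 + theta2 x_i))."""
    return sum(ni * math.log1p(math.exp(theta1 + theta2 * xi))
               for xi, ni in zip(x, n_trials))

def loglik_expfam(theta1, theta2, x, n_trials, y):
    """log h(y) + (M y)^T theta - A(theta)."""
    log_h = sum(math.log(math.comb(ni, yi)) for ni, yi in zip(n_trials, y))
    t1, t2 = sufficient_stat(x, y)
    return log_h + theta1 * t1 + theta2 * t2 - A(theta1, theta2, x, n_trials)

def loglik_direct(theta1, theta2, x, n_trials, y):
    """Sum of Binomial(n_i, lambda_i) log-pmfs with logit(lambda_i) = theta1 + theta2 x_i."""
    total = 0.0
    for xi, ni, yi in zip(x, n_trials, y):
        lam = 1.0 / (1.0 + math.exp(-(theta1 + theta2 * xi)))
        total += (math.log(math.comb(ni, yi)) + yi * math.log(lam)
                  + (ni - yi) * math.log(1.0 - lam))
    return total

x = [0.5, 1.0, 1.5, 2.0]       # hypothetical dose levels
n_trials = [10, 10, 12, 8]     # hypothetical group sizes
y = [1, 3, 7, 6]               # hypothetical death counts
```

The two log-likelihoods should agree at any $(\theta_1, \theta_2)$, which reflects that the data enter only through the two-dimensional statistic returned by `sufficient_stat`.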
Logistic Regression (continued)

Medical Experiment
$x_i$ measures toxicity of drug.
$n_i$ is the number of animals subjected to toxicity level $x_i$.
$Y_i$ = number of animals dying out of the $n_i$ when exposed to drug at level $x_i$.

Assumptions:
Each animal has a random toxicity threshold $X$, and death results iff a drug level at or above $X$ is applied.
Independence of animals' responses to drug effects.
Distribution of $X$ is logistic:
\[ P(X \le x) = [1 + \exp(-(\theta_1 + \theta_2 x))]^{-1} \]
\[ \log\left(\frac{P[X \le x]}{1 - P(X \le x)}\right) = \theta_1 + \theta_2 x \]
Building Exponential Models: Additional Topics

Curved exponential families, e.g., Gaussian with Fixed Signal-to-Noise Ratio
Location-Scale Regression
Super models
Exponential structure preserved under random (iid) sampling
MIT OpenCourseWare http://ocw.mit.edu 18.655 Mathematical Statistics Spring 2016 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.