MIT Spring 2016


Dr. Kempthorne, Spring 2016

Outline

One-parameter exponential families
Multiparameter (k-parameter) exponential families
Building exponential families

Definition

Let $X$ be a random variable/vector with sample space $\mathcal{X} \subset \mathbb{R}^q$ and probability model $P_\theta$. The class of probability models $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$ is a one-parameter exponential family if the density/pmf function $p(x \mid \theta)$ can be written:

$$p(x \mid \theta) = h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$$

where $h : \mathcal{X} \to \mathbb{R}$, $\eta : \Theta \to \mathbb{R}$, and $B : \Theta \to \mathbb{R}$.

Note: By the Factorization Theorem, $T(X)$ is sufficient for $\theta$, since $p(x \mid \theta) = h(x)\,g(T(x), \theta)$ with $g(T(x), \theta) = \exp\{\eta(\theta)T(x) - B(\theta)\}$. $T(X)$ is the Natural Sufficient Statistic.

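Examples

Poisson Distribution (1.6.1): $X \sim \text{Poisson}(\theta)$, where $E[X] = \theta$.

$$p(x \mid \theta) = \frac{\theta^x e^{-\theta}}{x!}, \quad x = 0, 1, \ldots$$
$$= \frac{1}{x!}\exp\{\log(\theta)\,x - \theta\} = h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$$

where:
$h(x) = 1/x!$
$\eta(\theta) = \log(\theta)$
$T(x) = x$
$B(\theta) = \theta$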
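As a quick numerical check of the Poisson factorization above, the following sketch evaluates the pmf both in its textbook form and in the form $h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$. The function names and the value $\theta = 2.5$ are illustrative assumptions, not part of the lecture.

```python
import math

def poisson_pmf(x, theta):
    """Textbook Poisson pmf: theta^x e^{-theta} / x!."""
    return theta ** x * math.exp(-theta) / math.factorial(x)

def poisson_expfam(x, theta):
    """Same pmf written as h(x) exp{eta(theta) T(x) - B(theta)}."""
    h = 1.0 / math.factorial(x)   # h(x) = 1/x!
    eta = math.log(theta)         # eta(theta) = log(theta)
    T = x                         # T(x) = x
    B = theta                     # B(theta) = theta
    return h * math.exp(eta * T - B)

theta = 2.5  # illustrative parameter value
# Largest disagreement between the two forms over x = 0, ..., 19.
max_gap = max(abs(poisson_pmf(x, theta) - poisson_expfam(x, theta))
              for x in range(20))
print(max_gap)
```

The two forms should agree up to floating-point rounding.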

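Examples

Binomial Distribution (1.6.2): $X \sim \text{Binomial}(\theta, n)$.

$$p(x \mid \theta) = \binom{n}{x}\theta^x(1-\theta)^{n-x}, \quad x = 0, 1, \ldots, n$$
$$= \binom{n}{x}\exp\left\{\log\left(\frac{\theta}{1-\theta}\right)x + n\log(1-\theta)\right\} = h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$$

where:
$h(x) = \binom{n}{x}$
$\eta(\theta) = \log\left(\frac{\theta}{1-\theta}\right)$
$T(x) = x$
$B(\theta) = -n\log(1-\theta)$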

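Examples

Normal Distribution: $X \sim N(\mu, \sigma_0^2)$ (known variance), so $\theta = \mu$.

$$p(x \mid \theta) = \frac{1}{\sqrt{2\pi\sigma_0^2}}\, e^{-\frac{1}{2\sigma_0^2}(x-\mu)^2}$$
$$= \left[\frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\left\{-\frac{x^2}{2\sigma_0^2}\right\}\right]\exp\left\{\frac{\mu}{\sigma_0^2}x - \frac{\mu^2}{2\sigma_0^2}\right\} = h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$$

where:
$h(x) = \left[\frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\left\{-\frac{x^2}{2\sigma_0^2}\right\}\right]$
$\eta(\theta) = \mu/\sigma_0^2$
$T(x) = x$
$B(\theta) = \mu^2/(2\sigma_0^2)$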

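Examples

Normal Distribution: $X \sim N(\mu_0, \sigma^2)$ (known mean), so $\theta = \sigma^2$.

$$p(x \mid \theta) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x-\mu_0)^2}$$
$$= \left[\frac{1}{\sqrt{2\pi}}\right]\exp\left\{-\frac{1}{2\sigma^2}(x-\mu_0)^2 - \frac{1}{2}\log(\sigma^2)\right\} = h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$$

where:
$h(x) = \frac{1}{\sqrt{2\pi}}$
$\eta(\theta) = -\frac{1}{2\sigma^2}$
$T(x) = (x-\mu_0)^2$
$B(\theta) = \frac{1}{2}\log(\sigma^2)$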

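Samples from One-Parameter Exponential Family Distribution

Consider a sample $X_1, \ldots, X_n$, where the $X_i$ are iid $P$ with $P \in \mathcal{P} = \{P_\theta, \theta \in \Theta\}$, a one-parameter exponential family distribution with density function

$$p(x \mid \theta) = h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$$

The sample $\mathbf{X} = (X_1, \ldots, X_n)$ is a random vector with density/pmf:

$$p(\mathbf{x} \mid \theta) = \prod_{i=1}^n \left(h(x_i)\exp[\eta(\theta)T(x_i) - B(\theta)]\right)$$
$$= \left[\prod_{i=1}^n h(x_i)\right]\exp\left[\eta(\theta)\sum_{i=1}^n T(x_i) - nB(\theta)\right]$$
$$= h^*(\mathbf{x})\exp\{\eta^*(\theta)T^*(\mathbf{x}) - B^*(\theta)\}$$

where:
$h^*(\mathbf{x}) = \prod_{i=1}^n h(x_i)$
$\eta^*(\theta) = \eta(\theta)$
$T^*(\mathbf{x}) = \sum_{i=1}^n T(x_i)$
$B^*(\theta) = nB(\theta)$

Note: The Sufficient Statistic $T^*$ is one-dimensional for all $n$.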
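The one-dimensional sufficient statistic can be illustrated with a small sketch. It assumes an iid Poisson sample (so $h(x) = 1/x!$, $\eta(\theta) = \log\theta$, $B(\theta) = \theta$) and compares two hypothetical data sets that share the same value of $T^*(\mathbf{x}) = \sum_i x_i$. Their likelihood ratios between any two parameter values should coincide, because $h^*(\mathbf{x})$ cancels.

```python
import math

def joint_pmf(xs, theta):
    """Joint pmf of an iid Poisson(theta) sample, written as
    h*(x) exp{eta(theta) T*(x) - n B(theta)}."""
    h_star = 1.0
    for x in xs:
        h_star /= math.factorial(x)      # h*(x) = prod_i 1/x_i!
    t_star = sum(xs)                     # T*(x) = sum_i x_i
    n = len(xs)
    return h_star * math.exp(math.log(theta) * t_star - n * theta)

def likelihood_ratio(xs, theta1, theta0):
    """p(x | theta1) / p(x | theta0); h*(x) cancels in the ratio."""
    return joint_pmf(xs, theta1) / joint_pmf(xs, theta0)

# Two hypothetical samples with the same sufficient statistic (sum = 10).
a = [0, 3, 2, 5]
b = [4, 4, 1, 1]
ratio_a = likelihood_ratio(a, 3.0, 1.5)
ratio_b = likelihood_ratio(b, 3.0, 1.5)
print(ratio_a, ratio_b)
```

Both ratios reduce to $\exp\{(\log 3 - \log 1.5)\cdot 10 - 4(3 - 1.5)\}$, which depends on the data only through the sum.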

Samples from One-Parameter Exponential Family Distribution

Theorem 1.6.1 Let $\{P_\theta\}$ be a one-parameter exponential family of discrete distributions with pmf function:

$$p(x \mid \theta) = h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$$

Then the family of distributions of the statistic $T(X)$ is a one-parameter exponential family of discrete distributions whose frequency functions are

$$P_\theta(T(X) = t) = p(t \mid \theta) = h^*(t)\exp\{\eta(\theta)t - B(\theta)\}$$

where

$$h^*(t) = \sum_{\{x : T(x) = t\}} h(x)$$

Proof: Immediate.
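A brute-force sketch of Theorem 1.6.1, under the assumption that $X = (X_1, \ldots, X_n)$ is a vector of iid Bernoulli($\theta$) trials. Then $h(x) = 1$, $T(x) = \sum_i x_i$, $\eta(\theta) = \log(\theta/(1-\theta))$ and $B(\theta) = -n\log(1-\theta)$. Enumerating the level sets of $T$ should give $h^*(t) = \binom{n}{t}$, recovering the Binomial pmf. The helper names are hypothetical.

```python
import itertools
import math

def h_star(n, t):
    """h*(t) = sum of h(x) over {x : T(x) = t}, with h(x) = 1, T(x) = sum(x)."""
    return sum(1 for x in itertools.product((0, 1), repeat=n) if sum(x) == t)

def pmf_of_T(n, t, theta):
    """P(T = t) = h*(t) exp{eta(theta) t - B(theta)}."""
    eta = math.log(theta / (1 - theta))
    B = -n * math.log(1 - theta)
    return h_star(n, t) * math.exp(eta * t - B)

n, theta = 5, 0.3  # illustrative values
probs = [pmf_of_T(n, t, theta) for t in range(n + 1)]
print(probs)
```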

Canonical Exponential Family

Re-parametrize by setting $\eta = \eta(\theta)$, the Natural Parameter. The density has the form

$$p(x \mid \eta) = h(x)\exp\{\eta T(x) - A(\eta)\}$$

The function $A(\eta)$ replaces $B(\theta)$ and is defined as the log of the normalization constant:

$$A(\eta) = \log \int \cdots \int h(x)\exp\{\eta T(x)\}\,dx \quad \text{if } X \text{ is continuous}$$

or

$$A(\eta) = \log \sum_{x \in \mathcal{X}} h(x)\exp\{\eta T(x)\} \quad \text{if } X \text{ is discrete}$$

The Natural Parameter Space: $\{\eta : \eta = \eta(\theta), \theta \in \Theta\} = \mathcal{E}$ (later, Theorem 1.6.3 gives properties of $\mathcal{E}$).

$T(x)$ is the Natural Sufficient Statistic.
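To make the role of $A(\eta)$ concrete, the sketch below computes the log-normalizer numerically for the Poisson case ($h(x) = 1/x!$, $T(x) = x$) by truncating the infinite sum. The truncation point `terms` is an assumption chosen for illustration. The result should match the closed form $A(\eta) = e^\eta$.

```python
import math

def log_normalizer(eta, terms=100):
    """A(eta) = log sum_x h(x) exp{eta x} with h(x) = 1/x!,
    truncated after `terms` terms (an illustrative choice)."""
    total = 0.0
    term = 1.0                            # x = 0 term: exp(0) / 0! = 1
    for x in range(terms):
        total += term
        term *= math.exp(eta) / (x + 1)   # next term: e^{eta (x+1)} / (x+1)!
    return math.log(total)

eta = math.log(2.5)                       # natural parameter for theta = 2.5
A_numeric = log_normalizer(eta)
print(A_numeric, math.exp(eta))
```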

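Canonical Representation of Poisson Family

Poisson Distribution (1.6.1): $X \sim \text{Poisson}(\theta)$, where $E[X] = \theta$.

$$p(x \mid \theta) = \frac{\theta^x e^{-\theta}}{x!}, \quad x = 0, 1, \ldots$$
$$= \frac{1}{x!}\exp\{\log(\theta)\,x - \theta\} = h(x)\exp\{\eta(\theta)T(x) - B(\theta)\}$$

where:
$h(x) = 1/x!$
$\eta(\theta) = \log(\theta)$
$T(x) = x$

Canonical Representation: $\eta = \log(\theta)$ and $A(\eta) = B(\theta) = \theta = e^\eta$.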

MGFs of Canonical Exponential Family Models

Theorem 1.6.2 Suppose $X$ is distributed according to a canonical exponential family, i.e., the density/pmf function is given by

$$p(x \mid \eta) = h(x)\exp[\eta T(x) - A(\eta)], \quad x \in \mathcal{X} \subset \mathbb{R}^q.$$

If $\eta$ is an interior point of $\mathcal{E}$, the natural parameter space, then:

The moment generating function of $T(X)$ exists and is given by
$$M_T(s) = E[e^{sT(X)} \mid \eta] = \exp\{A(s+\eta) - A(\eta)\}$$
for $s$ in some neighborhood of 0.
$E[T(X) \mid \eta] = A'(\eta)$.
$\text{Var}[T(X) \mid \eta] = A''(\eta)$.

Proof:
$$M_T(s) = E[e^{sT(X)} \mid \eta] = \int \cdots \int h(x)\,e^{(s+\eta)T(x) - A(\eta)}\,dx$$
$$= \left[e^{A(s+\eta) - A(\eta)}\right] \int \cdots \int h(x)\,e^{(s+\eta)T(x) - A(s+\eta)}\,dx$$
$$= \left[e^{A(s+\eta) - A(\eta)}\right] \times 1$$

The remainder follows from properties of MGFs.
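The MGF identity in Theorem 1.6.2 can be checked numerically. This sketch assumes the Poisson family, where $A(\eta) = e^\eta$, and compares a direct series evaluation of $E[e^{sX}]$ with $\exp\{A(s+\eta) - A(\eta)\}$. The values of `theta`, `s` and the truncation `terms` are illustrative assumptions.

```python
import math

def mgf_direct(s, theta, terms=200):
    """E[exp(sX)] for X ~ Poisson(theta), summing the series term by term."""
    total = 0.0
    term = math.exp(-theta)                     # x = 0 term: P(X = 0) * exp(0)
    for x in range(terms):
        total += term
        term *= theta * math.exp(s) / (x + 1)   # ratio of consecutive terms
    return total

def mgf_from_A(s, theta):
    """exp{A(s + eta) - A(eta)} with A(eta) = exp(eta), eta = log(theta)."""
    eta = math.log(theta)
    return math.exp(math.exp(s + eta) - math.exp(eta))

theta, s = 2.5, 0.4
direct = mgf_direct(s, theta)
closed = mgf_from_A(s, theta)
print(direct, closed)
```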

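Moments of Canonical Exponential Family Distributions

Poisson Distribution: $A(\eta) = B(\theta) = \theta = e^\eta$.
$E(X \mid \theta) = A'(\eta) = e^\eta = \theta$.
$\text{Var}(X \mid \theta) = A''(\eta) = e^\eta = \theta$.

Binomial Distribution:
$$p(x \mid \theta) = \binom{n}{x}\theta^x(1-\theta)^{n-x}$$
$$= h(x)\exp\left\{\log\left(\frac{\theta}{1-\theta}\right)x + n\log(1-\theta)\right\}$$
$$= h(x)\exp\{\eta x - n\log(e^\eta + 1)\}$$

So $A(\eta) = n\log(e^\eta + 1)$, with $\eta = \log\left(\frac{\theta}{1-\theta}\right)$, i.e. $\theta = \frac{e^\eta}{e^\eta + 1}$.

$$A'(\eta) = n\frac{e^\eta}{e^\eta + 1} = n\theta$$
$$A''(\eta) = n\frac{e^\eta}{e^\eta + 1} - n\frac{e^\eta \cdot e^\eta}{(e^\eta + 1)^2} = n\left[\frac{e^\eta}{e^\eta + 1}\right]\left(1 - \frac{e^\eta}{e^\eta + 1}\right) = n\theta(1-\theta)$$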
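The Binomial moment formulas can be confirmed by differentiating $A(\eta) = n\log(e^\eta + 1)$ numerically. The sketch below uses central finite differences; the step size `eps` and the values of `n` and `theta` are illustrative assumptions.

```python
import math

def A(eta, n):
    """Binomial log-normalizer A(eta) = n log(1 + e^eta)."""
    return n * math.log(1.0 + math.exp(eta))

def d1(f, x, eps=1e-4):
    """Central-difference approximation to f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def d2(f, x, eps=1e-4):
    """Central-difference approximation to f''(x)."""
    return (f(x + eps) - 2 * f(x) + f(x - eps)) / eps ** 2

n, theta = 12, 0.3
eta = math.log(theta / (1 - theta))
mean = d1(lambda e: A(e, n), eta)   # should be close to n * theta
var = d2(lambda e: A(e, n), eta)    # should be close to n * theta * (1 - theta)
print(mean, var)
```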

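Moments of the Gamma Distribution

$X_1, \ldots, X_n$ i.i.d. Gamma$(p, \lambda)$ distribution with density

$$p(x \mid \lambda, p) = \frac{\lambda^p x^{p-1} e^{-\lambda x}}{\Gamma(p)}, \quad 0 < x < \infty$$

where $\Gamma(p) = \int_0^\infty \lambda^p x^{p-1} e^{-\lambda x}\,dx$.

$$p(x \mid \lambda, p) = \left[\frac{x^{p-1}}{\Gamma(p)}\right]\exp\{-\lambda x + p\log(\lambda)\} = h(x)\exp\{\eta T(x) - A(\eta)\}$$

where:
$\eta = -\lambda$
$T(x) = x$
$A(\eta) = -p\log(\lambda) = -p\log(-\eta)$

Thus
$E(X) = A'(\eta) = -p/\eta = p/\lambda$
$\text{Var}(X) = A''(\eta) = p/\eta^2 = p/\lambda^2$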

Notes on Gamma Distribution

Gamma$(p = n/2, \lambda = 1/2)$ corresponds to the Chi-Squared distribution with $n$ degrees of freedom.
$p = 1$ (equivalently $n = 2$ in the Chi-Squared case) corresponds to the Exponential Distribution.
For $p = 1/2$, $\Gamma(1/2) = \sqrt{\pi}$.
$\Gamma(p + 1) = p\,\Gamma(p)$ for $p > 0$, so $\Gamma(p + 1) = p!$ for positive integer $p$.
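These Gamma facts are easy to spot-check with the standard library. The sketch below is illustrative only: the evaluation points and parameter values are arbitrary assumptions, and `chi2_density` is a hypothetical helper that encodes the usual Chi-Squared density formula.

```python
import math

def gamma_density(x, p, lam):
    """Gamma(p, lam) density: lam^p x^(p-1) e^(-lam x) / Gamma(p)."""
    return lam ** p * x ** (p - 1) * math.exp(-lam * x) / math.gamma(p)

def chi2_density(x, n):
    """Chi-squared density with n degrees of freedom."""
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))

points = (0.1, 0.5, 1.0, 3.0)

# Gamma(1/2)^2 should equal pi.
gamma_half_sq = math.gamma(0.5) ** 2
# Recursion Gamma(p + 1) = p Gamma(p), here at the non-integer p = 3.7.
recursion_gap = abs(math.gamma(4.7) - 3.7 * math.gamma(3.7))
# p = 1 gives the Exponential(lam) density lam * exp(-lam x).
expo_gap = max(abs(gamma_density(x, 1.0, 2.0) - 2.0 * math.exp(-2.0 * x))
               for x in points)
# Gamma(n/2, 1/2) matches the chi-squared density with n = 5 degrees of freedom.
chi2_gap = max(abs(gamma_density(x, 2.5, 0.5) - chi2_density(x, 5))
               for x in points)
print(gamma_half_sq, recursion_gap, expo_gap, chi2_gap)
```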

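Rayleigh Distribution

Sample $X_1, \ldots, X_n$ iid with density function

$$p(x \mid \theta) = \frac{x}{\theta^2}\exp(-x^2/2\theta^2)$$
$$= [x]\exp\left\{-\frac{1}{2\theta^2}x^2 - \log(\theta^2)\right\} = h(x)\exp\{\eta T(x) - A(\eta)\}$$

where:
$\eta = -\frac{1}{2\theta^2}$
$T(X) = X^2$
$A(\eta) = \log(\theta^2) = \log\left(-\frac{1}{2\eta}\right) = -\log(-2\eta)$

By the mgf:
$E(X^2) = A'(\eta) = -\frac{2}{2\eta} = -\frac{1}{\eta} = 2\theta^2$
$\text{Var}(X^2) = A''(\eta) = +\frac{1}{\eta^2} = 4\theta^4$

For the $n$-sample $\mathbf{X} = (X_1, \ldots, X_n)$:
$T(\mathbf{X}) = \sum_{i=1}^n X_i^2$
$E[T(\mathbf{X})] = -n/\eta = 2n\theta^2$
$\text{Var}[T(\mathbf{X})] = n\frac{1}{\eta^2} = 4n\theta^4$

Note: $P(X \le x) = 1 - \exp\left\{-\frac{x^2}{2\theta^2}\right\}$ (Failure time model).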
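A Monte Carlo sketch of the Rayleigh moment formulas, assuming inverse-CDF sampling from the failure-time distribution function given in the note above. The seed, sample size and $\theta = 1.5$ are arbitrary illustrative choices, so the estimates agree with $2\theta^2$ and $4\theta^4$ only up to simulation error.

```python
import math
import random

def sample_rayleigh(theta, rng):
    """Inverse-CDF draw: solve 1 - exp(-x^2 / (2 theta^2)) = u for x."""
    u = rng.random()                       # u in [0, 1), so 1 - u is in (0, 1]
    return theta * math.sqrt(-2.0 * math.log(1.0 - u))

rng = random.Random(0)                     # fixed seed for reproducibility
theta = 1.5
# Draws of the natural sufficient statistic T(X) = X^2.
t = [sample_rayleigh(theta, rng) ** 2 for _ in range(200_000)]
mean_t = sum(t) / len(t)
var_t = sum((v - mean_t) ** 2 for v in t) / (len(t) - 1)
print(mean_t, 2 * theta ** 2)              # sample mean vs. 2 theta^2
print(var_t, 4 * theta ** 4)               # sample variance vs. 4 theta^4
```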


Definition

$\{P_\theta, \theta \in \Theta\}$, $\Theta \subset \mathbb{R}^k$, is a k-parameter exponential family if the density/pmf function of $X \sim P_\theta$ is

$$p(x \mid \theta) = h(x)\exp\left[\sum_{j=1}^k \eta_j(\theta)T_j(x) - B(\theta)\right],$$

where $x \in \mathcal{X} \subset \mathbb{R}^q$, and
$\eta_1, \ldots, \eta_k$ and $B$ are real-valued functions mapping $\Theta \to \mathbb{R}$.
$T_1, \ldots, T_k$ and $h$ are real-valued functions mapping $\mathbb{R}^q \to \mathbb{R}$.

Note: By the Factorization Theorem (Theorem 1.5.1), $\mathbf{T}(X) = (T_1(X), \ldots, T_k(X))^T$ is sufficient.

For a sample $X_1, \ldots, X_n$ iid $P_\theta$, the sample $\mathbf{X} = (X_1, \ldots, X_n)$ has a distribution in the k-parameter exponential family with natural sufficient statistic

$$\mathbf{T}^{(n)} = \left(\sum_{i=1}^n T_1(X_i), \ldots, \sum_{i=1}^n T_k(X_i)\right)$$

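Examples

Example 1.6.5. Normal Family. $P_\theta = N(\mu, \sigma^2)$, with $\Theta = \mathbb{R} \times \mathbb{R}_+ = \{(\mu, \sigma^2)\}$ and density

$$p(x \mid \theta) = \exp\left\{\frac{\mu}{\sigma^2}x - \frac{1}{2\sigma^2}x^2 - \frac{1}{2}\left(\frac{\mu^2}{\sigma^2} + \log(2\pi\sigma^2)\right)\right\}$$

a $k = 2$ multiparameter exponential family ($\mathcal{X} = \mathbb{R}^1$, $q = 1$) and
$\eta_1(\theta) = \frac{\mu}{\sigma^2}$ and $T_1(X) = X$
$\eta_2(\theta) = -\frac{1}{2\sigma^2}$ and $T_2(X) = X^2$
$B(\theta) = \frac{1}{2}\left(\frac{\mu^2}{\sigma^2} + \log(2\pi\sigma^2)\right)$
$h(x) = 1$

Note: For an $n$-sample $\mathbf{X} = (X_1, \ldots, X_n)$ the natural sufficient statistic is $\mathbf{T}(\mathbf{X}) = \left(\sum_{i=1}^n X_i, \sum_{i=1}^n X_i^2\right)$.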

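Canonical k-parameter Exponential Family

Corresponding to

$$p(x \mid \theta) = h(x)\exp\left[\sum_{j=1}^k \eta_j(\theta)T_j(x) - B(\theta)\right],$$

consider:
Natural Parameter: $\eta = (\eta_1, \ldots, \eta_k)^T$
Natural Sufficient Statistic: $\mathbf{T}(X) = (T_1(X), \ldots, T_k(X))^T$
Density function:
$$q(x \mid \eta) = h(x)\exp\{\mathbf{T}^T(x)\eta - A(\eta)\}$$
where
$$A(\eta) = \log \int \cdots \int h(x)\exp\{\mathbf{T}^T(x)\eta\}\,dx \quad \text{or} \quad A(\eta) = \log\left[\sum_{x \in \mathcal{X}} h(x)\exp\{\mathbf{T}^T(x)\eta\}\right]$$
Natural Parameter space: $\mathcal{E} = \{\eta \in \mathbb{R}^k : -\infty < A(\eta) < \infty\}$.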

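Canonical Exponential Family Examples (k > 1)

Example 1.6.5. Normal Family (continued). $P_\theta = N(\mu, \sigma^2)$, with $\Theta = \mathbb{R} \times \mathbb{R}_+ = \{(\mu, \sigma^2)\}$ and density

$$p(x \mid \theta) = \exp\left\{\frac{\mu}{\sigma^2}x - \frac{1}{2\sigma^2}x^2 - \frac{1}{2}\left(\frac{\mu^2}{\sigma^2} + \log(2\pi\sigma^2)\right)\right\}$$

$\eta_1(\theta) = \frac{\mu}{\sigma^2}$ and $T_1(X) = X$
$\eta_2(\theta) = -\frac{1}{2\sigma^2}$ and $T_2(X) = X^2$
$B(\theta) = \frac{1}{2}\left(\frac{\mu^2}{\sigma^2} + \log(2\pi\sigma^2)\right)$ and $h(x) = 1$

Canonical Exponential Density:
$$q(x \mid \eta) = h(x)\exp\{\mathbf{T}^T(x)\eta - A(\eta)\}$$
$\mathbf{T}^T(x) = (x, x^2) = (T_1(x), T_2(x))$
$\eta = (\eta_1, \eta_2)^T = \left(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}\right)^T$
$A(\eta) = \frac{1}{2}\left[-\frac{\eta_1^2}{2\eta_2} + \log\left(\pi\frac{1}{|\eta_2|}\right)\right]$
$\mathcal{E} = \mathbb{R} \times \mathbb{R}^- = \{(\eta_1, \eta_2) : A(\eta) \text{ exists}\}$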
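The canonical log-normalizer for the Normal family can be checked in two ways: it should reproduce $B(\theta)$ after substituting $\eta(\theta)$, and its partial derivatives should give the means of $T_1 = X$ and $T_2 = X^2$. The second check relies on the multiparameter analogue of Theorem 1.6.2, $E[T_j(X)] = \partial A/\partial\eta_j$, which is assumed here rather than stated on the slides. Parameter values and the step size are illustrative.

```python
import math

def A(eta1, eta2):
    """Log-normalizer of the N(mu, sigma^2) family; requires eta2 < 0."""
    return 0.5 * (-eta1 ** 2 / (2.0 * eta2) + math.log(math.pi / abs(eta2)))

def B(mu, sigma2):
    """B(theta) from the (mu, sigma^2) parametrization."""
    return 0.5 * (mu ** 2 / sigma2 + math.log(2.0 * math.pi * sigma2))

mu, sigma2 = 1.2, 0.8
eta1, eta2 = mu / sigma2, -1.0 / (2.0 * sigma2)

eps = 1e-6  # finite-difference step (illustrative choice)
dA_d1 = (A(eta1 + eps, eta2) - A(eta1 - eps, eta2)) / (2 * eps)  # approx E[X]
dA_d2 = (A(eta1, eta2 + eps) - A(eta1, eta2 - eps)) / (2 * eps)  # approx E[X^2]
print(A(eta1, eta2), B(mu, sigma2))
print(dA_d1, mu)
print(dA_d2, mu ** 2 + sigma2)
```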

Canonical Exponential Family Examples (k > 1)

Multinomial Distribution: $X = (X_1, X_2, \ldots, X_q) \sim \text{Multinomial}(n, \theta = (\theta_1, \theta_2, \ldots, \theta_q))$

$$p(x \mid \theta) = \frac{n!}{x_1! \cdots x_q!}\,\theta_1^{x_1}\theta_2^{x_2}\cdots\theta_q^{x_q}$$

where
$q$ is a given positive integer,
$\theta = (\theta_1, \ldots, \theta_q)$ with $\sum_{j=1}^q \theta_j = 1$,
$n$ is a given positive integer,
$\sum_{i=1}^q X_i = n$.

Notes:
What is $\Theta$?
What is the dimensionality of $\Theta$?
What is the Multinomial distribution when $q = 2$?

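Example: Multinomial Distribution

$$p(x \mid \theta) = \frac{n!}{x_1! \cdots x_q!}\,\theta_1^{x_1}\theta_2^{x_2}\cdots\theta_q^{x_q}$$
$$= \frac{n!}{x_1! \cdots x_q!}\exp\left\{\log(\theta_1)x_1 + \cdots + \log(\theta_{q-1})x_{q-1} + \log\left(1 - \sum_{j=1}^{q-1}\theta_j\right)\left[n - \sum_{j=1}^{q-1}x_j\right]\right\}$$
$$= h(x)\exp\left\{\sum_{j=1}^{q-1}\eta_j(\theta)T_j(x) - B(\theta)\right\}$$

where:
$h(x) = \frac{n!}{x_1! \cdots x_q!}$
$\eta(\theta) = (\eta_1(\theta), \eta_2(\theta), \ldots, \eta_{q-1}(\theta))$
$\eta_j(\theta) = \log\left(\theta_j \Big/ \left(1 - \sum_{j=1}^{q-1}\theta_j\right)\right)$, $j = 1, \ldots, q-1$
$\mathbf{T}(x) = (X_1, X_2, \ldots, X_{q-1}) = (T_1(x), T_2(x), \ldots, T_{q-1}(x))$
$B(\theta) = -n\log\left(1 - \sum_{j=1}^{q-1}\theta_j\right)$

For the canonical exponential density:
$$A(\eta) = +n\log\left(1 + \sum_{j=1}^{q-1}e^{\eta_j}\right)$$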
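The Multinomial factorization can be verified numerically with a small sketch. It evaluates the pmf directly and through the canonical form with $q - 1$ natural parameters. The counts `x` and probabilities `theta` below are hypothetical example values.

```python
import math

def multinomial_coef(x):
    """n! / (x_1! ... x_q!), computed with exact integer arithmetic."""
    coef = math.factorial(sum(x))
    for xi in x:
        coef //= math.factorial(xi)
    return coef

def multinomial_pmf(x, theta):
    """Direct form: coefficient times prod_j theta_j^{x_j}."""
    return multinomial_coef(x) * math.prod(t ** xi for t, xi in zip(theta, x))

def multinomial_expfam(x, theta):
    """Canonical form h(x) exp{sum_{j<q} eta_j x_j - A(eta)}."""
    n, q = sum(x), len(x)
    last = 1.0 - sum(theta[:q - 1])                          # theta_q
    eta = [math.log(theta[j] / last) for j in range(q - 1)]  # eta_j(theta)
    A = n * math.log(1.0 + sum(math.exp(e) for e in eta))    # A(eta)
    inner = sum(e * xj for e, xj in zip(eta, x[:q - 1]))
    return multinomial_coef(x) * math.exp(inner - A)

x = (2, 1, 3)              # hypothetical counts, n = 6, q = 3
theta = (0.2, 0.3, 0.5)    # hypothetical cell probabilities
direct = multinomial_pmf(x, theta)
canonical = multinomial_expfam(x, theta)
print(direct, canonical)
```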


Definition: Submodels

Consider a k-parameter exponential family $\{q(x \mid \eta); \eta \in \mathcal{E} \subset \mathbb{R}^k\}$. A Submodel is an exponential family defined by

$$p(x \mid \theta) = q(x \mid \eta(\theta))$$

where $\theta \in \Theta \subset \mathbb{R}^{k^*}$, $k^* \le k$, and $\eta : \Theta \to \mathbb{R}^k$.

Note: The submodel is specified by $\Theta$. The natural parameters corresponding to $\Theta$ are a subset of the natural parameter space: $\mathcal{E}^* = \{\eta \in \mathcal{E} : \eta = \eta(\theta), \theta \in \Theta\}$.

Example: $X$ is a discrete r.v. with sample space $\mathcal{X} = \{1, 2, \ldots, k\}$, and $X_1, X_2, \ldots, X_n$ are iid as $X$. Let $\mathcal{P}$ = the set of distributions for $\mathbf{X} = (X_1, \ldots, X_n)$, where the distribution of the $X_i$ is a member of any fixed collection of discrete distributions on $\mathcal{X}$. Then $\mathcal{P}$ is an exponential family (a subset of the Multinomial Distributions).

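Models from Affine Transformations: Case I

Consider $\mathcal{P}$, the class of distributions for a r.v. $X$ which is a canonical family generated by the natural sufficient statistic $\mathbf{T}(X)$, a $(k \times 1)$ vector-statistic, and $h(\cdot) : \mathcal{X} \to \mathbb{R}$. A distribution in $\mathcal{P}$ has density/pmf function:

$$p(x \mid \eta) = h(x)\exp\{\mathbf{T}^T(x)\eta - A(\eta)\}$$

where

$$A(\eta) = \log\left[\sum_{x \in \mathcal{X}} h(x)\exp\{\mathbf{T}^T(x)\eta\}\right] \quad \text{or} \quad A(\eta) = \log\left[\int_{\mathcal{X}} h(x)\exp\{\mathbf{T}^T(x)\eta\}\,dx\right]$$

$M$: an affine transformation from $\mathbb{R}^k$ to $\mathbb{R}^{k^*}$ defined by $M(\mathbf{T}) = M\mathbf{T} + b$, where $M$ ($k^* \times k$) and $b$ ($k^* \times 1$) are known constants.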

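Models from Affine Transformations (continued)

Consider $\mathcal{P}^*$, the class of distributions for a r.v. $X$ generated by the natural sufficient statistic $M(\mathbf{T}(X)) = M\mathbf{T}(X) + b$.

Since the distribution of $X$ has density/pmf function

$$p(x \mid \eta) = h(x)\exp\{\mathbf{T}^T(x)\eta - A(\eta)\}$$

we can write

$$p(x \mid \eta^*) = h(x)\exp\{[M(\mathbf{T}(x))]^T\eta^* - A^*(\eta^*)\}$$
$$= h(x)\exp\{[M\mathbf{T}(x) + b]^T\eta^* - A^*(\eta^*)\}$$
$$= h(x)\exp\{\mathbf{T}^T(x)[M^T\eta^*] + b^T\eta^* - A^*(\eta^*)\}$$
$$= h(x)\exp\{\mathbf{T}^T(x)[M^T\eta^*] - [A^*(\eta^*) - b^T\eta^*]\}$$

a subfamily of $\mathcal{P}$ corresponding to

$$\Theta = \{\eta^* : \exists\, \eta \in \mathcal{E} : \eta = M^T\eta^*\}$$

The constant term $b^T\eta^*$ is absorbed into the normalizer, so $A^*(\eta^*) - b^T\eta^* = A(M^T\eta^*)$.

The density is constant on level sets of $M(\mathbf{T}(x))$.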

Models from Affine Transformations: Case II

Consider $\mathcal{P}$, the class of distributions for a r.v. $X$ which is a canonical family generated by the natural sufficient statistic $\mathbf{T}(X)$, a $(k \times 1)$ vector-statistic, and $h(\cdot) : \mathcal{X} \to \mathbb{R}$. A distribution in $\mathcal{P}$ has density/pmf function:

$$p(x \mid \eta) = h(x)\exp\{\mathbf{T}^T(x)\eta - A(\eta)\}$$

For $\Theta \subset \mathbb{R}^{k^*}$, define $\eta(\theta) = B\theta \in \mathcal{E} \subset \mathbb{R}^k$, where $B$ is a constant $k \times k^*$ matrix.

The submodel of $\mathcal{P}$ is a submodel of the exponential family generated by $B^T\mathbf{T}(X)$ and $h(\cdot)$.

Models from Affine Transformations

Logistic Regression. $Y_1, \ldots, Y_n$ are independent Binomial$(n_i, \lambda_i)$, $i = 1, 2, \ldots, n$.

Case 1: Unrestricted $\lambda_i$: $0 < \lambda_i < 1$, $i = 1, \ldots, n$
n-parameter canonical exponential family
$\mathcal{Y}_i = \{0, 1, \ldots, n_i\}$
Natural sufficient statistic: $\mathbf{T}(Y_1, \ldots, Y_n) = \mathbf{Y}$
$h(\mathbf{y}) = \prod_{i=1}^n \binom{n_i}{y_i}\mathbf{1}(\{0 \le y_i \le n_i\})$
$\eta_i = \log\left(\frac{\lambda_i}{1 - \lambda_i}\right)$
$A(\eta) = \sum_{i=1}^n n_i\log(1 + e^{\eta_i})$
$p(\mathbf{y} \mid \eta) = h(\mathbf{y})\exp\{\mathbf{Y}^T\eta - A(\eta)\}$

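Logistic Regression (continued)

Case 2: For specified levels $x_1 < x_2 < \cdots < x_n$ assume

$$\eta_i(\theta) = \theta_1 + \theta_2 x_i, \quad i = 1, \ldots, n, \quad \text{and } \theta = (\theta_1, \theta_2)^T \in \mathbb{R}^2.$$

$\eta(\theta) = B\theta$, where $B$ is the $n \times 2$ matrix

$$B = [\mathbf{1}, \mathbf{x}] = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}$$

Set $M = B^T$. This is the 2-parameter canonical exponential family generated by

$$M\mathbf{Y} = \left(\sum_{i=1}^n Y_i, \sum_{i=1}^n x_i Y_i\right)^T$$

and $h(\mathbf{y})$, with

$$A(\theta_1, \theta_2) = \sum_{i=1}^n n_i\log(1 + \exp(\theta_1 + \theta_2 x_i)).$$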
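The reduction to a 2-parameter family can be sketched numerically: the log-likelihood computed as a sum of Binomial log-pmfs should equal $\log h(\mathbf{y}) + \theta^T(M\mathbf{y}) - A(\theta_1, \theta_2)$. The dose levels, group sizes, counts and parameter values below are made-up illustrative data, not from the lecture.

```python
import math

def loglik_direct(y, n, x, theta1, theta2):
    """Sum of Binomial(n_i, lambda_i) log-pmfs, logit(lambda_i) = theta1 + theta2 x_i."""
    total = 0.0
    for yi, ni, xi in zip(y, n, x):
        lam = 1.0 / (1.0 + math.exp(-(theta1 + theta2 * xi)))
        total += (math.log(math.comb(ni, yi))
                  + yi * math.log(lam) + (ni - yi) * math.log(1.0 - lam))
    return total

def loglik_expfam(y, n, x, theta1, theta2):
    """log h(y) + theta^T (M y) - A(theta), with M y = (sum y_i, sum x_i y_i)."""
    log_h = sum(math.log(math.comb(ni, yi)) for yi, ni in zip(y, n))
    t1 = sum(y)                                   # first component of M y
    t2 = sum(xi * yi for xi, yi in zip(x, y))     # second component of M y
    A = sum(ni * math.log(1.0 + math.exp(theta1 + theta2 * xi))
            for ni, xi in zip(n, x))
    return log_h + theta1 * t1 + theta2 * t2 - A

# Hypothetical experiment: levels x_i, group sizes n_i, observed counts y_i.
x = [0.5, 1.0, 1.5, 2.0]
n = [10, 10, 12, 8]
y = [1, 3, 7, 7]
ll_direct = loglik_direct(y, n, x, -2.0, 1.8)
ll_expfam = loglik_expfam(y, n, x, -2.0, 1.8)
print(ll_direct, ll_expfam)
```

The data enter the exponential-family form only through the two sums in $M\mathbf{Y}$, apart from the $\theta$-free factor $h(\mathbf{y})$.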

Logistic Regression (continued)

Medical Experiment
$x_i$ measures the toxicity of the drug.
$n_i$ is the number of animals subjected to toxicity level $x_i$.
$Y_i$ = number of animals dying out of the $n_i$ when exposed to the drug at level $x_i$.

Assumptions:
Each animal has a random toxicity threshold $X$, and death results iff a drug level at or above $X$ is applied.
Independence of the animals' responses to drug effects.
The distribution of $X$ is logistic:

$$P(X \le x) = [1 + \exp(-(\theta_1 + \theta_2 x))]^{-1}$$
$$\log\left(\frac{P[X \le x]}{1 - P(X \le x)}\right) = \theta_1 + \theta_2 x$$

Building Exponential Models

Additional Topics
Curved exponential families, e.g., Gaussian with Fixed Signal-to-Noise Ratio
Location-Scale Regression
Super models
Exponential structure preserved under random (iid) sampling

MIT OpenCourseWare http://ocw.mit.edu 18.655 Mathematical Statistics Spring 2016 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.