Lecture 23: Minimal sufficiency

UW-Madison (Statistics) Stat 609 Lecture 23 2015

Maximal reduction without loss of information

There are many sufficient statistics for a given problem. In fact, $X$ (the whole data set) is sufficient. If $T$ is a sufficient statistic and $T = \psi(S)$, where $\psi$ is a function and $S$ is another statistic, then $S$ is sufficient. For instance, if $X_1,\dots,X_n$ are iid with $P(X_i = 1) = \theta$ and $P(X_i = 0) = 1 - \theta$, then $\left(\sum_{i=1}^{m} X_i, \sum_{i=m+1}^{n} X_i\right)$ is sufficient for $\theta$, where $m$ is any fixed integer between 1 and $n$.

If $T$ is sufficient and $T = \psi(S)$ with a measurable function $\psi$ that is not one-to-one, then $T$ is more useful than $S$, since $T$ provides a further reduction of the data without loss of information. Is there a sufficient statistic that provides "maximal" reduction of the data?

Definition 6.2.11 (minimal sufficiency). A sufficient statistic $T(X)$ is a minimal sufficient statistic if, for any other sufficient statistic $U(X)$, $T(x)$ is a function of $U(x)$.
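The phrase "is a function of" can be tested mechanically on a finite sample space, using the criterion that $T$ is a function of $S$ exactly when $S(x) = S(y)$ forces $T(x) = T(y)$. The sketch below does this for the Bernoulli example; the helper name `is_function_of` and the choices $n = 5$, $m = 2$ are illustrative assumptions, not part of the lecture.

```python
from itertools import product


def is_function_of(t, u, sample_space):
    """Return True if t(x) is determined by u(x) on the given sample space.

    Criterion: t is a function of u iff u(x) == u(y) implies t(x) == t(y).
    """
    seen = {}  # maps each value of u to the value of t first observed with it
    for x in sample_space:
        key, val = u(x), t(x)
        if seen.setdefault(key, val) != val:
            return False
    return True


n, m = 5, 2  # illustrative sample size and split point (assumptions)
space = list(product([0, 1], repeat=n))  # all Bernoulli samples of length n


def total(x):
    """T(x): the sum of all observations."""
    return sum(x)


def split(x):
    """S(x): (sum of the first m observations, sum of the remaining ones)."""
    return (sum(x[:m]), sum(x[m:]))


print(is_function_of(total, split, space))  # T = psi(S) with psi(a, b) = a + b
print(is_function_of(split, total, space))  # S cannot be recovered from T
```

Here the total is a function of the pair of partial sums but not the other way round, so the total reduces the data further.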

$T(x)$ is a function of $U(x)$ iff $U(x) = U(y)$ implies that $T(x) = T(y)$ for any pair $(x,y)$. In terms of the partitions of the range of the data set, if $T(x) = \psi(U(x))$, then for every value $u$
$$\{x : U(x) = u\} \subset \{x : T(x) = \psi(u)\},$$
so each set in the partition induced by $U$ lies inside a set of the partition induced by $T$. Thus, a minimal sufficient statistic achieves the greatest possible data reduction as a sufficient statistic.

If both $T$ and $S$ are minimal sufficient statistics, then by definition there is a one-to-one function $\psi$ such that $T = \psi(S)$; hence, the minimal sufficient statistic is unique in the sense that two statistics that are functions of each other can be treated as one statistic. For example, if $T$ is minimal sufficient, then so is $(T, e^{T})$, but no one is going to use $(T, e^{T})$.

If the range of $X$ is $\mathbb{R}^k$, then there exists a minimal sufficient statistic.

Example 6.2.15. Let $X_1,\dots,X_n$ be iid from the uniform$(\theta,\theta+1)$ distribution, where $\theta \in \mathbb{R}$ is an unknown parameter.

The joint pdf of $(X_1,\dots,X_n)$ is
$$\prod_{i=1}^{n} I(\{\theta < x_i < \theta + 1\}) = I(\{x_{(n)} - 1 < \theta < x_{(1)}\}), \qquad (x_1,\dots,x_n) \in \mathbb{R}^n,$$
where $x_{(i)}$ denotes the $i$th ordered value of $x_1,\dots,x_n$. By the factorization theorem, $T = (X_{(1)}, X_{(n)})$ is sufficient for $\theta$.

Note that
$$x_{(1)} = \sup\{\theta : f_\theta(x) > 0\} \quad\text{and}\quad x_{(n)} = 1 + \inf\{\theta : f_\theta(x) > 0\}.$$
If $S(X)$ is a statistic sufficient for $\theta$, then by the factorization theorem, there are functions $h$ and $g_\theta$ such that $f_\theta(x) = g_\theta(S(x))h(x)$. For $x$ with $h(x) > 0$,
$$x_{(1)} = \sup\{\theta : g_\theta(S(x)) > 0\} \quad\text{and}\quad x_{(n)} = 1 + \inf\{\theta : g_\theta(S(x)) > 0\}.$$
Hence, there is a function $\psi$ such that $T(x) = \psi(S(x))$ when $h(x) > 0$. Since $P(h(X) > 0) = 1$, we conclude that $T$ is minimal sufficient.

In this example, the dimension of a minimal sufficient statistic is larger than the dimension of $\theta$.
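The identity between the product of indicators and the single indicator in the order statistics can be checked numerically. The following is a minimal sketch: the sample values and the grid of $\theta$ values are hypothetical, and they are chosen as dyadic fractions so that every floating-point comparison is exact.

```python
def joint_pdf_product(x, theta):
    """Joint pdf of iid uniform(theta, theta + 1) as a product of indicators."""
    value = 1
    for xi in x:
        value *= 1 if theta < xi < theta + 1 else 0
    return value


def joint_pdf_order_stats(x, theta):
    """The same pdf written through the smallest and largest observations only."""
    return 1 if max(x) - 1 < theta < min(x) else 0


x = [2.25, 2.875, 2.5, 2.75]               # hypothetical sample
grid = [k / 64 for k in range(64, 256)]    # theta from 1.0 up to just under 4.0

# The two expressions should agree at every theta on the grid.
agree = all(joint_pdf_product(x, t) == joint_pdf_order_stats(x, t) for t in grid)

# Values of theta with positive likelihood lie strictly between x_(n) - 1 and x_(1).
support = [t for t in grid if joint_pdf_order_stats(x, t) == 1]

print(agree, min(support), max(support))
```

The set of $\theta$ values with positive likelihood is an interval whose endpoints are $x_{(n)} - 1$ and $x_{(1)}$, which is why these two order statistics can be read off from the likelihood.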

Finding a minimal sufficient statistic by definition is not convenient. The next theorem is a useful tool.

Theorem 6.2.13. Let $f_\theta(x)$ be the pmf or pdf of $X$. Suppose that $T(x)$ is sufficient for $\theta$ and that, for every pair $x$ and $y$ with at least one of $f_\theta(x)$ and $f_\theta(y)$ not 0, the ratio $f_\theta(x)/f_\theta(y)$ not depending on $\theta$ implies $T(x) = T(y)$. (Here, we define $a/0 = \infty$ for any $a > 0$.) Then $T(X)$ is minimal sufficient for $\theta$.

Proof. Let $U(X)$ be another sufficient statistic. By the factorization theorem, there are functions $h$ and $g_\theta$ such that $f_\theta(x) = g_\theta(U(x))h(x)$ for all $x$ and $\theta$. For $x$ and $y$ such that at least one of $f_\theta(x)$ and $f_\theta(y)$ is not 0, i.e., at least one of $h(x)$ and $h(y)$ is not 0, if $U(x) = U(y)$, then
$$\frac{f_\theta(x)}{f_\theta(y)} = \frac{g_\theta(U(x))h(x)}{g_\theta(U(y))h(y)} = \frac{h(x)}{h(y)},$$
which does not depend on $\theta$.

By the assumption of the theorem, $T(x) = T(y)$. This shows that there is a function $\psi$ such that $T(x) = \psi(U(x))$. Hence, $T$ is minimal sufficient for $\theta$.

Example 6.2.15. We revisit this example and show how to apply Theorem 6.2.13. If $X_1,\dots,X_n$ are iid from the uniform$(\theta,\theta+1)$ distribution, then the joint pdf is
$$f_\theta(x) = I(\{x_{(n)} - 1 < \theta < x_{(1)}\}), \qquad x = (x_1,\dots,x_n) \in \mathbb{R}^n.$$
Let $x$ and $y$ be two data points. Each of $f_\theta(x)$ and $f_\theta(y)$ only takes two possible values, 0 and 1. Let $A_x = (x_{(n)} - 1, x_{(1)})$ and $A_y = (y_{(n)} - 1, y_{(1)})$. Then
$$\frac{f_\theta(x)}{f_\theta(y)} = \begin{cases} 0 & \theta \notin A_x,\ \theta \in A_y \\ 1 & \theta \in A_x,\ \theta \in A_y \\ \infty & \theta \in A_x,\ \theta \notin A_y \end{cases}$$
This depends on $\theta$ unless $A_x = A_y$.

Thus, if the ratio does not depend on $\theta$, we must have $x_{(1)} = y_{(1)}$ and $x_{(n)} = y_{(n)}$. Therefore, by Theorem 6.2.13, $(X_{(1)}, X_{(n)})$ is minimal sufficient for $\theta$.

Example 6.2.14. Let $X_1,\dots,X_n$ be iid from $N(\mu,\sigma^2)$ with unknown $\mu \in \mathbb{R}$ and $\sigma > 0$. Earlier, we showed that $T = (\bar{X}, S^2)$ is sufficient for $\theta = (\mu,\sigma^2)$. Is $T$ minimal sufficient?

Let $f_\theta(x)$ be the joint pdf of the sample, and let $x$ and $y$ be two sample points. Then
$$\frac{f_\theta(x)}{f_\theta(y)} = \frac{(2\pi\sigma^2)^{-n/2}\exp\left(-[n(\bar{x}-\mu)^2 + (n-1)s_x^2]/(2\sigma^2)\right)}{(2\pi\sigma^2)^{-n/2}\exp\left(-[n(\bar{y}-\mu)^2 + (n-1)s_y^2]/(2\sigma^2)\right)} = \exp\left([-n(\bar{x}^2 - \bar{y}^2) + 2n\mu(\bar{x} - \bar{y}) - (n-1)(s_x^2 - s_y^2)]/(2\sigma^2)\right),$$
where $\bar{x}$ and $s_x^2$ are the sample mean and variance based on the sample point $x$, and $\bar{y}$ and $s_y^2$ are the sample mean and variance based on the sample point $y$.
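The behaviour of this likelihood ratio can be illustrated numerically. The sketch below uses three hypothetical samples of size 3 (not from the lecture): `x` and `y` share the same sample mean and variance but have different order statistics, while `z` has the same mean and a different variance. The parameter pairs are arbitrary choices.

```python
import math


def log_joint_normal(x, mu, sigma2):
    """Log of the joint N(mu, sigma2) pdf of an iid sample x."""
    n = len(x)
    ss = sum((xi - mu) ** 2 for xi in x)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ss / (2 * sigma2)


def log_ratio(x, y, mu, sigma2):
    """log f_theta(x) - log f_theta(y) for theta = (mu, sigma2)."""
    return log_joint_normal(x, mu, sigma2) - log_joint_normal(y, mu, sigma2)


x = [0.0, 0.0, 3.0]    # mean 1, sample variance 3
y = [2.0, 2.0, -1.0]   # mean 1, sample variance 3, different order statistics
z = [0.0, 1.0, 2.0]    # mean 1, sample variance 1

params = [(-1.0, 0.5), (0.0, 1.0), (2.0, 4.0)]  # arbitrary (mu, sigma2) pairs

same_T = [log_ratio(x, y, mu, s2) for mu, s2 in params]
diff_T = [log_ratio(x, z, mu, s2) for mu, s2 in params]

print(same_T)  # numerically zero for every (mu, sigma2)
print(diff_T)  # changes with sigma2
```

When the two samples agree in $(\bar{x}, s^2)$ the log ratio stays at zero up to rounding for every parameter value, whereas a difference in $s^2$ alone makes it vary with $\sigma^2$.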

The ratio $f_\theta(x)/f_\theta(y)$ in Example 6.2.14 does not depend on $\theta = (\mu,\sigma^2)$ iff $\bar{x} = \bar{y}$ and $s_x^2 = s_y^2$. By Theorem 6.2.13, $T = (\bar{X}, S^2)$ is minimal sufficient for $\theta = (\mu,\sigma^2)$.

Like the concept of sufficiency, minimal sufficiency also depends on the parameters we are interested in. In Example 6.2.14, let us consider two sub-families.

Suppose that it is known that $\sigma = 1$. Then $T = (\bar{X}, S^2)$ is still sufficient but not minimal sufficient for $\mu$. In fact, using Theorem 6.2.13 we can show that $\bar{X}$ is minimal sufficient for $\mu$ when $\sigma = 1$, and $T = (\bar{X}, S^2)$ is not a function of $\bar{X}$ (why?).

Suppose that it is known that $\mu^2 = \sigma^2$, so that we only have one unknown parameter $\mu \in \mathbb{R}$, $\mu \neq 0$. (This is called a curved exponential family.) Is $T = (\bar{X}, S^2)$ minimal sufficient for $\theta = \mu$? Note that $T$ is certainly sufficient for $\theta$. The answer can be found as a special case of the following result.

Minimal sufficient statistics in exponential families

Let $X_1,\dots,X_n$ be iid from an exponential family with pdf
$$f_\theta(x) = h(x)c(\theta)\exp\left(\eta(\theta)^\top T(x)\right), \qquad \theta \in \Theta,$$
where $\eta(\theta) = (w_1(\theta),\dots,w_k(\theta))$ and $T(x) = (t_1(x),\dots,t_k(x))$. Suppose that there exists $\{\theta_0,\theta_1,\dots,\theta_k\} \subset \Theta$ such that the vectors
$$\eta_i = \eta(\theta_i) - \eta(\theta_0), \qquad i = 1,\dots,k,$$
are linearly independent in $\mathbb{R}^k$. (This is true if the family is of full rank.)

We have shown that $T(X)$ is sufficient for $\theta$. We now show that $T$ is in fact minimal sufficient for $\theta$. Note that we can focus on the points $x$ at which $h(x) > 0$. For any $\theta$,
$$\frac{f_\theta(x)}{f_\theta(y)} = \exp\left(\eta(\theta)^\top [T(x) - T(y)]\right)\frac{h(x)}{h(y)}.$$
If this ratio equals $\varphi(x,y)$ not depending on $\theta$, then
$$\eta(\theta)^\top [T(x) - T(y)] = \log(\varphi(x,y)) + \log(h(y)/h(x)),$$
which implies
$$[\eta(\theta_i) - \eta(\theta_0)]^\top [T(x) - T(y)] = 0, \qquad i = 1,\dots,k.$$
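The last implication is a subtraction of the same identity evaluated at two parameter values. Written out, with the shorthand $\Delta = T(x) - T(y)$ (a symbol introduced here for brevity, not used in the lecture), the right-hand side is the same for every $\theta$ and therefore cancels:

```latex
\begin{align*}
\eta(\theta_i)^\top \Delta &= \log\varphi(x,y) + \log\frac{h(y)}{h(x)}, \\
\eta(\theta_0)^\top \Delta &= \log\varphi(x,y) + \log\frac{h(y)}{h(x)}, \\
\Longrightarrow\quad [\eta(\theta_i) - \eta(\theta_0)]^\top \Delta &= 0,
\qquad i = 1,\dots,k.
\end{align*}
```

These are $k$ homogeneous linear equations in the $k$ components of $\Delta$.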

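Since $\eta(\theta_i) - \eta(\theta_0)$, $i = 1,\dots,k$, are linearly independent, we must have $T(x) = T(y)$. By Theorem 6.2.13, $T$ is minimal sufficient for $\theta$.

Application to curved normal family

We can now show that $T = (\bar{X}, S^2)$ is minimal sufficient for $\theta = \mu$ when the sample is a random sample from $N(\mu, \mu^2)$, $\mu \in \mathbb{R}$, $\mu \neq 0$. It can be shown that
$$\eta(\theta) = \left(\frac{n\mu}{\sigma^2}, -\frac{n-1}{2\sigma^2}\right) = \left(\frac{n}{\mu}, -\frac{n-1}{2\mu^2}\right).$$
Points $\theta_0 = (1,1)$, $\theta_1 = (-1,1)$, and $\theta_2 = (1/2,1/2)$, written as pairs $(\mu,\sigma)$ with $\sigma = |\mu|$, are in the parameter space, and
$$\eta(\theta_1) - \eta(\theta_0) = \begin{pmatrix} -n \\ -(n-1)/2 \end{pmatrix} - \begin{pmatrix} n \\ -(n-1)/2 \end{pmatrix} = \begin{pmatrix} -2n \\ 0 \end{pmatrix},$$
$$\eta(\theta_2) - \eta(\theta_0) = \begin{pmatrix} 2n \\ -2(n-1) \end{pmatrix} - \begin{pmatrix} n \\ -(n-1)/2 \end{pmatrix} = \begin{pmatrix} n \\ -3(n-1)/2 \end{pmatrix}.$$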

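Are these two vectors linearly independent? If there are $c$ and $d$ such that
$$c[\eta(\theta_1) - \eta(\theta_0)] + d[\eta(\theta_2) - \eta(\theta_0)] = 0,$$
i.e.,
$$c\begin{pmatrix} -2n \\ 0 \end{pmatrix} + d\begin{pmatrix} n \\ -3(n-1)/2 \end{pmatrix} = 0,$$
then the second component gives $d = 0$ (for $n \geq 2$), and the first component then gives $c = 0$. That means the vectors $\eta(\theta_1) - \eta(\theta_0)$ and $\eta(\theta_2) - \eta(\theta_0)$ are linearly independent, and the condition for the exponential family result is satisfied. Therefore, $T = (\bar{X}, S^2)$ is minimal sufficient for $\theta = \mu$.

Example. Let $(X_1,\dots,X_n)$ be a random sample from Cauchy$(\mu,\sigma)$, where $\mu \in \mathbb{R}$ and $\sigma > 0$ are unknown parameters. The joint pdf of $(X_1,\dots,X_n)$ is
$$f_{\mu,\sigma}(x) = \frac{\sigma^n}{\pi^n}\prod_{i=1}^{n}\frac{1}{\sigma^2 + (x_i - \mu)^2}, \qquad x = (x_1,\dots,x_n) \in \mathbb{R}^n.$$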

For any $x = (x_1,\dots,x_n)$ and $y = (y_1,\dots,y_n)$, if $f_{\mu,\sigma}(x) = \psi(x,y)f_{\mu,\sigma}(y)$ holds for any $\mu$ and $\sigma$, where $\psi$ does not depend on $(\mu,\sigma)$, then, taking $\sigma = 1$,
$$\prod_{i=1}^{n}\left[1 + (y_i - \mu)^2\right] = \psi(x,y)\prod_{i=1}^{n}\left[1 + (x_i - \mu)^2\right], \qquad \mu \in \mathbb{R}.$$
Both sides of the above identity are polynomials of degree $2n$ in $\mu$. Comparison of the coefficients of the highest terms gives $\psi(x,y) = 1$ and hence
$$\prod_{i=1}^{n}\left[1 + (y_i - \mu)^2\right] = \prod_{i=1}^{n}\left[1 + (x_i - \mu)^2\right], \qquad \mu \in \mathbb{R}.$$
As a polynomial in $\mu$, the left-hand side of the above identity has $2n$ complex roots $y_i \pm \imath$, $i = 1,\dots,n$, while the right-hand side has $2n$ complex roots $x_i \pm \imath$, $i = 1,\dots,n$. Since the two sets of roots must agree, the ordered values of the $x_i$'s are the same as the ordered values of the $y_i$'s. The order statistics are sufficient because the sample is iid, so by Theorem 6.2.13, the vector of order statistics of $X_1,\dots,X_n$ is minimal sufficient for $(\mu,\sigma)$.
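The polynomial identity at the heart of the Cauchy example can be probed numerically. The sketch below relies on the fact that two polynomials of degree at most $2n$ that agree at $2n + 1$ distinct points are identical. The sample values, the evaluation points and the tolerance are hypothetical choices, and `same_polynomial` is a helper name introduced only for this illustration.

```python
def cauchy_poly(x, mu):
    """Product of 1 + (x_i - mu)^2 over the sample: a degree-2n polynomial in mu."""
    value = 1.0
    for xi in x:
        value *= 1.0 + (xi - mu) ** 2
    return value


def same_polynomial(x, y, tol=1e-9):
    """Compare the polynomials for x and y at 2n + 1 distinct values of mu.

    Assumes len(x) == len(y); a relative tolerance absorbs rounding error.
    """
    n = len(x)
    points = [float(k) for k in range(2 * n + 1)]
    return all(
        abs(cauchy_poly(x, m) - cauchy_poly(y, m)) <= tol * max(1.0, cauchy_poly(x, m))
        for m in points
    )


x = [0.5, -1.0, 2.0]
y_perm = [2.0, 0.5, -1.0]    # same order statistics as x
y_other = [0.5, -1.0, 2.5]   # different order statistics

print(same_polynomial(x, y_perm), same_polynomial(x, y_other))
```

A permutation of the sample leaves the polynomial unchanged, while changing a single observation changes it, in line with the root argument.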

The following result may be useful in establishing minimal sufficiency outside of the exponential family.

Theorem 6.6.5. Let $f_0, f_1, f_2, \dots$ be a sequence of pdf's or pmf's on the range of $X$, all having the same support.

a. The statistic
$$T(X) = \left(\frac{f_1(X)}{f_0(X)}, \frac{f_2(X)}{f_0(X)}, \dots\right)$$
is minimal sufficient for the family $\{f_0, f_1, f_2, \dots\}$ of populations for $X$.

b. If $\mathcal{P}$ is a family of pdf's or pmf's for $X$ with common support, and
(i) $f_i \in \mathcal{P}$, $i = 0,1,2,\dots$,
(ii) $T(X)$ is sufficient for $\mathcal{P}$,
then $T$ is minimal sufficient for $\mathcal{P}$.

Proof of part a. Let $g_i(T) = T_i$, the $i$th component of $T$, $i = 1,2,\dots$, and let $g_0(T) = 1$. Then
$$f_i(x) = g_i(T(x))f_0(x), \qquad i = 0,1,2,\dots$$
By the factorization theorem, $T$ is sufficient for $\mathcal{P}_0 = \{f_0, f_1, f_2, \dots\}$.

Suppose that $S(X)$ is another sufficient statistic. By the factorization theorem, there are Borel functions $h$ and $g_i$ such that
$$f_i(x) = g_i(S(x))h(x), \qquad i = 0,1,2,\dots$$
Then
$$T_i(x) = \frac{f_i(x)}{f_0(x)} = \frac{g_i(S(x))}{g_0(S(x))}, \qquad i = 1,2,\dots$$
Thus, $T$ is a function of $S$. By definition, $T$ is minimal sufficient for $\mathcal{P}_0$.

Proof of part b. The two additional conditions are $\mathcal{P}_0 \subset \mathcal{P}$ and $T$ is sufficient for $\mathcal{P}$. Let $S$ be any sufficient statistic for $\mathcal{P}$. Then $S$ is also sufficient for $\mathcal{P}_0$. From part a, $T$ is minimal sufficient for $\mathcal{P}_0$, so there is a function $\psi$ such that $T = \psi(S)$. By definition, $T$ is minimal sufficient for $\mathcal{P}$ since $T$ is sufficient for $\mathcal{P}$.

In fact, the result in Theorem 6.6.5 still holds if the common support condition is replaced by the condition that the support of $f_0$ contains the support of any pdf or pmf in $\mathcal{P}$. In most applications, we can choose $\mathcal{P}_0$ containing finitely many elements.

Example. Let $X_1,\dots,X_n$ be a random sample from double-exponential$(\mu,1)$, $\mu \in \mathbb{R}$, i.e., the joint pdf of $X$ is
$$f_\mu(x) = \frac{1}{2^n}\exp\left(-\sum_{i=1}^{n}|x_i - \mu|\right).$$
Consider $\mathcal{P}_0 = \{f_0, f_1, \dots, f_n\}$, where each $f_j$ is the pdf with $\mu = j$, $j = 0,1,\dots,n$. By Theorem 6.6.5a, a minimal sufficient statistic for $\mathcal{P}_0$ is
$$T = \left(\exp\left(\sum_{i=1}^{n}|X_i| - \sum_{i=1}^{n}|X_i - j|\right),\ j = 1,\dots,n\right),$$
which is equivalent to
$$U = \left(\sum_{i=1}^{n}|X_i| - \sum_{i=1}^{n}|X_i - j|,\ j = 1,\dots,n\right).$$
We can further show that $U$ is equivalent to $S$, the set of order statistics. Since $S$ is sufficient, by Theorem 6.6.5b, $S$, $U$, and $T$ are all minimal sufficient.

Definition 6.6.3 (Necessity). A statistic is said to be necessary if it can be written as a function of every sufficient statistic. That is, $T$ is necessary if, for every sufficient $S$, there is a function $\psi$ such that $T = \psi(S)$.

From the definitions of necessity and minimal sufficiency, we can reach the following conclusion.

Theorem 6.6.4. A statistic is minimal sufficient if and only if it is necessary and sufficient.