Foundations of Statistical Inference
Julien Berestycki
Lecture 2 - Sufficiency, Factorization, Minimal sufficiency
Department of Statistics, University of Oxford
MT 2016

Julien Berestycki (University of Oxford) BS2a MT 2016

Sufficient statistics

Let $X_1, \dots, X_n$ be a random sample from $f(x; \theta)$.

Definition (Sufficiency). A statistic $T(X_1, \dots, X_n)$ is a function of the data that does not depend on unknown parameters. A statistic $T(X_1, \dots, X_n)$ is said to be sufficient for $\theta$ if the conditional distribution of $X_1, \dots, X_n$, given $T$, does not depend on $\theta$. That is,
\[ f(x \mid t, \theta) = f(x \mid t). \]

Comment. The definition says that a sufficient statistic $T$ contains all the information there is in the sample about $\theta$.

What does this even mean? It means that for any function $g$ the map
\[ \theta \mapsto E_\theta[\, g(X) \mid T \,] \]
is constant.
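The statement that $\theta \mapsto E_\theta[g(X) \mid T]$ is constant can be checked by brute force on a small case. The sketch below assumes $n$ Bernoulli($p$) trials with $T = \sum_i X_i$ and a hypothetical test function `g` (the indicator that the first trial succeeds); neither choice comes from the definition itself. It uses exact rational arithmetic so that equality across values of $p$ is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product


def conditional_expectation(g, n, t, p):
    """E_p[g(X) | T = t] for n Bernoulli(p) trials with T = sum(X), by enumeration."""
    num = Fraction(0)
    den = Fraction(0)
    for x in product((0, 1), repeat=n):
        if sum(x) != t:
            continue
        # Joint probability of the sequence x under success probability p.
        prob = p ** sum(x) * (1 - p) ** (n - sum(x))
        num += g(x) * prob
        den += prob
    return num / den


def g(x):
    # Hypothetical test function: 1 if the first trial is a success.
    return x[0]


n, t = 4, 2
# Evaluate the conditional expectation on an assumed grid p = 0.1, ..., 0.9.
values = [conditional_expectation(g, n, t, Fraction(k, 10)) for k in range(1, 10)]
print(values)
```

Every entry of `values` is the same number, as sufficiency of the sum requires. Replacing the conditioning statistic by one that is not sufficient would make the entries vary with $p$.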
Example 7. $n$ independent trials where the probability of success is $p$. Let $X_1, \dots, X_n$ be indicator variables which are 1 or 0 depending on whether the trial is a success or a failure. Let $T = \sum_{i=1}^{n} X_i$. The conditional distribution of $X_1, \dots, X_n$ given $T = t$ is, for any $x$ with $\sum_i x_i = t$,
\[
g(x_1, \dots, x_n \mid t, p)
= \frac{f(x_1, \dots, x_n \mid p)}{h(t \mid p)}
= \frac{\prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i}}{\binom{n}{t} p^{t}(1-p)^{n-t}}
= \frac{p^{t}(1-p)^{n-t}}{\binom{n}{t} p^{t}(1-p)^{n-t}}
= \binom{n}{t}^{-1},
\]
not depending on $p$, so $T$ is sufficient for $p$.

Comment. Makes sense, since there is no information in the order.

Theorem 1: Factorization Criterion

$T(X_1, \dots, X_n)$ is a sufficient statistic for $\theta$ if and only if there exist two non-negative functions $K_1$, $K_2$ such that the likelihood function $L(\theta; x)$ can be written
\[
L(\theta; x) = K_1[t(x_1, \dots, x_n); \theta]\, K_2[x_1, \dots, x_n] = K_1[t; \theta]\, K_2[x],
\]
where $K_1$ depends on the sample only through $T$, and $K_2$ does not depend on $\theta$.

Proof - for discrete random variables

1. Assume that $T$ is sufficient. Then the distribution of the sample is
\[
L(\theta; x) = f(x \mid \theta) = f(x, t \mid \theta) = f(x \mid t, \theta)\, h(t \mid \theta).
\]
$T$ is sufficient, which implies $f(x \mid t, \theta) = f(x \mid t)$, and $h(t \mid \theta)$ depends on $x$ through $t(x)$ only, so
\[
L(\theta; x) = f(x \mid t)\, h(t \mid \theta).
\]
We set $L(\theta; x) = K_1 K_2$, where $K_1 \equiv h$, $K_2 \equiv f$.

2. Suppose $L(\theta; x) = f(x \mid \theta) = K_1[t; \theta]\, K_2[x]$. Then
\[
h(t \mid \theta) = \sum_{\{x : T(x) = t\}} f(x, t \mid \theta)
= \sum_{\{x : T(x) = t\}} L(\theta; x)
= K_1[t; \theta] \sum_{\{x : T(x) = t\}} K_2(x).
\]
Thus
\[
f(x \mid t, \theta) = \frac{f(x, t \mid \theta)}{h(t \mid \theta)}
= \frac{L(\theta; x)}{h(t \mid \theta)}
= \frac{K_2[x]}{\sum_{\{x : T(x) = t\}} K_2(x)},
\]
not depending on $\theta$. ($K_1$ cancels out in numerator and denominator.)
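As a worked instance of the criterion, the Bernoulli sample of Example 7 can be factorized explicitly. The choice of $K_1$ and $K_2$ below is one valid identification, not the only one (any constant can be moved between the two factors). Plugging $K_2 \equiv 1$ into the formula from part 2 of the proof recovers the conditional distribution found in Example 7.

```latex
\begin{align*}
L(p; x) &= \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i}
         = \underbrace{p^{t}(1-p)^{n-t}}_{K_1[t;\,p]} \cdot \underbrace{1}_{K_2[x]},
         \qquad t = \sum_{i=1}^{n} x_i, \\
f(x \mid t, p) &= \frac{K_2[x]}{\sum_{\{x' : T(x') = t\}} K_2[x']}
         = \binom{n}{t}^{-1}.
\end{align*}
```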
Minimal sufficiency

How much can we reduce the data without losing information? Is there a minimal sufficient statistic?

Example 7 (cont.). Consider $n = 3$ Bernoulli trials.
1. $T_1(X) = (X_1, X_2, X_3)$ (the individual sequences of trials).
2. $T_2(X) = (X_1, \sum_{i=1}^{3} X_i)$ (the 1st random variable and the total sum).
3. $T_3(X) = \sum_{i=1}^{3} X_i$ (the total sum).
4. $T_4(X) = I(T_3(X) = 0)$ ($I$ is the indicator function).

Exercise. Prove that $T_4$ is not sufficient.

Definition (Minimality). A statistic is minimal sufficient if it can be expressed as a function of every other sufficient statistic.

Example 7 (cont.): Minimal sufficiency

$n$ Bernoulli trials with $T = \sum_{i=1}^{n} X_i$. Suppose $T$ above is not minimal sufficient but another statistic $U$ is MS. Then $U$ can be given as a function of $T$ (and not vice versa, or else $T$ would be MS), and there exist values $t_1 \neq t_2$ of $T$ such that $U(t_1) = U(t_2)$ (i.e. $T \to U$ is many to one, so $U \to T$ is not a function). We assume for the moment that no other $t$ makes $U(t) = U(t_1)$. The event $U = u$ is the event $T \in \{t_1, t_2\}$. Let $x_1, \dots, x_n$ contain $t_1$ successes. Then
\[
\begin{aligned}
g(x_1, \dots, x_n \mid u, p)
&= g(x_1, \dots, x_n \mid t_1, p)\, P(t_1 \mid u, p) \\
&= g(x_1, \dots, x_n \mid t_1)\, P(T = t_1 \mid T \in \{t_1, t_2\}, p) \\
&= \binom{n}{t_1}^{-1}
   \frac{\binom{n}{t_1} p^{t_1}(1-p)^{n-t_1}}
        {\binom{n}{t_1} p^{t_1}(1-p)^{n-t_1} + \binom{n}{t_2} p^{t_2}(1-p)^{n-t_2}},
\end{aligned}
\]
which depends on $p$, so $U$ is not sufficient, a contradiction, and hence $T$ must be MS (similar reasoning applies when several $t_i$ share the same value of $U$).

Minimal sufficiency and partitions of the sample space

Intuitively, a minimal sufficient statistic most efficiently captures all possible information about the parameter $\theta$.
Any statistic $T(X)$ partitions the sample space into subsets, and in each subset $T(X)$ has constant value.
Minimal sufficient statistics correspond to the coarsest partition of the sample space that is still sufficient.
In the example of $n = 3$ Bernoulli trials consider the following 4 statistics and the partitions they induce:
\[
T_1(X) = (X_1, X_2, X_3), \qquad
T_2(X) = \Big(X_1, \sum_{i=1}^{3} X_i\Big), \qquad
T_3(X) = \sum_{i=1}^{3} X_i, \qquad
T_4(X) = I(T_3(X) = 0).
\]
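The partitions induced by the four statistics can be listed by enumeration. The sketch below is only a numerical illustration, assuming $n = 3$ and a hypothetical grid of $p$ values: it groups the 8 sample points by the value of each statistic and then checks whether the conditional distribution within each class is the same at every grid point. Agreement on a grid is evidence of sufficiency, not a proof, and a disagreement does not replace the argument asked for in the exercise.

```python
from fractions import Fraction
from itertools import product

SAMPLES = list(product((0, 1), repeat=3))

STATISTICS = {
    "T1": lambda x: x,
    "T2": lambda x: (x[0], sum(x)),
    "T3": lambda x: sum(x),
    "T4": lambda x: int(sum(x) == 0),
}


def partition(stat):
    """Group the 8 sample points by the value of the statistic."""
    classes = {}
    for x in SAMPLES:
        classes.setdefault(stat(x), []).append(x)
    return classes


def conditional_dist(stat, p):
    """Map each x to P(X = x | stat(X) = stat(x)) under success probability p."""
    prob = {x: p ** sum(x) * (1 - p) ** (3 - sum(x)) for x in SAMPLES}
    out = {}
    for members in partition(stat).values():
        total = sum(prob[x] for x in members)
        for x in members:
            out[x] = prob[x] / total
    return out


def looks_sufficient(stat, grid):
    """True if the conditional distribution is identical at every p in grid."""
    dists = [conditional_dist(stat, p) for p in grid]
    return all(d == dists[0] for d in dists)


# Assumed grid p = 0.1, ..., 0.9, kept exact with Fraction.
GRID = [Fraction(k, 10) for k in range(1, 10)]
for name, stat in STATISTICS.items():
    print(name, len(partition(stat)), "classes, sufficient on grid:",
          looks_sufficient(stat, GRID))
```

The class counts shrink from $T_1$ to $T_4$, which is the sense in which each partition is coarser than the previous one. The coarsest partition that still passes the check is the one induced by $T_3$.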
Lemma 1: Lehmann-Scheffé partitions

Consider the partition of the sample space defined by putting $x$ and $y$ into the same class of the partition if and only if
\[
\frac{L(\theta; y)}{L(\theta; x)} = \frac{f(y \mid \theta)}{f(x \mid \theta)} = m(x, y),
\]
where $m(x, y)$ does not depend on $\theta$. Then any statistic corresponding to this partition is minimal sufficient.

Comment. This Lemma tells us how to define partitions that correspond to minimal sufficient statistics. It says that ratios of likelihoods of two values $x$ and $y$ in the same class of the partition (and hence with the same statistic value) should not depend on $\theta$.

Proof (for discrete RVs)

1. Sufficiency. Suppose $T$ is such a statistic, and let $\tau = \{y : T(y) = t\}$ be the class containing $x$. Then
\[
g(x \mid t, \theta) = \frac{f(x \mid \theta)}{f(t \mid \theta)}
= \frac{f(x \mid \theta)}{\sum_{y \in \tau} f(y \mid \theta)}
= \frac{f(x \mid \theta)}{\sum_{y \in \tau} f(x \mid \theta)\, m(x, y)}
= \Big[ \sum_{y \in \tau} m(x, y) \Big]^{-1},
\]
which does not depend on $\theta$. Hence the partition $D$ induced by $T$ is sufficient.

2. Minimal sufficiency. Now suppose $U$ is any other sufficient statistic and that $U(x) = U(y)$ for some pair of values $(x, y)$. If we can show that $U(x) = U(y)$ implies $T(x) = T(y)$, then the Lehmann-Scheffé partition induced by $T$ includes the partition based on any other sufficient statistic. In other words, $T$ is a function of every other sufficient statistic, and so must be minimal sufficient. Since $U$ is sufficient, the factorization criterion gives
\[
\frac{L(\theta; y)}{L(\theta; x)}
= \frac{K_1[u(y); \theta]\, K_2[y]}{K_1[u(x); \theta]\, K_2[x]}
= \frac{K_2[y]}{K_2[x]},
\]
which does not depend on $\theta$, so $x$ and $y$ lie in the same Lehmann-Scheffé class, that is, $T(x) = T(y)$. So the statistic $U$ produces a partition at least as fine as that induced by $T$, and the result is proved.

Sufficiency in an exponential family

For a sample $X_1, \dots, X_n$ i.i.d. from a full-rank $k$-parameter exponential family it holds that:
The statistic $T(x) = \big( \sum_{i=1}^{n} B_1(x_i), \dots, \sum_{i=1}^{n} B_k(x_i) \big)$ is minimal sufficient.
The distribution of $T(x)$ belongs to a $k$-parameter exponential family.
\[
\begin{aligned}
L(\theta; x) = \prod_{i=1}^{n} f(x_i; \theta)
&= \exp\Big\{ \sum_{i=1}^{n} \Big[ \sum_{j=1}^{k} A_j(\theta) B_j(x_i) + C(x_i) + D(\theta) \Big] \Big\} \\
&= \exp\Big\{ \sum_{j=1}^{k} A_j(\theta) \sum_{i=1}^{n} B_j(x_i) + n D(\theta) + \sum_{i=1}^{n} C(x_i) \Big\}.
\end{aligned}
\]
Exponential family form again.
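To make the general form concrete, here is one standard instance with $k = 2$. It assumes a normal sample with both mean $\mu$ and variance $\sigma^2$ unknown, which is an illustrative choice and not an example taken from these slides. The functions $A_j$, $B_j$, $C$, $D$ are read off by expanding the square in the normal density.

```latex
\begin{align*}
f(x;\mu,\sigma^2)
  &= \exp\Big\{ \frac{\mu}{\sigma^2}\,x - \frac{1}{2\sigma^2}\,x^2
       - \frac{\mu^2}{2\sigma^2} - \tfrac12\log(2\pi\sigma^2) \Big\}, \\
A_1(\theta) &= \frac{\mu}{\sigma^2},\quad B_1(x) = x,\qquad
A_2(\theta) = -\frac{1}{2\sigma^2},\quad B_2(x) = x^2, \\
C(x) &= 0,\qquad D(\theta) = -\frac{\mu^2}{2\sigma^2} - \tfrac12\log(2\pi\sigma^2), \\
T(x) &= \Big(\sum_{i=1}^{n} x_i,\ \sum_{i=1}^{n} x_i^2\Big).
\end{align*}
```

Under this assumption the pair (sum, sum of squares) plays the role of $T(x)$ in the statement above.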
Sufficiency in an exponential family

Suppose the family is in canonical form, so $\phi_j = A_j(\theta)$, and let $t_j = \sum_{i=1}^{n} B_j(x_i)$, $C(x) = \sum_{i=1}^{n} C(x_i)$. Then
\[
L(\theta; x) = \exp\Big\{ \sum_{j=1}^{k} \phi_j t_j + n D(\theta) + C(x) \Big\}.
\]
By the factorization criterion, $t_1, \dots, t_k$ are sufficient statistics for $\phi_1, \dots, \phi_k$.

In fact, we do not need canonical form. If
\[
L(\theta; x) = \exp\Big\{ \sum_{j=1}^{k} A_j(\theta)\, t_j + n D(\theta) + C(x) \Big\}
\]
is a minimal $k$-dimensional linear exponential family then (by the regularity conditions above) $t_1, \dots, t_k$ are minimal sufficient for $\theta_1, \dots, \theta_k$.

Minimal sufficiency is verified using Lemma 1.
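The Lemma 1 check can be sketched numerically for the simplest one-parameter case. The code below assumes $n = 3$ Bernoulli($p$) trials and a hypothetical grid of $p$ values; it places two sample points in the same class when the likelihood ratio $L(p; y)/L(p; x)$ takes a single value across the grid. A finite grid can only suggest that the ratio is free of $p$, so this is an illustration of the construction rather than a verification.

```python
from fractions import Fraction
from itertools import product

N = 3
SAMPLES = list(product((0, 1), repeat=N))
GRID = [Fraction(k, 10) for k in range(1, 10)]  # hypothetical grid of p values


def likelihood(p, x):
    """L(p; x) for a Bernoulli sample x."""
    t = sum(x)
    return p ** t * (1 - p) ** (len(x) - t)


def ratio_is_constant(x, y):
    """True if L(p; y) / L(p; x) takes a single value over the whole grid."""
    ratios = {likelihood(p, y) / likelihood(p, x) for p in GRID}
    return len(ratios) == 1


def lehmann_scheffe_classes():
    """Put x and y in the same class iff their likelihood ratio is constant in p."""
    classes = []
    for x in SAMPLES:
        for cls in classes:
            # Comparing with one representative is enough: the relation is transitive.
            if ratio_is_constant(cls[0], x):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes


classes = lehmann_scheffe_classes()
for cls in classes:
    # Each class should contain sequences with a single common number of successes.
    print(sorted({sum(x) for x in cls}), cls)
```

The classes found this way coincide with the level sets of the total number of successes, which matches the exponential-family statement with $t_1 = \sum_i x_i$.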