Topic 9: Sampling Distributions of Estimators

Course 003, 2016

Sampling distributions of estimators

Since our estimators are statistics (particular functions of random variables), their distribution can be derived from the joint distribution of $X_1, \dots, X_n$. It is called the sampling distribution because it is based on the joint distribution of the random sample.

Given a sampling distribution, we can

- calculate the probability that an estimator will not differ from the parameter $\theta$ by more than a specified amount
- obtain interval estimates rather than point estimates after we have a sample. An interval estimate is a random interval such that the true parameter lies within this interval with a given probability (say 95%).
- choose between two estimators. We can, for instance, calculate the mean-squared error of the estimator, $E_\theta[(\hat\theta - \theta)^2]$, using the distribution of $\hat\theta$.

Sampling distributions of estimators depend on sample size, and we want to know exactly how the distribution changes as we change this size so that we can make the right trade-offs between cost and accuracy.

Sampling distributions: sample size and precision

Examples:

1. What if $X_i \sim N(\theta, 4)$, and we want $E(\bar X_n - \theta)^2 \le 0.1$? This is simply the variance of $\bar X_n$, and we know $\bar X_n \sim N(\theta, 4/n)$. So $4/n \le 0.1$ if $n \ge 40$.

2. Consider a random sample of size $n$ from a Uniform distribution on $[0, \theta]$, and the statistic $U = \max\{X_1, \dots, X_n\}$. The CDF of $U$ is given by:

$$F(u) = \begin{cases} 0 & \text{if } u \le 0 \\ (u/\theta)^n & \text{if } 0 < u < \theta \\ 1 & \text{if } u \ge \theta \end{cases}$$

We can now use this to see how large our sample must be if we want a certain level of precision in our estimate for $\theta$. Suppose we want the probability that our estimate lies within $0.1\theta$ of $\theta$, for any level of $\theta$, to be bigger than 0.95:

$$\Pr(|U - \theta| \le 0.1\theta) = \Pr(\theta - U \le 0.1\theta) = \Pr(U \ge 0.9\theta) = 1 - F(0.9\theta) = 1 - 0.9^n$$

We want this to be bigger than 0.95, or $0.9^n \le 0.05$. With the LHS decreasing in $n$, we choose $n \ge \log(0.05)/\log(0.9) = 28.43$. Our minimum sample size is therefore 29.
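The two sample-size calculations above can be checked numerically. The following is a minimal sketch; the function names `n_for_normal_mean` and `n_for_uniform_max` are hypothetical and introduced here only for illustration.

```python
import math
from fractions import Fraction

def n_for_normal_mean(variance, max_mse):
    """Smallest n with variance / n <= max_mse.

    Arguments are strings so that Fraction does the division exactly
    and no floating-point rounding affects the ceiling.
    """
    return math.ceil(Fraction(variance) / Fraction(max_mse))

def n_for_uniform_max(rel_error, prob):
    """Smallest n with 1 - (1 - rel_error)**n >= prob (max of a uniform sample)."""
    return math.ceil(math.log(1 - prob) / math.log(1 - rel_error))

# Example 1: X_i ~ N(theta, 4), target E(X-bar - theta)^2 <= 0.1
n_normal = n_for_normal_mean("4", "0.1")

# Example 2: U = max(X_1..X_n) within 0.1*theta of theta with probability 0.95
n_uniform = n_for_uniform_max(0.1, 0.95)

print(n_normal, n_uniform)
```

Halving the target mean-squared error in Example 1 doubles the required sample size, while in Example 2 the required size grows only logarithmically as the target probability approaches 1.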

Joint distribution of sample mean and sample variance

For a random sample from a normal distribution, we know that the M.L.E.s are the sample mean $\bar X_n$ and the sample variance $\frac{1}{n}\sum_{i=1}^n (X_i - \bar X_n)^2$. Also,

- $\bar X_n \sim N(\mu, \sigma^2/n)$
- $\sum_{i=1}^n \left(\frac{X_i - \mu}{\sigma}\right)^2 \sim \chi^2_n$ (since it is the sum of squares of $n$ independent standard normal random variables).
- If we replace the population mean $\mu$ with the sample mean $\bar X_n$, the resulting sum of squares, $\sum_{i=1}^n \left(\frac{X_i - \bar X_n}{\sigma}\right)^2$, has a $\chi^2_{n-1}$ distribution.

Theorem: If $X_1, \dots, X_n$ form a random sample from a normal distribution with mean $\mu$ and variance $\sigma^2$, then the sample mean $\bar X_n$ and the sample variance $\frac{1}{n}\sum_{i=1}^n (X_i - \bar X_n)^2$ are independent random variables and

$$\bar X_n \sim N(\mu, \sigma^2/n), \qquad \frac{\sum_{i=1}^n (X_i - \bar X_n)^2}{\sigma^2} \sim \chi^2_{n-1}$$

Note: This is only for normal samples.

Work through the application of this theorem on p. 475 of your textbook, where you are asked to compute the probability that the sample mean and sample standard deviation of a sample drawn from a $N(\mu, \sigma^2)$ are within $0.2\sigma$ of their population values.
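A small simulation can make the theorem plausible. This is only a sketch: the parameter values, seed, and helper names are arbitrary choices made for illustration, and a near-zero correlation is merely consistent with independence, it does not prove it.

```python
import random
import statistics

def simulate(n, mu, sigma, reps, seed=0):
    """Draw `reps` normal samples of size n; return the sample means and
    the scaled sums of squares sum((x - xbar)^2) / sigma^2."""
    rng = random.Random(seed)
    means, scaled_ss = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(xs) / n
        means.append(xbar)
        scaled_ss.append(sum((x - xbar) ** 2 for x in xs) / sigma ** 2)
    return means, scaled_ss

def sample_correlation(a, b):
    """Pearson correlation of two equally long lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

means, scaled_ss = simulate(n=5, mu=1.0, sigma=2.0, reps=20000)
avg_ss = statistics.fmean(scaled_ss)         # chi-square(n-1) has mean n - 1 = 4
var_ss = statistics.pvariance(scaled_ss)     # and variance 2(n - 1) = 8
corr = sample_correlation(means, scaled_ss)  # should be close to 0
print(avg_ss, var_ss, corr)
```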

The t-distribution

Let $Z \sim N(0, 1)$, let $Y \sim \chi^2_v$, and let $Z$ and $Y$ be independent random variables. Then

$$X = \frac{Z}{\sqrt{Y/v}} \sim t_v$$

The p.d.f. of the t-distribution is given by:

$$f(x; v) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\Gamma\left(\frac{v}{2}\right)\sqrt{\pi v}} \left(1 + \frac{x^2}{v}\right)^{-\left(\frac{v+1}{2}\right)}$$

Features of the t-distribution:

- One can see from the above density function that the t-density is symmetric with a maximum value at $x = 0$.
- The shape of the density is similar to that of the standard normal (bell-shaped) but with fatter tails.
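The density formula above can be evaluated directly to see the symmetry and the fatter tails. A minimal sketch using only the standard library; the helper name `t_pdf` and the choice of 5 degrees of freedom are illustrative assumptions.

```python
import math
from statistics import NormalDist

def t_pdf(x, v):
    """Density of the t-distribution with v degrees of freedom.

    Computed on the log scale (lgamma) so that large v does not overflow.
    """
    log_const = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                 - 0.5 * math.log(math.pi * v))
    return math.exp(log_const - (v + 1) / 2 * math.log1p(x * x / v))

std_normal = NormalDist()

# Symmetry around 0 and a maximum at x = 0
print(t_pdf(-1.5, 5), t_pdf(1.5, 5), t_pdf(0.0, 5))

# Fatter tails: far from 0 the t-density exceeds the standard normal density
print(t_pdf(3.0, 5), std_normal.pdf(3.0))
```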

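Relation to random normal samples

RESULT 1: Define $S_n^2 = \sum_{i=1}^n (X_i - \bar X_n)^2$. The random variable

$$U = \frac{\sqrt{n}(\bar X_n - \mu)}{\sqrt{S_n^2/(n-1)}} \sim t_{n-1}$$

Proof: We know that $\frac{\sqrt{n}(\bar X_n - \mu)}{\sigma} \sim N(0, 1)$ and that $\frac{S_n^2}{\sigma^2} \sim \chi^2_{n-1}$. Dividing the first random variable by the square root of the second, divided by its degrees of freedom, the $\sigma$ in the numerator and denominator cancels to obtain $U$.

Implication: We cannot make statements about $|\bar X_n - \mu|$ using the normal distribution if $\sigma^2$ is unknown. This result allows us to use its estimate $\hat\sigma^2 = \sum_{i=1}^n (X_i - \bar X_n)^2 / n$ since

$$\frac{\bar X_n - \mu}{\hat\sigma/\sqrt{n-1}} \sim t_{n-1}$$

RESULT 2: Given $X$, $Z$, $Y$, $v$ as above. As $v \to \infty$, $X \xrightarrow{d} Z \sim N(0, 1)$.

To see why: $U$ can be written as

$$\sqrt{\frac{n-1}{n}} \cdot \frac{\sqrt{n}(\bar X_n - \mu)}{\hat\sigma} \sim t_{n-1}$$

As $n$ gets large, $\hat\sigma$ gets very close to $\sigma$ and $\sqrt{\frac{n-1}{n}}$ is close to 1.

$F^{-1}(.55) = .129$ for $t_{10}$, $.127$ for $t_{20}$ and $.126$ for the standard normal distribution. The differences between these values increase for higher values of their distribution functions (why?)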
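The quantiles quoted above can be reproduced without a statistics package. The sketch below is one possible approach, not the method used in the course: it integrates the t-density with Simpson's rule and inverts the CDF by bisection. All helper names (`t_pdf`, `t_cdf`, `t_quantile`) are hypothetical.

```python
import math
from statistics import NormalDist

def t_pdf(x, v):
    """Density of the t-distribution with v degrees of freedom."""
    log_const = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
                 - 0.5 * math.log(math.pi * v))
    return math.exp(log_const - (v + 1) / 2 * math.log1p(x * x / v))

def t_cdf(x, v, steps=2000):
    """CDF via Simpson's rule on [0, |x|] plus symmetry around 0 (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, v) + t_pdf(a, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, v, lo=-50.0, hi=50.0):
    """Invert the CDF by bisection; assumes the quantile lies in [lo, hi]."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, v) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = NormalDist()
for p in (0.55, 0.975):
    print(p, t_quantile(p, 10), t_quantile(p, 20), z.inv_cdf(p))
```

Comparing the two rows of output shows the gap between the t and normal quantiles widening as the probability moves further into the tail, which is where the fatter tails of the t-distribution matter most.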

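Confidence intervals for the mean

Given $\sigma^2$, let us see how we can obtain an interval estimate for $\mu$, i.e. an interval which is likely to contain $\mu$ with a pre-specified probability.

Since $\frac{\bar X_n - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$,

$$\Pr\left(-2 < \frac{\bar X_n - \mu}{\sigma/\sqrt{n}} < 2\right) = .955$$

But this event is equivalent to the events $-\frac{2\sigma}{\sqrt{n}} < \bar X_n - \mu < \frac{2\sigma}{\sqrt{n}}$ and $\bar X_n - \frac{2\sigma}{\sqrt{n}} < \mu < \bar X_n + \frac{2\sigma}{\sqrt{n}}$.

With known $\sigma$, each of the random variables $\bar X_n - \frac{2\sigma}{\sqrt{n}}$ and $\bar X_n + \frac{2\sigma}{\sqrt{n}}$ is a statistic. Therefore, we have derived a random interval within which the population parameter lies with probability .955, i.e.

$$\Pr\left(\bar X_n - \frac{2\sigma}{\sqrt{n}} < \mu < \bar X_n + \frac{2\sigma}{\sqrt{n}}\right) = .955 = \gamma$$

Notice that there are many intervals for the same $\gamma$; this is the shortest one.

Now, given our sample, our statistics take particular values and the resulting interval either contains or does not contain $\mu$. We can therefore no longer talk about the probability that it contains $\mu$ because the experiment has already been performed.

We say that $\left(\bar x_n - \frac{2\sigma}{\sqrt{n}} < \mu < \bar x_n + \frac{2\sigma}{\sqrt{n}}\right)$ is a 95.5% confidence interval for $\mu$. Alternatively, we may say that $\mu$ lies in the above interval with confidence $\gamma$ or that the above interval is a confidence interval for $\mu$ with confidence coefficient $\gamma$.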
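The statement that the random interval covers $\mu$ with probability .955 is a statement about repeated sampling, and a simulation illustrates it. This is a sketch under assumed values: the parameters, the seed, and the function name `coverage` are arbitrary and not from the course.

```python
import random
from statistics import NormalDist

def coverage(mu, sigma, n, reps, k=2.0, seed=1):
    """Share of simulated samples whose interval xbar +/- k*sigma/sqrt(n) contains mu."""
    rng = random.Random(seed)
    half_width = k * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps

# Exact probability that a standard normal lies in (-2, 2)
gamma = NormalDist().cdf(2) - NormalDist().cdf(-2)

# Simulated coverage rate for hypothetical values mu = 3, sigma = 1.5, n = 16
rate = coverage(mu=3.0, sigma=1.5, n=16, reps=20000)
print(gamma, rate)
```

Each simulated interval either contains $\mu$ or it does not; the probability .955 describes the long-run share of intervals that do.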

Confidence intervals for means: examples

Example 1: $X_1, \dots, X_n$ forms a random sample from a normal distribution with unknown $\mu$ and $\sigma^2 = 10$. $\bar x_n$ is found to be 7.164 with $n = 40$. An 80% confidence interval for the mean $\mu$ is given by

$$\left(7.164 - 1.282\sqrt{\tfrac{10}{40}},\; 7.164 + 1.282\sqrt{\tfrac{10}{40}}\right) \text{ or } (6.523, 7.805).$$

The confidence coefficient is .8.

Example 2: Let $\bar X$ denote the sample mean of a random sample of size 25 from a distribution with variance 100 and mean $\mu$. In this case, $\frac{\sigma}{\sqrt{n}} = 2$ and, making use of the central limit theorem, the following statement is approximately true:

$$\Pr\left(-1.96 < \frac{\bar X - \mu}{2} < 1.96\right) = .95 \quad \text{or} \quad \Pr\left(\bar X - 3.92 < \mu < \bar X + 3.92\right) = .95$$

If the sample mean is given by $\bar x = 67.53$, an approximate 95% confidence interval for the population mean $\mu$ is given by (63.61, 71.45).

Example 3: Suppose we are interested in a confidence interval for the mean of a normal distribution but do not know $\sigma^2$. We know that $\frac{\bar X_n - \mu}{\hat\sigma/\sqrt{n-1}} \sim t_{n-1}$ and can use the t-distribution with $(n - 1)$ degrees of freedom to construct our interval estimate. With $n = 10$, $\bar x_n = 3.22$, $\hat\sigma = 1.17$, a 95% confidence interval is given by

$$\left(3.22 - (2.262)(1.17)/\sqrt{9},\; 3.22 + (2.262)(1.17)/\sqrt{9}\right) = (2.34, 4.10)$$

(display invt(9,.975) gives you 2.262)
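The three intervals above can be recomputed in a few lines. This is a minimal sketch; the function names `z_interval` and `t_interval` are hypothetical, and the t critical value 2.262 is taken from the text rather than computed, since the Python standard library has no t-quantile function.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, level):
    """Two-sided interval for the mean with known sigma (normal critical value)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sigma / n ** 0.5
    return xbar - half, xbar + half

def t_interval(xbar, sigma_hat, n, t_crit):
    """Interval for the mean with unknown sigma.

    sigma_hat is the M.L.E. (sum of squares divided by n), which is why
    the standard error uses sqrt(n - 1). t_crit must be supplied by the caller.
    """
    half = t_crit * sigma_hat / (n - 1) ** 0.5
    return xbar - half, xbar + half

ex1 = z_interval(7.164, 10 ** 0.5, 40, 0.80)   # Example 1: sigma^2 = 10, n = 40
ex2 = z_interval(67.53, 10.0, 25, 0.95)        # Example 2: sigma^2 = 100, n = 25
ex3 = t_interval(3.22, 1.17, 10, 2.262)        # Example 3: t critical value from the text

for lo, hi in (ex1, ex2, ex3):
    print(round(lo, 3), round(hi, 3))
```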

Confidence intervals for differences in means

Let $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$ denote independent random samples from two distributions, $N(\mu_1, \sigma^2)$ and $N(\mu_2, \sigma^2)$, with sample means denoted by $\bar X_n$, $\bar Y_m$ and sample variances by $\hat\sigma_1^2$ and $\hat\sigma_2^2$. We've established that:

- $\bar X_n$ and $\bar Y_m$ are normally and independently distributed with means $\mu_1$ and $\mu_2$ and variances $\frac{\sigma^2}{n}$ and $\frac{\sigma^2}{m}$.
- Using our results on the distribution of linear combinations of normally distributed variables, we know that $\bar X_n - \bar Y_m$ is normally distributed with mean $\mu_1 - \mu_2$ and variance $\frac{\sigma^2}{n} + \frac{\sigma^2}{m}$. The random variable

$$\frac{(\bar X_n - \bar Y_m) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma^2}{n} + \frac{\sigma^2}{m}}}$$

  has a standard normal distribution and will form the numerator of the T random variable that we are going to use.
- We also know that $\frac{n\hat\sigma_1^2}{\sigma^2}$ and $\frac{m\hat\sigma_2^2}{\sigma^2}$ have $\chi^2$ distributions with $(n - 1)$ and $(m - 1)$ degrees of freedom respectively, so their sum $(n\hat\sigma_1^2 + m\hat\sigma_2^2)/\sigma^2$ has a $\chi^2$ distribution with $(n + m - 2)$ degrees of freedom and the random variable

$$\sqrt{\frac{n\hat\sigma_1^2 + m\hat\sigma_2^2}{\sigma^2 (n + m - 2)}}$$

  can appear as the denominator of a random variable which has a t-distribution with $(n + m - 2)$ degrees of freedom.

Confidence intervals for differences in means (contd.)

We have therefore established that

$$X = \frac{(\bar X_n - \bar Y_m) - (\mu_1 - \mu_2)}{\sqrt{\left(\frac{n\hat\sigma_1^2 + m\hat\sigma_2^2}{n + m - 2}\right)\left(\frac{1}{n} + \frac{1}{m}\right)}}$$

has a t-distribution with $(n + m - 2)$ degrees of freedom. To simplify notation, denote the denominator of the above expression by $R$.

Given our samples, $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$, we can now construct confidence intervals for differences in the means of the corresponding populations, $\mu_1 - \mu_2$. We do this in the usual way:

- Suppose we want a 95% confidence interval for the difference in the means. We find a number $b$ such that, using the t-distribution with $(n + m - 2)$ degrees of freedom, $\Pr(-b < X < b) = .95$.
- The random interval $\left((\bar X_n - \bar Y_m) - bR,\; (\bar X_n - \bar Y_m) + bR\right)$ will now contain the true difference in means with 95% probability.
- A confidence interval is now based on sample values, $(\bar x_n - \bar y_m)$ and corresponding sample variances.

Based on the CLT, we can use the same procedure as an approximation in large samples even when our samples are not normal.
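The construction of $R$ and the interval can be sketched as follows. The data are made up for illustration, the function names are hypothetical, and the critical value $b = 2.365$ (the .975 quantile of the t-distribution with 7 degrees of freedom, from standard tables) is supplied by hand because the Python standard library has no t-quantile function.

```python
import math

def mle_variance(xs):
    """Sample variance in the M.L.E. form used in the text (divide by n)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def diff_in_means_interval(xs, ys, b):
    """Interval (xbar - ybar) -/+ b*R under the equal-variance normal model.

    b is the t critical value for n + m - 2 degrees of freedom, supplied by the caller.
    Returns (lower, upper, R).
    """
    n, m = len(xs), len(ys)
    pooled = (n * mle_variance(xs) + m * mle_variance(ys)) / (n + m - 2)
    r = math.sqrt(pooled * (1 / n + 1 / m))
    diff = sum(xs) / n - sum(ys) / m
    return diff - b * r, diff + b * r, r

# Hypothetical samples: n = 5, m = 4, so n + m - 2 = 7 degrees of freedom
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0]
lower, upper, r = diff_in_means_interval(xs, ys, b=2.365)
print(lower, upper, r)
```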