Lecture 7: Properties of Random Samples

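1 Continued From Last Class

Theorem 1.1. Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2 < \infty$. Then
(a) $E\bar{X} = \mu$,
(b) $\mathrm{Var}\,\bar{X} = \sigma^2/n$,
(c) $ES^2 = \sigma^2$.

Proof. Part (a) of the theorem can be proved directly:
\[ E\bar{X} = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n} E\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\, n\, EX_1 = \mu. \quad (1) \]
A similar proof can be given for part (b), using the independence of the $X_i$:
\[ \mathrm{Var}\,\bar{X} = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\, n\, \mathrm{Var}\,X_1 = \frac{\sigma^2}{n}. \quad (2) \]
From the definition of the sample variance and the identity
\[ (n-1)S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2, \quad (3) \]
part (c) can be proved as follows:
\[ ES^2 = E\left(\frac{1}{n-1}\left[\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right]\right) = \frac{1}{n-1}\left(n\,EX_1^2 - n\,E\bar{X}^2\right) = \frac{1}{n-1}\left(n(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right)\right) = \sigma^2. \quad (4) \]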
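Theorem 1.1 can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the lecture: it assumes an exponential population (chosen only to show that normality is not needed), and the function name, sample size, trial count, and seed are all illustrative choices.

```python
import random
import statistics


def simulate_sample_moments(n, trials, rate, seed=0):
    """Estimate E[X-bar], Var[X-bar] and E[S^2] for samples of size n.

    The population is Exponential(rate): mean 1/rate, variance 1/rate**2.
    This population is an assumption made for the illustration only.
    """
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(trials):
        sample = [rng.expovariate(rate) for _ in range(n)]
        means.append(statistics.fmean(sample))
        variances.append(statistics.variance(sample))  # divides by n - 1
    return (
        statistics.fmean(means),       # estimate of E[X-bar]
        statistics.pvariance(means),   # estimate of Var[X-bar]
        statistics.fmean(variances),   # estimate of E[S^2]
    )


n, trials, rate = 10, 20000, 0.5
mu, sigma2 = 1 / rate, 1 / rate ** 2  # population mean and variance

est_mean, est_var_mean, est_mean_s2 = simulate_sample_moments(n, trials, rate)
print("E[X-bar]  ", est_mean, "theory:", mu)
print("Var[X-bar]", est_var_mean, "theory:", sigma2 / n)
print("E[S^2]    ", est_mean_s2, "theory:", sigma2)
```

The three estimates should land close to $\mu$, $\sigma^2/n$ and $\sigma^2$ respectively, up to simulation noise.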

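Theorem 1.2. Let $X_1, X_2, \ldots, X_n$ be a random sample from a pmf or pdf $f(x|\theta)$, where
\[ f(x|\theta) = h(x)\,c(\theta)\exp\left(\sum_{i=1}^{k} w_i(\theta)\,t_i(x)\right) \]
is a member of an exponential family. Define statistics $T_1, T_2, \ldots, T_k$ as
\[ T_i(X_1, X_2, \ldots, X_n) = \sum_{j=1}^{n} t_i(X_j), \quad i = 1, \ldots, k. \]
If the set $\{(w_1(\theta), w_2(\theta), \ldots, w_k(\theta)) : \theta \in \Theta\}$ contains an open subset of $\mathbb{R}^k$, then the distribution of $(T_1, \ldots, T_k)$ is an exponential family of the form
\[ f_T(u_1, \ldots, u_k|\theta) = H(u_1, \ldots, u_k)\,[c(\theta)]^n \exp\left(\sum_{i=1}^{k} w_i(\theta)\,u_i\right). \]

Example 1.3 (Sum of Bernoulli Random Variables). Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a Bernoulli distribution $\mathrm{Ber}(p)$. Each observation has pmf
\[ P(X_1 = x_1 | p) = p^{x_1}(1-p)^{1-x_1} = (1-p)\exp\left[\log\left(\frac{p}{1-p}\right) x_1\right]. \quad (5) \]
Comparing with the exponential family equation above, we get $h(x_1) = 1$, $c(p) = 1-p$, $w_1(p) = \log\left(\frac{p}{1-p}\right)$ and $t_1(x_1) = x_1$. Hence $T_1 = \sum_{j=1}^{n} X_j$, whose Binomial$(n, p)$ pmf $\binom{n}{u}(1-p)^n \exp\left[\log\left(\frac{p}{1-p}\right) u\right]$ has the form given in Theorem 1.2 with $H(u) = \binom{n}{u}$.

2 Sampling from Normal distribution

Theorem 2.1. Let $X_1, \ldots, X_n$ be a random sample from a Normal distribution $N(\mu, \sigma^2)$, and let $\bar{X}$ and $S^2$ be the sample mean and sample variance respectively. Then,
(a) $\bar{X}$ and $S^2$ are independent random variables.
(b) $\bar{X} \sim N(\mu, \sigma^2/n)$.
(c) $(n-1)S^2/\sigma^2$ has a chi-squared distribution with $(n-1)$ degrees of freedom.

Proof. (a) Without any loss of generality, we can assume that $\mu = 0$ and $\sigma = 1$. It can be shown that if $X_1$ and $X_2$ are two independent random variables, then $U_1 = g_1(X_1)$ and $U_2 = g_2(X_2)$ are also independent random variables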

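where $g_1$ and $g_2$ are functions of $X_1$ and $X_2$ respectively. Thus we aim to show that $\bar{X}$ and $S^2$ are functions of independent random vectors. We can write $S^2$ as a function of $(n-1)$ deviations as follows:
\[ S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n-1}\left((X_1 - \bar{X})^2 + \sum_{i=2}^{n} (X_i - \bar{X})^2\right) = \frac{1}{n-1}\left(\left[\sum_{i=2}^{n} (X_i - \bar{X})\right]^2 + \sum_{i=2}^{n} (X_i - \bar{X})^2\right). \quad (6) \]
The last statement follows from the fact that $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$. Hence, $S^2$ can be written as a function of only the $(n-1)$ deviations $(X_2 - \bar{X}, X_3 - \bar{X}, \ldots, X_n - \bar{X})$. We can show that these random variables are independent of $\bar{X}$ and hence prove statement (a).

The joint pdf of the sample $X_1, X_2, \ldots, X_n$ is given by
\[ f(x_1, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}} \exp\left[-\frac{1}{2}\sum_{i=1}^{n} x_i^2\right], \quad -\infty < x_i < \infty, \; i \in [n]. \quad (7) \]
We make the following transformation,
\[ y_1 = \bar{x}, \quad y_2 = x_2 - \bar{x}, \quad \ldots, \quad y_n = x_n - \bar{x}. \quad (8) \]
This linear transformation has a Jacobian of $n$, and inverting it gives $x_1 = y_1 - \sum_{i=2}^{n} y_i$ and $x_i = y_i + y_1$ for $i \geq 2$. The joint pdf of $Y_1, \ldots, Y_n$ is therefore
\[ f(y_1, \ldots, y_n) = \frac{n}{(2\pi)^{n/2}} \exp\left[-\frac{1}{2}\left(y_1 - \sum_{i=2}^{n} y_i\right)^2\right] \exp\left[-\frac{1}{2}\sum_{i=2}^{n} (y_i + y_1)^2\right], \quad -\infty < y_i < \infty, \]
\[ = \left[\left(\frac{n}{2\pi}\right)^{1/2} \exp\left(-\frac{n y_1^2}{2}\right)\right] \left[\frac{n^{1/2}}{(2\pi)^{(n-1)/2}} \exp\left(-\frac{1}{2}\left[\sum_{i=2}^{n} y_i^2 + \left(\sum_{i=2}^{n} y_i\right)^2\right]\right)\right]. \quad (9) \]
Hence, the joint pdf factors into a function of $y_1$ alone and a function of $(y_2, \ldots, y_n)$ alone, and thus $Y_1$ is independent of $(Y_2, \ldots, Y_n)$. Since $\bar{X} = Y_1$ and, by (6), $S^2$ is a function of $(Y_2, \ldots, Y_n)$ only, $\bar{X}$ and $S^2$ are independent.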
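Statement (a) can be illustrated by simulation. The sketch below is not part of the lecture and uses illustrative names, sample sizes, and seeds. It estimates the correlation between $\bar{X}$ and $S^2$ over many samples, once for a normal population and once for an exponential one. Zero correlation is only a necessary consequence of independence, so this is a sanity check rather than a proof, but the contrast shows that the result is specific to the normal case.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mean_variance_correlation(draw, n, trials, seed):
    """Correlation between the sample mean and sample variance over many samples.

    `draw` is a hypothetical helper argument: a function taking a
    random.Random instance and returning one observation.
    """
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
        variances.append(statistics.variance(sample))
    return pearson(means, variances)


corr_normal = mean_variance_correlation(
    lambda rng: rng.gauss(0.0, 1.0), n=5, trials=20000, seed=1)
corr_expo = mean_variance_correlation(
    lambda rng: rng.expovariate(1.0), n=5, trials=20000, seed=1)

print("normal population:     ", corr_normal)  # should be near zero
print("exponential population:", corr_expo)    # should be clearly positive
```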

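(b) Consider a random sample $X_1, \ldots, X_n$ obtained from $N(\mu, \sigma^2)$. The moment generating function (mgf) of $X_i$, $i \in [n]$, is
\[ M_{X_i}(t) = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right). \quad (10) \]
Hence, for the variable $X_i/n$, the mgf is given by
\[ M_{X_i/n}(t) = \exp\left(\frac{\mu t}{n} + \frac{\sigma^2 t^2}{2n^2}\right). \quad (11) \]
Now, for the sample mean $\bar{X} = (X_1 + X_2 + \cdots + X_n)/n$, independence of the $X_i$ gives the mgf
\[ M_{\bar{X}}(t) = \left[\exp\left(\frac{\mu t}{n} + \frac{\sigma^2 t^2}{2n^2}\right)\right]^n = \exp\left(n\left(\frac{\mu t}{n} + \frac{\sigma^2 t^2}{2n^2}\right)\right) = \exp\left(\mu t + \frac{(\sigma^2/n)\,t^2}{2}\right). \quad (12) \]
Because the mgf of a distribution is unique to that distribution, this mgf is from a Normal distribution with mean $\mu$ and variance $\sigma^2/n$. Hence, $\bar{X} \sim N(\mu, \sigma^2/n)$.

The chi-squared pdf is a special case of the gamma pdf and is given as
\[ f(x) = \frac{1}{\Gamma(p/2)\,2^{p/2}}\, x^{(p/2)-1} e^{-x/2}, \quad 0 < x < \infty. \quad (13) \]
Some properties of the chi-squared distribution with $p$ degrees of freedom are summarized in the following lemma.

Lemma 2.2. Let $\chi^2_p$ denote a chi-squared random variable with $p$ degrees of freedom. Then,
(a) If $Z \sim N(0, 1)$, then $Z^2 \sim \chi^2_1$, i.e., the square of a standard normal random variable is a chi-squared random variable.
(b) If $X_1, X_2, \ldots, X_n$ are independent and $X_i \sim \chi^2_{p_i}$, then $X_1 + \cdots + X_n \sim \chi^2_{p_1 + \cdots + p_n}$. Thus, independent chi-squared variables add to a chi-squared variable, and their degrees of freedom also add up.

(c) To prove part (c), first we prove the recursive relations for sample mean and variance. We know that the sample mean is $\bar{X}_{n+1} = \frac{1}{n+1}\sum_{k=1}^{n+1} X_k$. We obtain the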

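recursive relations for sample mean as follows,
\[ \bar{X}_{n+1} = \frac{1}{n+1}\sum_{k=1}^{n+1} X_k = \frac{1}{n+1}\left[X_{n+1} + \sum_{k=1}^{n} X_k\right] = \frac{1}{n+1}\left[X_{n+1} + n\bar{X}_n\right]. \]
Hence the recursive relation for sample mean can be stated as
\[ \bar{X}_{n+1} = \frac{1}{n+1}\left[X_{n+1} + n\bar{X}_n\right]. \quad (14) \]
Now we will proceed to derive the recursive relationship for sample variance. For $n+1$ random samples, the sample variance can be stated as
\[ nS^2_{n+1} = \sum_{k=1}^{n+1} \left[X_k - \bar{X}_{n+1}\right]^2. \quad (15) \]
Using (14), we have,
\[ nS^2_{n+1} = \sum_{k=1}^{n+1} \left[X_k - \frac{1}{n+1}\left[X_{n+1} + n\bar{X}_n\right]\right]^2 \]
\[ = \sum_{k=1}^{n+1} \left[X_k - \frac{1}{n+1}\left[X_{n+1} + (n+1-1)\bar{X}_n\right]\right]^2 \]
\[ = \sum_{k=1}^{n+1} \left[X_k - \bar{X}_n - \frac{1}{n+1}\left[X_{n+1} - \bar{X}_n\right]\right]^2 \]
\[ = \sum_{k=1}^{n+1} \left[(X_k - \bar{X}_n)^2 + \frac{1}{(n+1)^2}\left[X_{n+1} - \bar{X}_n\right]^2 - \frac{2}{n+1}\left[X_{n+1} - \bar{X}_n\right]\left[X_k - \bar{X}_n\right]\right]. \quad (16) \]
Since $\sum_{k=1}^{n} (X_k - \bar{X}_n) = 0$, we have,
\[ nS^2_{n+1} = \sum_{k=1}^{n+1} (X_k - \bar{X}_n)^2 + \frac{1}{n+1}\left[X_{n+1} - \bar{X}_n\right]^2 - \frac{2}{n+1}\left[X_{n+1} - \bar{X}_n\right]^2 \]
\[ = \sum_{k=1}^{n} (X_k - \bar{X}_n)^2 + \left[1 - \frac{1}{n+1}\right]\left[X_{n+1} - \bar{X}_n\right]^2 \]
\[ = \sum_{k=1}^{n} (X_k - \bar{X}_n)^2 + \frac{n}{n+1}\left[X_{n+1} - \bar{X}_n\right]^2. \quad (17) \]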
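Recursions (14) and (17) translate directly into a one-pass update of the sample mean and the sum of squared deviations. The following minimal sketch (the function name and test data are illustrative, not from the lecture) applies them observation by observation and compares the result with the standard library's two-pass computation.

```python
import statistics


def running_mean_and_variance(values):
    """One-pass sample mean and variance based on recursions (14) and (17).

    `mean` tracks X-bar_n and `sum_sq` tracks sum_k (X_k - X-bar_n)^2,
    which equals (n - 1) * S_n^2.
    """
    mean = 0.0
    sum_sq = 0.0
    n = 0
    for x in values:
        delta = x - mean                     # X_{n+1} - X-bar_n
        sum_sq += n / (n + 1) * delta ** 2   # update from equation (17)
        mean = (x + n * mean) / (n + 1)      # update from equation (14)
        n += 1
    variance = sum_sq / (n - 1) if n > 1 else float("nan")
    return mean, variance


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, variance = running_mean_and_variance(data)
print(mean, statistics.fmean(data))
print(variance, statistics.variance(data))
```

Note that `sum_sq` must be updated before `mean`, because (17) uses the old mean $\bar{X}_n$.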

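Thus we have,
\[ nS^2_{n+1} = (n-1)S^2_n + \frac{n}{n+1}\left[X_{n+1} - \bar{X}_n\right]^2. \quad (18) \]
Replacing $n$ by $n-1$ in (18), we get a recursive relation for sample variance as
\[ (n-1)S^2_n = (n-2)S^2_{n-1} + \frac{n-1}{n}\left[X_n - \bar{X}_{n-1}\right]^2. \quad (19) \]
If we take $n = 2$ in (19) and define $0 \cdot S^2_1 = 0$, then from (19) we have $S^2_2 = \frac{1}{2}(X_2 - X_1)^2$. Since the distribution of $(X_2 - X_1)/\sqrt{2}$ is Gaussian with parameters $(0, 1)$ (recall that we assumed $\mu = 0$ and $\sigma = 1$), part (a) of Lemma 2.2 shows that $S^2_2 \sim \chi^2_1$.

Proceeding with induction, let us assume that for $n = k$, $(k-1)S^2_k \sim \chi^2_{k-1}$. So for $n = k+1$, we can write from (18),
\[ kS^2_{k+1} = (k-1)S^2_k + \frac{k}{k+1}\left[X_{k+1} - \bar{X}_k\right]^2. \quad (20) \]
By the inductive hypothesis, $(k-1)S^2_k \sim \chi^2_{k-1}$, so if we can establish that $\frac{k}{k+1}\left[X_{k+1} - \bar{X}_k\right]^2 \sim \chi^2_1$ and is independent of $S^2_k$, then from part (b) of Lemma 2.2, $kS^2_{k+1} \sim \chi^2_k$ and the theorem will be proved. The vector $(X_{k+1}, \bar{X}_k)$ is independent of $S^2_k$, and so is any function of this vector. Furthermore, $(X_{k+1} - \bar{X}_k)$ is a normally distributed random variable with mean 0 and variance
\[ \mathrm{Var}(X_{k+1} - \bar{X}_k) = \frac{k+1}{k}, \]
and therefore $\frac{k}{k+1}\left[X_{k+1} - \bar{X}_k\right]^2 \sim \chi^2_1$. This completes our proof of the theorem.

3 Order Statistics

Definition 3.1. The order statistics of a random sample $X_1, X_2, \ldots, X_n$ are the sample values placed in ascending order. They are denoted by $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$. The order statistics are random variables satisfying $X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(n)}$. In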

particular,
\[ X_{(1)} = \min_{1 \leq i \leq n} X_i, \quad X_{(2)} = \text{second smallest } X_i, \quad \ldots, \quad X_{(n)} = \max_{1 \leq i \leq n} X_i. \quad (21) \]

Theorem 3.2. Let $f_X$ be the probability density function associated with the population. Then the joint density of the order statistics can be written as
\[ f_{X_{(1)}, X_{(2)}, \ldots, X_{(n)}}(x_1, x_2, \ldots, x_n) = \begin{cases} n! \prod_{i=1}^{n} f_X(x_i), & \text{if } x_1 < x_2 < \cdots < x_n, \\ 0, & \text{otherwise.} \end{cases} \quad (22) \]

Remark 1. The term $n!$ comes into this formula because, for any set of values $x_1, x_2, \ldots, x_n$, there are $n!$ equally likely assignments of these values to $X_1, X_2, \ldots, X_n$ that all yield the same values of the order statistics.

Definition 3.3. The sample range, $R = X_{(n)} - X_{(1)}$, is the distance between the smallest and the largest observations. It is a measure of the dispersion of the sample and should reflect the dispersion in the population.

Definition 3.4. The sample median, which we will denote by $M$, is a number such that approximately one half of the observations are less than $M$ and one half are greater. In terms of order statistics, $M$ can be defined as
\[ M = \begin{cases} X_{((n+1)/2)}, & \text{if } n \text{ is odd}, \\ \left(X_{(n/2)} + X_{(n/2+1)}\right)/2, & \text{if } n \text{ is even.} \end{cases} \quad (23) \]

Definition 3.5. For any number $p$ between 0 and 1, the $(100p)$th sample percentile is the observation such that approximately $np$ of the observations are less than this observation and $n(1-p)$ are greater than it. As a special case, for $p = 0.5$, we have the 50th sample percentile, which is nothing but the sample median.

Theorem 3.6. Let $X_1, X_2, \ldots, X_n$ be a random sample from a discrete distribution with pmf $f_X(x_i) = p_i$, where $x_1 < x_2 < \cdots$ are the possible values of $X$ in ascending

order. We define
\[ P_0 = 0, \quad P_1 = p_1, \quad P_2 = p_1 + p_2, \quad \ldots, \quad P_i = p_1 + p_2 + \cdots + p_i, \quad \ldots \quad (24) \]
Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ be the order statistics from the sample. Then,
\[ P(X_{(j)} \leq x_i) = \sum_{k=j}^{n} \binom{n}{k} P_i^k (1 - P_i)^{n-k}, \quad (25) \]
and
\[ P(X_{(j)} = x_i) = \sum_{k=j}^{n} \binom{n}{k} \left[P_i^k (1 - P_i)^{n-k} - P_{i-1}^k (1 - P_{i-1})^{n-k}\right]. \quad (26) \]

Proof. First we fix $i$. Let $Y$ be a random variable which counts the number of $X_1, X_2, \ldots, X_n$ that are less than or equal to $x_i$. For each of $X_1, X_2, \ldots, X_n$, we call the event $\{X_j \leq x_i\}$ a success and the event $\{X_j > x_i\}$ a failure. So $Y$ can be regarded as the number of successes in $n$ trials. Since $X_1, X_2, \ldots, X_n$ are identically distributed, the probability of success is the same for each trial, namely $P_i$. We can write $P_i$ as
\[ P_i = P[X_j \leq x_i]. \quad (27) \]
The success or failure of the $j$th trial is independent of the outcome of any other trial, since $X_j$ is independent of the other $X_i$'s. Thus we can write $Y \sim \mathrm{Bin}(n, P_i)$. The event $\{X_{(j)} \leq x_i\}$ is equivalent to the event $\{Y \geq j\}$; that is, at least $j$ of the sample values are less than or equal to $x_i$. Since $Y$ follows a Binomial distribution, we can write
\[ P(Y \geq j) = \sum_{k=j}^{n} \binom{n}{k} P_i^k (1 - P_i)^{n-k}. \quad (28) \]
As $P(Y \geq j) = P(X_{(j)} \leq x_i)$, we can write
\[ P(X_{(j)} \leq x_i) = \sum_{k=j}^{n} \binom{n}{k} P_i^k (1 - P_i)^{n-k}. \quad (29) \]

This completes the proof of (25). For the proof of (26), we note that
\[ P(X_{(j)} = x_i) = P(X_{(j)} \leq x_i) - P(X_{(j)} \leq x_{i-1}). \]
Hence, we can write using (29),
\[ P(X_{(j)} = x_i) = \sum_{k=j}^{n} \binom{n}{k} \left[P_i^k (1 - P_i)^{n-k} - P_{i-1}^k (1 - P_{i-1})^{n-k}\right]. \quad (30) \]
This completes our proof. Here, for the case $i = 1$, $P(X_{(j)} = x_i) = P(X_{(j)} \leq x_i)$. The definition $P_0 = 0$ takes care of this situation.

Theorem 3.7. Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ denote the order statistics of a random sample $X_1, X_2, \ldots, X_n$ from a continuous population with cdf $F_X(x)$ and pdf $f_X(x)$. Then the pdf of $X_{(j)}$ is
\[ f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!}\, f_X(x)\, [F_X(x)]^{j-1}\, [1 - F_X(x)]^{n-j}. \quad (31) \]

Proof. We will first find the cdf of $X_{(j)}$ and then differentiate it to get the pdf. As in Theorem 3.6, let $Y$ be a random variable which counts the number of $X_1, X_2, \ldots, X_n$ that are less than or equal to $x$. If we consider the event $\{X_j \leq x\}$ as a success, then following the approach in the proof of Theorem 3.6, we can write $Y \sim \mathrm{Bin}(n, F_X(x))$. Note that although $X_1, X_2, \ldots, X_n$ are continuous random variables, $Y$ is discrete. Hence, we have
\[ P(Y \geq j) = \sum_{k=j}^{n} \binom{n}{k} [F_X(x)]^k [1 - F_X(x)]^{n-k}. \quad (32) \]
Since $P(Y \geq j) = P(X_{(j)} \leq x) = F_{X_{(j)}}(x)$, we will differentiate (32) to obtain the pdf of $X_{(j)}$. Thus,
\[ f_{X_{(j)}}(x) = \frac{d}{dx} F_{X_{(j)}}(x). \]
After differentiating the above expression, it can be written as
\[ \sum_{k=j}^{n} \binom{n}{k} \left[k\,[F_X(x)]^{k-1} [1 - F_X(x)]^{n-k} f_X(x) - (n-k)\,[F_X(x)]^{k} [1 - F_X(x)]^{n-k-1} f_X(x)\right] \]
\[ = \binom{n}{j}\, j\, [F_X(x)]^{j-1} [1 - F_X(x)]^{n-j} f_X(x) + \sum_{k=j+1}^{n} \binom{n}{k}\, k\, [F_X(x)]^{k-1} [1 - F_X(x)]^{n-k} f_X(x) - \sum_{k=j}^{n-1} \binom{n}{k} (n-k)\, [F_X(x)]^{k} [1 - F_X(x)]^{n-k-1} f_X(x), \]

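\[ = \frac{n!}{(j-1)!\,(n-j)!}\, f_X(x)\, [F_X(x)]^{j-1} [1 - F_X(x)]^{n-j} + \sum_{p=j}^{n-1} \binom{n}{p+1} (p+1)\, [F_X(x)]^{p} [1 - F_X(x)]^{n-p-1} f_X(x) - \sum_{k=j}^{n-1} \binom{n}{k} (n-k)\, [F_X(x)]^{k} [1 - F_X(x)]^{n-k-1} f_X(x). \]
The 1st equality was obtained from the fact that the second term under the summation is zero when $k = n$, and the 2nd equality follows when we make the transformation $p = k - 1$. Thus,
\[ f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!}\, f_X(x)\, [F_X(x)]^{j-1} [1 - F_X(x)]^{n-j} + \sum_{p=j}^{n-1} \binom{n}{p+1} (p+1)\, [F_X(x)]^{p} [1 - F_X(x)]^{n-p-1} f_X(x) - \sum_{k=j}^{n-1} \binom{n}{k} (n-k)\, [F_X(x)]^{k} [1 - F_X(x)]^{n-k-1} f_X(x). \quad (33) \]
Now we utilize the following results,
\[ \binom{n}{p+1}(p+1) = \frac{n!}{(n-p-1)!\,p!}, \quad \text{and} \quad \binom{n}{k}(n-k) = \frac{n!}{(n-k-1)!\,k!}. \]
Using these results, the two sums in (33) are term-by-term equal and cancel, so we can write (33) as
\[ f_{X_{(j)}}(x) = \frac{n!}{(j-1)!\,(n-j)!}\, f_X(x)\, [F_X(x)]^{j-1} [1 - F_X(x)]^{n-j}. \quad (34) \]
This completes our proof of the theorem.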
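As a numerical sanity check of Theorem 3.7, the sketch below evaluates the pdf (31) and the cdf (32) for a Uniform(0, 1) population. The population, the values of $n$ and $j$, the grid size, and all function names are illustrative assumptions, not part of the lecture. It verifies that the density integrates to 1, that its mean equals $j/(n+1)$ (the known mean of the resulting Beta$(j, n-j+1)$ distribution), and that a finite-difference derivative of (32) matches (31).

```python
import math


def order_stat_pdf(x, j, n, pdf, cdf):
    """Density of the j-th order statistic, equation (31)."""
    coef = math.factorial(n) / (math.factorial(j - 1) * math.factorial(n - j))
    F = cdf(x)
    return coef * pdf(x) * F ** (j - 1) * (1 - F) ** (n - j)


def order_stat_cdf(x, j, n, cdf):
    """CDF of the j-th order statistic, the binomial tail sum in (32)."""
    F = cdf(x)
    return sum(math.comb(n, k) * F ** k * (1 - F) ** (n - k)
               for k in range(j, n + 1))


def uniform_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)


n, j = 5, 2

# Midpoint-rule integration of the density over (0, 1).
steps = 10000
grid = [(i + 0.5) / steps for i in range(steps)]
densities = [order_stat_pdf(x, j, n, uniform_pdf, uniform_cdf) for x in grid]
total = sum(densities) / steps
mean = sum(x * d for x, d in zip(grid, densities)) / steps

# Central difference of the cdf (32) compared with the pdf (31) at one point.
x0, h = 0.3, 1e-5
slope = (order_stat_cdf(x0 + h, j, n, uniform_cdf)
         - order_stat_cdf(x0 - h, j, n, uniform_cdf)) / (2 * h)
pdf_at_x0 = order_stat_pdf(x0, j, n, uniform_pdf, uniform_cdf)

print("integral of pdf:", total)
print("mean:", mean, "theory:", j / (n + 1))
print("d/dx cdf:", slope, "pdf:", pdf_at_x0)
```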