January 25, 2017

INTRODUCTION TO MATHEMATICAL STATISTICS

Abstract. A basic introduction to statistics assuming knowledge of probability theory.

1. Probability

In a typical undergraduate problem in probability, we are told or implicitly given the distribution of a random variable $X$ and asked to compute certain quantities. In more advanced courses, we are asked to prove certain assertions about $X$.

Exercise 1. Let $p \in (0, 1)$. Let $X \sim \mathrm{Ber}(p)$, so that $P(X = 1) = p$ and $P(X = 0) = 1 - p$. For which value of $p$ is the variance maximized?

Exercise 2. Suppose $T$ is the number of flips of a fair coin that are required for a head to appear. Find $ET$.

Exercise 3. Suppose that $X = (X_1, X_2)$ are two cards dealt from a well-shuffled standard 52 card deck. What is the probability that I get a pair?

Exercise 4. Let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables all with mean $\mu$ and variance $\sigma^2 < \infty$. Find
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i^2. \]

Although the task may seem simple, problems that are easy to state may not be so simple to solve. Let $X_1, X_2, \ldots$ be a sequence of i.i.d. discrete random variables that are uniformly distributed on $\{\pm e_1, \ldots, \pm e_d\}$, where the $e_i$ are the standard unit basis vectors in $\mathbb{Z}^d$, so that $e_1 = (1, 0, \ldots, 0)$ and $e_d = (0, 0, \ldots, 1)$. Let $S_n = X_1 + \cdots + X_n$. The sequence $(S_n)$ is called a simple symmetric random walk on $\mathbb{Z}^d$. Set $T = \inf\{n \geq 1 : S_n = 0 \in \mathbb{Z}^d\}$. We say that $(S_n)$ is recurrent if $P(T < \infty) = 1$.

Theorem 5 (Polya). The simple symmetric random walk on $\mathbb{Z}^d$ is recurrent for dimensions $d = 1, 2$ and not recurrent for all dimensions $d \geq 3$.
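Polya's theorem can be explored numerically. The following is a minimal Python sketch, not part of the original notes, that estimates how often the walk returns to the origin within a fixed number of steps. It assumes that each step is uniform over the $2d$ vectors $\pm e_i$; the function names `returns_to_origin` and `return_frequency` and all parameter values are illustrative choices. Because the horizon is finite, the frequencies only bound $P(T < \infty)$ from below, and in dimension $2$ they approach $1$ very slowly.

```python
import random


def returns_to_origin(d, max_steps, rng):
    """Run one simple symmetric random walk on Z^d for at most max_steps
    steps and report whether it revisits the origin."""
    position = [0] * d
    for _ in range(max_steps):
        axis = rng.randrange(d)                 # choose a coordinate uniformly
        position[axis] += rng.choice((-1, 1))   # step +e_axis or -e_axis
        if not any(position):                   # all coordinates are zero
            return True
    return False


def return_frequency(d, max_steps=1000, trials=500, seed=0):
    """Fraction of simulated walks that return to the origin within max_steps."""
    rng = random.Random(seed)
    hits = sum(returns_to_origin(d, max_steps, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, return_frequency(d))
```

A simulation of this kind cannot prove recurrence or transience; it only illustrates the qualitative difference between low and high dimensions.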
In statistics, we are interested in the dual problem of finding the distribution of $X$; we want to know whether a coin is actually fair or not. Typically this involves observing many instances of $X$; that is, flipping the coin many times. Based on our observations, we make an inference about the distribution.

2. Random samples and point estimators

Suppose that $f_\theta$ is a probability density function for each $\theta \in \Theta$, for a discrete or continuous random variable. Here, $\Theta$ is usually a subset of $\mathbb{R}$ or $\mathbb{R}^d$. Let $\theta \in \Theta$. We say that $X = (X_1, \ldots, X_n)$ is a random sample from $f_\theta$ if $X_1, \ldots, X_n$ are i.i.d. random variables all with the same pdf $f_\theta$. In a typical statistics problem we do not know the value of $\theta$, but we do know that $\theta \in \Theta$, and we observe the values of a random sample and hope that it will tell us something about $\theta$. If $T = g(X)$ for some deterministic function $g$, then we say that $T$ is a statistic. If $T$ does not depend on $\theta$, and we use it to estimate $\theta$, then we say that it is a point estimator. If we observe the value $T = t$, then we say that $t$ is a point estimate.

In an undergraduate statistics course we were already introduced to two important point estimators. Let $X_1, \ldots, X_n$ be a random sample. The sample mean is given by
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i. \]
The sample variance is given by
\[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2. \]

A point estimator $T$ of $\theta$ is unbiased if $ET = \theta$ for all $\theta \in \Theta$. A sequence of point estimators $T = T_n$ is consistent if $T_n$ converges in probability to $\theta$, as $n \to \infty$, for all $\theta \in \Theta$. When convergence in probability is replaced with the stronger almost sure convergence, sometimes the terminology strongly consistent is used.

Exercise 6. Show that the sample mean and variance are unbiased and consistent estimators for the true mean and variance.

Exercise 7. Let $X_1, \ldots, X_n$ be a random sample from the Bernoulli family; that is, $X_i \sim \mathrm{Ber}(p)$, where $p \in (0, 1)$. Let $\sigma^2$ be the variance of $X_1$. Consider the point estimator for $\sigma^2$ given by $\bar{X}(1 - \bar{X})$. Is it unbiased? Is it consistent?
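The notion of unbiasedness for the sample mean and sample variance can be illustrated by simulation. The sketch below is an illustration only and not part of the original notes: it assumes a normal population with hypothetical parameters `mu` and `sigma`, draws many independent samples of size `n`, and averages the resulting values of $\bar{X}$ and $S^2$. The helper name `average_estimates` is an assumption. Note that `statistics.variance` uses the divisor $n - 1$, matching the definition of $S^2$ above.

```python
import random
import statistics


def average_estimates(mu, sigma, n, reps, seed=0):
    """Average the sample mean and sample variance over reps independent
    samples of size n from a normal population (an assumed model)."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
        variances.append(statistics.variance(sample))  # divisor n - 1
    return statistics.mean(means), statistics.mean(variances)


if __name__ == "__main__":
    avg_mean, avg_var = average_estimates(mu=1.0, sigma=2.0, n=10, reps=4000)
    print(avg_mean, avg_var)  # should be close to mu and sigma**2
```

Averages close to $\mu$ and $\sigma^2$ are consistent with unbiasedness but do not prove it.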
Exercise 8. Suppose $X_1, \ldots, X_n$ is a random sample, where $X_i \sim U(0, \theta)$, for some $\theta \in (0, \infty)$, so that the $X_i$ are uniformly distributed in $(0, \theta)$. Consider $M := \max\{X_1, \ldots, X_n\}$ as a point estimator for $\theta$. Is it unbiased? Is it consistent?

3. Sample Variance

In this section, we will work out Exercise 6. We know that $E(\bar{X}) = \mu$, and the consistency of $\bar{X}$ follows from the law of large numbers. Some algebra will give the result for $S^2$. First, set $\sigma^2 := \mathrm{Var}(X_1)$ and $\mu := EX_1$. A short-cut formula gives that
\[ \mathrm{Var}(Y) := E(Y - EY)^2 = EY^2 - (EY)^2, \]
from which it follows that $EX_1^2 = \sigma^2 + \mu^2$. Second, note that
\[ (X_i - \bar{X})^2 = X_i^2 - 2X_i\bar{X} + \bar{X}^2 = X_i^2 - \frac{2}{n} \sum_{j=1,\, j \neq i}^{n} X_i X_j - \frac{2}{n} X_i^2 + \bar{X}^2. \]
Third, taking expectations, using the independence of the $X_i$, and summing over $i$, gives
\[ (n-1)ES^2 = n(\sigma^2 + \mu^2) - 2(n-1)\mu^2 - 2(\sigma^2 + \mu^2) + nE(\bar{X}^2). \]
Finally, recall that $\mathrm{Var}(\bar{X}) = \sigma^2/n$, so that $E(\bar{X}^2) = \sigma^2/n + \mu^2$. Some algebra gives $(n-1)ES^2 = (n-1)\sigma^2$, so we are done.

To show consistency, first, recall the following short-cut formula.

Exercise 9. Let $x_1, \ldots, x_n \in \mathbb{R}$ and $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. Prove the so-called short-cut formula
\[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \Big( \frac{1}{n} \sum_{i=1}^{n} x_i \Big)^2. \]

Then Exercise 9 gives that
\[ \Big( \frac{n-1}{n} \Big) S^2 = \frac{1}{n} \sum_{i=1}^{n} X_i^2 - \Big( \frac{1}{n} \sum_{i=1}^{n} X_i \Big)^2. \]
Finally, note that the law of large numbers gives that
\[ \frac{1}{n} \sum_{i=1}^{n} X_i^2 \to EX_1^2 = \sigma^2 + \mu^2 \]
and
\[ \Big( \frac{1}{n} \sum_{i=1}^{n} X_i \Big)^2 \to \mu^2 \]
almost surely, as $n \to \infty$. Since $(n-1)/n \to 1$, it follows that $S^2 \to \sigma^2$ almost surely, so we are done.

4. Method of moments

The moments of a random variable $X$ are given by $\mu_k := EX^k$. Under certain conditions, if all the moments of $X$ are finite, then they determine the distribution of $X$. If $X_1, \ldots, X_n$ is a random sample from $f_\theta$, the moment estimators are given by
\[ M_k := \frac{1}{n} \sum_{i=1}^{n} X_i^k. \]

Exercise 10. Show that the moment estimators are unbiased and consistent estimators for $\mu_k$.

If $\theta = (\theta_1, \ldots, \theta_9)$, then it is possible that if we compute at least 9 of the moments, we might be able to obtain equations for each of the $\theta_i$ in terms of the moments, which we can estimate using the moment estimators. We call this method of finding estimators the method of moments, and estimators obtained using this method are sometimes called the method of moment estimators.

Exercise 11. Let $X_1, \ldots, X_n$ be a random sample, where $X_i \sim \mathrm{Unif}(0, \theta)$, where $\theta$ is unknown. Use the method of moments to find an estimator for $\theta$.

Solution. Note that $EX_1 = \theta/2$, so that $\theta = 2\mu_1$. Thus the method of moment estimator for $\theta$ is given by $2\bar{X}$.

Exercise 12. Suppose that $X \sim \mathrm{Bin}(n, p)$. Solve for $n$ and $p$ in terms of $\mu_1$ and $\mu_2$.

Solution. We have that $\mu_1 = EX = np$ and
\[ \mu_2 = EX^2 = \mathrm{Var}(X) + (np)^2 = np(1-p) + (np)^2. \]
Replacing $p$ with $\mu_1/n$ in the expression for $\mu_2$ gives
\[ \mu_2 = \mu_1 - \mu_1^2/n + \mu_1^2, \]
from which it follows that
\[ n = \mu_1^2 / (\mu_1 + \mu_1^2 - \mu_2), \]
and then $p = \mu_1/n$.

Exercise 13. Let $X_1, \ldots, X_k$ be a random sample, where $X_i \sim \mathrm{Bin}(n, p)$, where both $n$ and $p$ are unknown. Use the method of moments to find estimators for $n$ and $p$.
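Solution. Notice that the short-cut formula gives that
\[ \frac{1}{k} \sum_{i=1}^{k} (X_i - \bar{X})^2 = \frac{1}{k} \sum_{i=1}^{k} X_i^2 - \Big( \frac{1}{k} \sum_{i=1}^{k} X_i \Big)^2. \]
Thus from the previous exercise and the short-cut formula, we have that the estimator for $n$ is given by
\[ N := (\bar{X})^2 \Big/ \Big( \bar{X} - \frac{k-1}{k} S^2 \Big), \]
and so the estimator of $p$ is given by $\bar{X}/N$.

Sometimes, rather than considering the second moment
\[ \mu_2 = EX_1^2, \]
it is easier (and equivalent) to consider
\[ \mu_2' = E(X_1 - \mu_1)^2 = \mathrm{Var}(X_1) = \sigma^2, \]
which has a method of moment estimator given by
\[ \frac{n-1}{n} S^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2. \]
Again, this follows from the short-cut formula.

Exercise 14. Let $X_1, \ldots, X_n$ be a random sample, where $X_1 \sim \mathrm{Gamma}(\alpha, \beta)$, where both $\alpha, \beta > 0$ are unknown. Find the method of moment estimators for $\alpha$ and $\beta$.

Solution. Recall that the pdf of $X_1$ is given by
\[ f(x_1; \alpha, \beta) = \frac{1}{\beta^{\alpha} \Gamma(\alpha)} x_1^{\alpha - 1} e^{-x_1/\beta} \mathbf{1}[x_1 > 0], \]
and $\mu = E(X_1) = \alpha\beta$ and $\sigma^2 = \mathrm{Var}(X_1) = \alpha\beta^2$. Hence $\alpha = \mu/\beta$ and $\beta^2 = \sigma^2/\alpha = \sigma^2\beta/\mu$, so that $\beta = \sigma^2/\mu$ and $\alpha = \mu^2/\sigma^2$. Thus, estimating $\mu$ by $\bar{X}$ and $\sigma^2$ by $\frac{n-1}{n} S^2$, the method of moment estimators for $\beta$ and $\alpha$ are given by
\[ B = \frac{n-1}{n} \cdot \frac{S^2}{\bar{X}} \quad \text{and} \quad A = \frac{n}{n-1} \cdot \frac{(\bar{X})^2}{S^2}. \]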
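The Gamma estimators can be checked on simulated data. The following Python sketch is an illustration under stated assumptions rather than part of the original notes: it estimates $\sigma^2$ by the central-moment form $\frac{1}{n}\sum_{i}(X_i - \bar{X})^2$, which `statistics.pvariance` computes, and it relies on `random.gammavariate(alpha, beta)` using the same shape and scale parametrization as the pdf above, so that the mean is $\alpha\beta$. The function name `gamma_method_of_moments` and the true parameter values are hypothetical.

```python
import random
import statistics


def gamma_method_of_moments(sample):
    """Method of moment estimates (alpha_hat, beta_hat) for a Gamma sample,
    using the shape/scale parametrization with mean alpha * beta."""
    xbar = statistics.mean(sample)
    # Central second moment: (1/n) * sum((x - xbar)**2), i.e. divisor n.
    var_hat = statistics.pvariance(sample, mu=xbar)
    beta_hat = var_hat / xbar            # beta = sigma^2 / mu
    alpha_hat = xbar ** 2 / var_hat      # alpha = mu^2 / sigma^2
    return alpha_hat, beta_hat


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical true values alpha = 2, beta = 3, chosen for illustration.
    data = [rng.gammavariate(2.0, 3.0) for _ in range(20000)]
    print(gamma_method_of_moments(data))
```

For a large sample the printed estimates should lie near the true parameters; for small samples the estimates can be quite variable.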
Exercise 15. Let $X_1, \ldots, X_n$ be a random sample, where $X_i \sim \mathrm{Unif}(a, b)$, where both $a$ and $b$ are unknown. Find the method of moment estimators for $a$ and $b$.