Chapter 3. Point Estimation

Let $(\Omega, \mathcal{A}, P_\theta)$, $P_\theta \in \mathcal{P} = \{P_\theta \mid \theta \in \Theta\}$, be a probability space, let $X_1, X_2, \dots, X_n : (\Omega, \mathcal{A}) \to (\mathbb{R}^k, \mathbb{B}^k)$ be random variables with sample space $(\mathcal{X}, \mathbb{B}_X)$, and let $\gamma : \Theta \to \mathbb{R}^k$ be a measurable function, i.e. $\gamma : (\Theta, \mathbb{B}_\Theta) \to (\gamma(\Theta), \mathbb{B}_\gamma)$.

3.1 Introduction

Def. An estimator $T$ is a measurable function $T : (\mathcal{X}, \mathbb{B}_X) \to (\gamma(\Theta), \mathbb{B}_\gamma)$.

Of course, it is hoped that $T(X)$ will tend to be close to the unknown estimand $\gamma(\theta)$, but this requirement is not part of the formal definition of an estimator. Desirable properties of an estimator are:

- Unbiasedness
- Consistency (strong, weak, in $r$-th mean)
- Sufficiency
- Asymptotic normality
- Minimal sufficiency, completeness, invariance, ...
In the sequel we are interested in unbiased estimators, and we shall learn about a further statistical criterion: efficiency.

Def. Let $\gamma : \Theta \to \mathbb{R}^m$ be measurable.
(a) A statistic $T : (\mathcal{X}, \mathbb{B}_X) \to (\mathbb{R}^m, \mathbb{B}^m)$ is called unbiased if $E_\theta(T) = \gamma(\theta)$ for all $\theta \in \Theta$.
(b) Each function $\gamma$ on $\Theta$ for which there exists an unbiased estimator is called an estimable function.
(c) For a biased estimator, $b(\gamma(\theta), T) := E_\theta(T) - \gamma(\theta)$ is called the bias.
(d) A sequence of estimators $T_n$ is called asymptotically unbiased for $\gamma(\theta)$ if $\lim_{n \to \infty} b(\gamma(\theta), T_n) = 0$.

Def. An estimator $T$ is called median unbiased for $\gamma(\theta)$ if $\mathrm{med}_\theta(T) = \gamma(\theta)$ for all $\theta \in \Theta$.

Remarks:
- If $T$ is unbiased for $\gamma(\theta)$, then in general $g(T)$ is biased for $g(\gamma(\theta))$, unless $g$ is linear.
- Unbiased estimators do not always exist.
- Unbiased estimators are not always reasonable.
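To make the bias notion concrete, the expectation defining unbiasedness can be computed exactly for a small Bernoulli sample by enumerating all outcomes. The following is a minimal sketch (not from the text; the sample size, parameter value, and helper names are illustrative), contrasting the two usual variance estimators:

```python
from itertools import product

def expectation(stat, n, p):
    """Exact E_p[stat(X_1,...,X_n)] for an iid Bernoulli(p) sample,
    obtained by enumerating all 2^n outcomes."""
    total = 0.0
    for xs in product([0, 1], repeat=n):
        prob = p ** sum(xs) * (1 - p) ** (n - sum(xs))
        total += prob * stat(xs)
    return total

def var_biased(xs):      # divides by n: the "natural" plug-in estimator
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def var_unbiased(xs):    # divides by n - 1
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n, p = 4, 0.3
print(expectation(lambda xs: sum(xs) / n, n, p))  # equals p: the sample mean is unbiased
print(expectation(var_unbiased, n, p))            # equals p(1-p) = 0.21: unbiased
print(expectation(var_biased, n, p))              # equals (n-1)/n * p(1-p): biased downward
```

The factor $(n-1)/n$ in the last line is exactly the bias $b(\gamma(\theta), T)$ of the plug-in variance estimator; it vanishes as $n \to \infty$, so that estimator is still asymptotically unbiased.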
3.2 Minimum Variance Unbiased Estimators

In the sequel the case $\Theta \subset \mathbb{R}$ is considered.

Def. Let $\mathcal{T}$ be the set of all unbiased estimators $T$ of $\theta$ with $E_\theta(T^2) < \infty$ for all $\theta \in \Theta$, and let $\mathcal{T}_{\theta_0}$ be the set of all unbiased estimators $T$ of $\theta_0$ with $E_{\theta_0}(T^2) < \infty$.
(a) $T_0 \in \mathcal{T}_{\theta_0}$ is called a locally minimum variance unbiased estimator (LMVUE) at $\theta_0$ if $E_{\theta_0}[(T_0 - \theta_0)^2] \le E_{\theta_0}[(T - \theta_0)^2]$ for all $T \in \mathcal{T}_{\theta_0}$.
(b) $T^* \in \mathcal{T}$ is called a uniformly minimum variance unbiased estimator (UMVUE) if $E_\theta[(T^* - \theta)^2] \le E_\theta[(T - \theta)^2]$ for all $T \in \mathcal{T}$ and all $\theta \in \Theta$.

Other names are: (locally) best unbiased estimator, and in the case of a linear estimator, (locally) best linear unbiased estimator (BLUE).

Theorem Let $\mathcal{T}$ be as in the definition above, $\mathcal{T} \ne \emptyset$, and let $\mathcal{T}^{(0)}$ be the set of all unbiased estimators of zero, i.e.
$$\mathcal{T}^{(0)} = \{T_0 \mid E_\theta(T_0) = 0,\ E_\theta(T_0^2) < \infty \ \forall \theta \in \Theta\}.$$
Then it holds that $T^* \in \mathcal{T}$ is UMVUE if and only if $E_\theta(T_0 T^*) = 0$ for all $\theta \in \Theta$ and all $T_0 \in \mathcal{T}^{(0)}$.

Proof: By the above assumptions $E_\theta[T_0 T^*]$ exists for all $\theta \in \Theta$ and $T_0 \in \mathcal{T}^{(0)}$.
Necessity: Suppose $T^* \in \mathcal{T}$ is UMVUE and there exist a $\theta_0 \in \Theta$ and a $T_0 \in \mathcal{T}^{(0)}$ such that $E_{\theta_0}[T_0 T^*] \ne 0$. Then $T^* + \lambda T_0 \in \mathcal{T}$ for all $\lambda \in \mathbb{R}$. In case $E_{\theta_0}[T_0^2] = 0$,
$E_{\theta_0}[T_0 T^*] = 0$ would follow from the Schwarz inequality, a contradiction; hence $E_{\theta_0}[T_0^2] > 0$, and we may choose $\lambda_0 = -E_{\theta_0}[T_0 T^*]/E_{\theta_0}[T_0^2]$. Then for $T^* + \lambda_0 T_0 = T^* - T_0\, E_{\theta_0}[T_0 T^*]/E_{\theta_0}[T_0^2]$ it holds that
$$E_{\theta_0}[(T^* + \lambda_0 T_0)^2] = E_{\theta_0}[T^{*2}] - E_{\theta_0}^2[T_0 T^*]/E_{\theta_0}[T_0^2] < E_{\theta_0}[T^{*2}],$$
i.e. $\mathrm{Var}_{\theta_0}[T^* + \lambda_0 T_0] < \mathrm{Var}_{\theta_0}[T^*]$, a contradiction.
Sufficiency: Suppose $E_\theta[T_0 T^*] = 0$ holds for a $T^* \in \mathcal{T}$ and all $T_0 \in \mathcal{T}^{(0)}$, and let $T \in \mathcal{T}$. Then $T - T^* \in \mathcal{T}^{(0)}$, and from the above condition it follows that $E_\theta[T^*(T - T^*)] = 0$ for all $\theta \in \Theta$, which entails
$$E_\theta[T^{*2}] = E_\theta[T^* T] \le E_\theta[T^2]^{1/2}\, E_\theta[T^{*2}]^{1/2}.$$
For $E_\theta[T^{*2}] = 0$ there is nothing to prove. For $E_\theta[T^{*2}] > 0$ it follows that $E_\theta[T^{*2}] \le E_\theta[T^2]$ for all $\theta \in \Theta$, hence $\mathrm{Var}_\theta[T^*] \le \mathrm{Var}_\theta[T]$ for all $\theta \in \Theta$ and $T \in \mathcal{T}$. ∎

Theorem Let $\mathcal{T} \ne \emptyset$. Then there exists at most one UMVUE.

Proof: Let $T^*$ and $\tilde{T}$ both be UMVUEs. Then $T^* - \tilde{T} \in \mathcal{T}^{(0)}$, hence $E_\theta[T^*(T^* - \tilde{T})] = 0$, i.e. $E_\theta[T^* \tilde{T}] = E_\theta[(T^*)^2]$, i.e. $\mathrm{Cov}_\theta(T^*, \tilde{T}) = \mathrm{Var}_\theta(T^*) = \mathrm{Var}_\theta(\tilde{T})$, from which $\mathrm{Corr}_\theta(T^*, \tilde{T}) = 1$ follows for all $\theta \in \Theta$. Therefore there exist $a, b \in \mathbb{R}$ with $P_\theta(a T^* + b \tilde{T} = 0) = 1$ for all $\theta \in \Theta$. Since $E_\theta(a T^* + b \tilde{T}) = (a + b)\theta$ for all $\theta$, it follows that $P_\theta(T^* = \tilde{T}) = 1$ for all $\theta \in \Theta$. ∎

Theorem (Rao–Blackwell) Let $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\}$, $T \in \mathcal{T}$, and let $S$ be sufficient for $\mathcal{P}$. Then
(a) $E[T \mid S]$ is independent of $\theta$ and an unbiased estimator for $\theta$, and
(b) $E_\theta[(E(T \mid S) - \theta)^2] \le E_\theta[(T - \theta)^2]$ for all $\theta \in \Theta$. Equality holds if and only if $P_\theta(T = E(T \mid S)) = 1$ for all $\theta \in \Theta$.

Proof: The independence from $\theta$ follows from the independence from $\theta$ of the conditional distributions $P^{X \mid S = s}$, and the unbiasedness from $E_\theta[E(T \mid S)] = E_\theta[T] = \theta$. Therefore it is sufficient to show that $E_\theta[E(T \mid S)^2] \le E_\theta[T^2]$ for all $\theta \in \Theta$. Now $E_\theta[T^2] = E_\theta[E(T^2 \mid S)]$. Hence we have to show that $E(T \mid S)^2 \le E(T^2 \mid S)$ holds $P_\theta$-a.e. for all $\theta \in \Theta$. But this follows from the Schwarz inequality applied conditionally on $S$.
Equality holds in (b) if and only if
$$E_\theta[E(T \mid S)^2] = E_\theta(T^2), \quad \text{i.e.} \quad E_\theta\big[E(T^2 \mid S) - E^2(T \mid S)\big] = 0,$$
which is equivalent to
$$E_\theta[\mathrm{Var}(T \mid S)] = 0 \iff E(T^2 \mid S) = E^2(T \mid S)\ P_\theta\text{-a.e.} \iff T = E(T \mid S)\ P_\theta\text{-a.e. for all } \theta \in \Theta. \ ∎$$

Theorem (Lehmann–Scheffé) If $S$ is a complete sufficient statistic and if $T \in \mathcal{T}$, then there exists a UMVUE, and it is given by $E(T \mid S)$.

Proof: For $T_1, T_2 \in \mathcal{T}$, $E_\theta[E(T_1 \mid S) - E(T_2 \mid S)] = 0$ holds for all $\theta \in \Theta$. Since $S$ is complete, $E[T_1 \mid S] = E[T_2 \mid S]$ holds $P_\theta$-a.e., so all Rao–Blackwellized unbiased estimators coincide, and by the Rao–Blackwell theorem this common estimator has variance no larger than that of any unbiased estimator; hence it is the UMVUE. ∎

Remarks:
(a) According to the Rao–Blackwell theorem one should look for unbiased functions of a sufficient statistic. If this sufficient statistic is complete, then this function is the UMVUE.
(b) UMVUEs may exist even if there does not exist a complete sufficient statistic.
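As a worked sketch of Rao–Blackwellization (an illustrative Poisson example, not from the text): for iid $X_i \sim \mathrm{Poisson}(\theta)$, the crude unbiased estimator $T = 1\{X_1 = 0\}$ of $\gamma(\theta) = e^{-\theta}$ can be conditioned on the sufficient statistic $S = \sum_i X_i \sim \mathrm{Poisson}(n\theta)$, which gives $E[T \mid S] = ((n-1)/n)^S$. The code below checks numerically that the conditioned estimator is still unbiased and has smaller variance:

```python
import math

def E(stat, lam, kmax=150):
    """Numerical expectation of stat(S) for S ~ Poisson(lam),
    truncating the series at kmax (negligible tail for moderate lam)."""
    pmf = math.exp(-lam)              # P(S = 0)
    total = pmf * stat(0)
    for k in range(1, kmax):
        pmf *= lam / k                # P(S = k) from P(S = k - 1)
        total += pmf * stat(k)
    return total

n, theta = 5, 1.3
target = math.exp(-theta)             # gamma(theta) = P_theta(X_1 = 0)

# Rao-Blackwellization of the crude unbiased estimator T = 1{X_1 = 0}:
# conditioning on S = X_1 + ... + X_n gives E[T | S] = ((n - 1) / n) ** S.
rb = lambda s: ((n - 1) / n) ** s

mean_rb = E(rb, n * theta)
var_rb = E(lambda s: rb(s) ** 2, n * theta) - mean_rb ** 2
var_crude = target * (1 - target)     # variance of the indicator 1{X_1 = 0}

print(abs(mean_rb - target) < 1e-9)   # True: E[T | S] is still unbiased
print(var_rb < var_crude)             # True: conditioning reduced the variance
```

Since $S$ is also complete for the Poisson family, the Lehmann–Scheffé theorem implies that $((n-1)/n)^S$ is in fact the UMVUE of $e^{-\theta}$.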
Theorem (Cramér–Rao–Fréchet) Let $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\}$ have $\mu$-densities $f(x;\theta)$ ($\mu$ the counting measure or $\mu = \lambda$, the Lebesgue measure), let $\Theta$ be an open interval in $\mathbb{R}^1$, and let $\{x \mid f_\theta(x) = 0\}$ be independent of $\theta \in \Theta$. For every $\theta$ let $\partial f_\theta(x)/\partial\theta$ be defined. Suppose that
(i) $\int \frac{\partial f_\theta}{\partial\theta}\,d\mu = \frac{\partial}{\partial\theta}\int f_\theta\,d\mu = 0$ for all $\theta \in \Theta$;
(ii) $\gamma : \Theta \to \mathbb{R}$ is differentiable on $\Theta$, $T$ is an unbiased estimator for $\gamma(\theta)$ with $E_\theta(T^2) < \infty$ for all $\theta \in \Theta$, and
$$\frac{\partial}{\partial\theta}\int T(x)\, f(x;\theta)\,\mu(dx) = \int T(x)\, \frac{\partial f(x;\theta)}{\partial\theta}\,\mu(dx) \quad \forall \theta \in \Theta.$$
Then:
(a) $[\gamma'(\theta)]^2 \le E_\theta[(T - \gamma(\theta))^2]\; E_\theta\!\left[\left(\frac{\partial}{\partial\theta}\log f(x;\theta)\right)^2\right]$ for all $\theta \in \Theta$.
For any $\theta_0 \in \Theta$, either $\gamma'(\theta_0) = 0$ and equality holds in (a) for $\theta = \theta_0$, or
(b) $\mathrm{Var}_{\theta_0}(T) = E_{\theta_0}[(T - \gamma(\theta_0))^2] \ge \dfrac{[\gamma'(\theta_0)]^2}{E_{\theta_0}\!\left[\left(\frac{\partial \log f(x;\theta_0)}{\partial\theta}\right)^2\right]}$.
If, in the latter case, equality holds in (b) and $T$ is not a constant, then there exists a real number $K_{\theta_0} \ne 0$ such that
(c) $T(x) - \gamma(\theta_0) = K_{\theta_0}\, \dfrac{\partial \log f(x;\theta_0)}{\partial\theta}$ $\mu$-a.e.
Remarks: The function $\partial \log f(x;\theta)/\partial\theta$ is also called the score function, and
$$I(\theta) := E_\theta\!\left[\left(\frac{\partial}{\partial\theta}\log f(x;\theta)\right)^2\right] = \mathrm{Var}_\theta\!\left(\frac{\partial}{\partial\theta}\log f(x;\theta)\right)$$
is called the Fisher information. For $\gamma(\theta) = \theta$, of course $\gamma'(\theta) = 1$.

Proof: Differentiating both sides of $\int f(x;\theta)\,\mu(dx) = 1$ leads (with (i)) to $\int \frac{\partial}{\partial\theta} f(x;\theta)\,\mu(dx) = 0$, or on $\{f > 0\}$ to
$$\int_{\{f>0\}} \frac{\partial f(x;\theta)/\partial\theta}{f(x;\theta)}\, f(x;\theta)\,\mu(dx) = \int_{\{f>0\}} \frac{\partial \log f(x;\theta)}{\partial\theta}\, f(x;\theta)\,\mu(dx) = 0,$$
leading to $E_\theta\!\left[\frac{\partial \log f(x;\theta)}{\partial\theta}\right] = 0$.
According to assumption (ii) we have
$$\gamma(\theta) = \int T(x)\, f(x;\theta)\,\mu(dx), \qquad \gamma'(\theta) = E_\theta\!\left[T(X)\, \frac{\partial \log f(x;\theta)}{\partial\theta}\right],$$
which entails
$$E_\theta\!\left[(T(X) - \gamma(\theta))\, \frac{\partial \log f(x;\theta)}{\partial\theta}\right] = \gamma'(\theta),$$
and (a) follows from the Schwarz inequality.
For (b) it is sufficient to consider either the case $\gamma'(\theta_0) \ne 0$ or the case where in (a) the strict inequality holds at $\theta_0$. In both cases the Fisher information satisfies $I(\theta_0) > 0$, which entails (b).
If in (b) the equality sign holds, then $\gamma'(\theta_0) \ne 0$ must hold. Then, according to the condition for equality in the Schwarz inequality, there exists a real number $K_{\theta_0}$ such that
$$T(X) - \gamma(\theta_0) = K_{\theta_0}\, \frac{\partial}{\partial\theta}\log f(x;\theta_0)$$
holds $\mu$-a.e. ∎

For the vector case let $\Theta \subset \mathbb{R}^p$ and let $\gamma(\Theta)$ be a convex subset of $\mathbb{R}^k$. Then $\partial f(x;\theta)/\partial\theta$ is a $p$-vector,
$$I(\theta) = E\!\left[\left(\frac{\partial}{\partial\theta}\log f(x;\theta)\right)\left(\frac{\partial}{\partial\theta}\log f(x;\theta)\right)^T\right]$$
is a $p \times p$ matrix, $\partial\gamma(\theta)/\partial\theta$ is a $k \times p$ matrix, and
$$\mathrm{Var}_\theta(T) = E_\theta\!\left[(T(X) - \gamma(\theta))(T(X) - \gamma(\theta))^T\right]$$
is a $k \times k$ matrix. With the corresponding regularity conditions of the scalar theorem one can easily show the corresponding inequality
(d) $\mathrm{Var}_\theta(T) \ge \left(\dfrac{\partial\gamma(\theta)}{\partial\theta}\right) I(\theta)^{-1} \left(\dfrac{\partial\gamma(\theta)}{\partial\theta}\right)^T$,
where the $\ge$ sign is to be understood as the difference between the left- and the right-hand side being a positive semidefinite matrix. For a proof in the multiparameter case we refer to Lehmann/Casella (2001).

Theorem 3.2.6: In the above case let $p = k$, assume that the $k \times k$ matrix $\partial\gamma/\partial\theta$ is regular for all $\theta \in \Theta$, and let $\partial f/\partial\theta$ be continuous for all $\theta$ and $x$. Then in (d) the equality sign holds if and only if there are functions $C(\theta), Q_1(\theta), \dots, Q_k(\theta)$ and $H(x)$ such that
$$\frac{dP_\theta}{d\mu} = f(x;\theta) = C(\theta)\, \exp\Big\{\sum_{j=1}^k Q_j(\theta)\, T_j(x)\Big\}\, H(x),$$
and with $Q(\theta) = (Q_1(\theta), \dots, Q_k(\theta))^T$ it holds that
$$\gamma(\theta) = -\left[\frac{\partial Q(\theta)}{\partial\theta}\right]^{-1} \frac{\partial \ln C(\theta)}{\partial\theta}.$$

Proof:
1. Let $f$ and $\gamma$ have the above form. We show that in the CR inequality the equality sign holds. Writing $C(\theta) = \exp\{D(\theta)\}$, we have
$$\frac{\partial}{\partial\theta}\log f(x;\theta) = Q'(\theta)\, T(x) + D'(\theta),$$
where $Q'(\theta) = (\partial Q_i(\theta)/\partial\theta_j)_{i,j=1,\dots,k}$, and since $E_\theta\!\left[\frac{\partial}{\partial\theta}\log f(x;\theta)\right] = 0$,
$$0 = E_\theta[Q'(\theta)\, T(X) + D'(\theta)] = D'(\theta) + Q'(\theta)\, E_\theta[T(X)],$$
we obtain, for $\det Q'(\theta) \ne 0$,
$$E_\theta[T(X)] = -Q'(\theta)^{-1} D'(\theta) = \gamma(\theta).$$
Hence the estimator $T(X)$ is unbiased for $\gamma(\theta)$. Since $T(X) - \gamma(\theta) = T(X) + Q'(\theta)^{-1} D'(\theta)$, we get, putting $K_\theta = Q'(\theta)^{-1}$, that
$$K_\theta\, \frac{\partial}{\partial\theta}\log f(x;\theta) = T(X) + Q'(\theta)^{-1} D'(\theta) = T(X) - \gamma(\theta),$$
i.e. the equality sign holds in the CR inequality.
2. Conversely, from the CR equality the above representation of $f$ and $\gamma$ follows. If equality holds, then there exists a regular $(k \times k)$-matrix $K_\theta$ such that
$$T(X) - \gamma(\theta) = K_\theta\, \frac{\partial}{\partial\theta}\log f(x;\theta) \quad \mu\text{-a.e.}$$
or
$$K_\theta^{-1}\,[T(X) - \gamma(\theta)] = \frac{\partial}{\partial\theta}\log f(x;\theta).$$
We integrate both sides with respect to $\theta$, putting $D(\theta) := -\int K_\theta^{-1}\,\gamma(\theta)\,d\theta$ and $Q(\theta) := \int K_\theta^{-1}\,d\theta$. Introducing an integration constant $S(x)$, which in general depends on $x$, leads to
$$\ln f(x;\theta) = Q(\theta)\, T(x) + D(\theta) + S(x),$$
and with $C(\theta) := \exp\{D(\theta)\}$ and $H(x) := \exp\{S(x)\}$, $f$ and $\gamma$ have the claimed form with $\hat\gamma(\theta) = T(X)$. ∎

Corollary 3.2.7: If, under the regularity conditions of the preceding theorem, $T$ is an unbiased estimator for $\gamma(\theta)$ which attains the Cramér–Rao lower bound, then $T$ is minimal sufficient and complete.

An unbiased estimator which attains the CR bound is called an efficient estimator. In the scalar case the ratio $e(T, \theta)$ between the CR bound and $\mathrm{Var}_\theta(T)$ is called the efficiency of the estimator $T$. Obviously, $0 \le e(T, \theta) \le 1$. When comparing two unbiased estimators $T_1$ and $T_2$,
$$e_\theta(T_1 \mid T_2) := \frac{\mathrm{Var}_\theta(T_2)}{\mathrm{Var}_\theta(T_1)}$$
is called the relative efficiency of $T_1$ with respect to $T_2$. $\lim_{n\to\infty} e_\theta(T)$ is called the asymptotic efficiency, and $\lim_{n\to\infty} e_\theta(T_1 \mid T_2)$ is called the asymptotic relative efficiency.
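For the Bernoulli family the score function, the Fisher information, and the efficiency of the sample mean can be computed exactly by summing over the two support points. A small sketch (the parameter values are illustrative, not from the text):

```python
p, n = 0.3, 20

# score function d/dp log f(x; p) for the Bernoulli density f(x; p) = p^x (1-p)^(1-x)
def score(x, p):
    return x / p - (1 - x) / (1 - p)

# E_p[score] = 0 and I(p) = E_p[score^2], computed by summing over the support {0, 1}
mean_score = (1 - p) * score(0, p) + p * score(1, p)
fisher = (1 - p) * score(0, p) ** 2 + p * score(1, p) ** 2

print(abs(mean_score) < 1e-12)       # True: the score has expectation zero
print(fisher, 1 / (p * (1 - p)))     # both equal 1/(p(1-p))

# CR bound for unbiased estimation of p from n iid observations vs. Var(sample mean)
cr_bound = 1 / (n * fisher)
var_mean = p * (1 - p) / n
print(var_mean / cr_bound)           # efficiency e = 1: the sample mean is efficient
```

Here $e(\bar X, p) = 1$: the sample mean attains the Cramér–Rao bound $p(1-p)/n$, in line with the exponential-family characterization above.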
3.3 Method of Moments

Let $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\}$ and $\gamma : \Theta \to \mathbb{R}^k$. In many cases the estimand $\gamma(\theta)$ can be written as a function of the moments of $P_\theta$,
$$\gamma(\theta) = g(\mu_1', \dots, \mu_k').$$
In order to estimate $\gamma(\theta)$, one may then try to replace the unknown moments $\mu_j'$, $j = 1, \dots, k$, by the corresponding sample moments.
Let $T$ be any statistic with existing expectation $\mu_t(\theta) := E_\theta(T(X))$ for all $\theta \in \Theta$. Then the SLLN (Khinchine) entails
$$\bar{T}_n := (T(X_1) + T(X_2) + \dots + T(X_n))/n \to \mu_t(\theta) \quad \text{a.s.}$$
If $\Theta \subset \mathbb{R}^k$ and $T = (T_1, \dots, T_k)$ is a statistic with existing expectation $\mu_T(\theta) = (\mu_{t_1}(\theta), \dots, \mu_{t_k}(\theta))$, then one can try to find an estimator $\hat\theta_n = (\hat\theta_{1,n}, \dots, \hat\theta_{k,n})$ as a solution of the system of equations
$$\mu_{t_1}(\hat\theta_{1,n}, \dots, \hat\theta_{k,n}) = (T_1(X_1) + \dots + T_1(X_n))/n =: \bar{T}_{1,n}$$
$$\vdots$$
$$\mu_{t_k}(\hat\theta_{1,n}, \dots, \hat\theta_{k,n}) = (T_k(X_1) + \dots + T_k(X_n))/n =: \bar{T}_{k,n}.$$
Under regularity conditions we then have (SLLN)
$$\hat\gamma(\hat\theta_n) = g(\bar{T}_{1,n}, \dots, \bar{T}_{k,n}) \to g(\mu_{t_1}, \dots, \mu_{t_k}) = \gamma(\theta) \quad \text{a.s.}$$
If the moments up to order $2k$ exist, then the asymptotic normality of $\hat\gamma(\hat\theta_n)$ can be proved by the Lindeberg–Lévy Central Limit Theorem.

Remark: In general, method of moments estimators are not unique, and in general they are not functions of sufficient statistics, so they cannot be efficient either.

3.4 Maximum Likelihood Estimation

Def.: A solution $\hat\theta$ of
$$\sup_{\theta \in \Theta} L(\theta; x) \qquad (3.2)$$
is called a Maximum Likelihood Estimator (MLE) for $\theta$.

With the ML principle one tries to find the mode of the underlying distribution. Since the mode as an estimator of location is often worse than the mean or the median, ML estimators often have poor small-sample properties.
In practice it is often simpler to work with the log-likelihood function $l$ than with $L$. If the $\mu$-density $f(x;\theta)$ is positive $\mu$-a.e., if $\Theta \subset \mathbb{R}^k$ is an open set, and if $(\partial/\partial\theta)f(x;\theta)$ exists on $\Theta$, then a solution of (3.2) fulfills the likelihood equations
$$\frac{\partial}{\partial\theta}\, l(\theta; x) := \frac{\partial}{\partial\theta}\log f(x;\theta) = 0. \qquad (3.3)$$
A solution of (3.3) is called an MLE in the weak sense; a solution of (3.2) is called a strict MLE.

Theorem 3.4.1: Let $\Theta \subset \mathbb{R}^k$ and $\Lambda \subset \mathbb{R}^p$ be intervals, $p \le k$, and let $\gamma : \Theta \to \Lambda$ be surjective. If $\hat\theta$ is an MLE for $\theta$, then $\gamma(\hat\theta)$ is an MLE for $\gamma(\theta)$.

Proof: For each $\lambda \in \Lambda$ let $\Theta_\lambda := \{\theta \in \Theta \mid \gamma(\theta) = \lambda\}$ and let $M(\lambda; x) := \sup_{\theta \in \Theta_\lambda} L(\theta; x)$. Let $\hat\theta$ be an MLE for $\theta$. Then $\hat\theta$ belongs to one of the sets $\Theta_\lambda$, say to $\Theta_{\hat\lambda}$, and it holds that
$$M(\hat\lambda; x) = \sup_{\theta \in \Theta_{\hat\lambda}} L(\theta; x) \ge L(\hat\theta; x),$$
and $\hat\lambda$ maximizes $M$, since
$$M(\hat\lambda; x) \le \sup_{\lambda \in \Lambda} M(\lambda; x) = \sup_{\theta \in \Theta} L(\theta; x) = L(\hat\theta; x). \ ∎$$

Theorem 3.4.2: Let $S$ be a sufficient statistic for $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\} \ll \mu$ ($\sigma$-finite). If a unique MLE $\hat\theta$ exists, then it is a (measurable) function of $S$.
Proof: Since $S$ is sufficient, there exists a factorization $f(x;\theta) = g(S(x); \theta)\, h(x)$. Maximizing $f$ with respect to $\theta$ is hence equivalent to maximizing $g$ with respect to $\theta$; $g$ is a function of $S$, and so $\hat\theta$ depends on $x$ only through $S$. ∎

Remark: If the likelihood equations (3.3) exist and if there exists a sufficient statistic $S$, then the MLEs are given as solutions of
$$\frac{\partial}{\partial\theta}\log g(S(x); \theta) = 0.$$

Theorem 3.4.3: Suppose that the regularity conditions of the CR inequality are satisfied and that $\theta$ belongs to an open interval in $\mathbb{R}^k$. If $T$ is an unbiased estimator whose covariance matrix attains the CR lower bound, then the likelihood equations have the unique solution $\hat\gamma(\theta) = T(X)$.

Proof: According to the Cramér–Rao theorem, part (c) (resp. its multivariate version), there exists a regular matrix $K_\theta$ such that
$$K_\theta\, \frac{\partial}{\partial\theta}\log f(x;\theta) = T(X) - \gamma(\theta) \quad \mu\text{-a.e.},$$
and the likelihood equations have the unique solution $\hat\gamma(\theta) = T(X)$. ∎

For large-sample considerations we introduce the following regularity conditions:

(A0) For $\theta \ne \theta'$, $f(x;\theta) \ne f(x;\theta')$ (identifiability).
(A1) The support of $f(x;\theta)$, i.e. the set $A := \{x \mid f(x;\theta) > 0\}$, does not depend on $\theta \in \Theta$.
(A2) The sample observations $X_1, \dots, X_n$ are iid with density $f(x;\theta)$ with respect to some $\sigma$-finite measure $\mu$.
(A3) The parameter space $\Theta$ contains an open set $\Theta_0$, and the true $\theta_0$ is an interior point of $\Theta_0$.
(A4) The density $f(x;\theta)$ is differentiable for $\mu$-almost all $x$ with respect to $\theta \in \Theta_0$, with derivative $\dot f(x;\theta) := \frac{\partial}{\partial\theta} f(x;\theta)$.

Theorem 3.4.4: Let (A0)–(A2) hold. Then
$$P_{\theta_0}[L(\theta_0; x) > L(\theta; x)] \to 1 \quad \text{for } n \to \infty \text{ and for all } \theta \ne \theta_0.$$

Proof: We use Jensen's inequality, according to which, for $\varphi$ convex on an open interval $I$ with $P(X \in I) = 1$ and $E|X| < \infty$,
$$\varphi[E(X)] \le E[\varphi(X)].$$
The assertion is equivalent to
$$\frac{1}{n}\sum_{i=1}^n \log[f(X_i;\theta)/f(X_i;\theta_0)] < 0$$
holding with probability tending to one, for all $\theta \ne \theta_0$. According to the SLLN the left-hand side converges a.s. to $E_{\theta_0}[\log\{f(X;\theta)/f(X;\theta_0)\}]$. Since $\log(\cdot)$ is strictly concave and, by (A0), the ratio $f(X;\theta)/f(X;\theta_0)$ is not $P_{\theta_0}$-a.s. constant, Jensen's inequality yields the strict inequality
$$E_{\theta_0}[\log\{f(X;\theta)/f(X;\theta_0)\}] < \log\{E_{\theta_0}[f(X;\theta)/f(X;\theta_0)]\},$$
where by (A1) the right-hand side equals $\log \int_A f(x;\theta)\,\mu(dx) = \log 1 = 0$. This entails the assertion. ∎

If therefore the density $f$ is a smooth function of $\theta$, then one may expect that the MLE for $\theta$ will lie close to $\theta_0$.
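The key step in the proof above is that $E_{\theta_0}[\log\{f(X;\theta)/f(X;\theta_0)\}]$ is strictly negative for every $\theta \ne \theta_0$ (it equals minus the Kullback–Leibler divergence). For a Bernoulli family this expectation can be evaluated in closed form; a minimal sketch with illustrative parameter values (not from the text):

```python
import math

def expected_log_ratio(p0, p):
    """E_{p0}[ log( f(X; p) / f(X; p0) ) ] for Bernoulli densities,
    i.e. minus the Kullback-Leibler divergence KL(p0 || p)."""
    return p0 * math.log(p / p0) + (1 - p0) * math.log((1 - p) / (1 - p0))

p0 = 0.4                                   # "true" parameter
for p in (0.1, 0.3, 0.5, 0.7, 0.9):        # wrong candidates
    print(p, expected_log_ratio(p0, p))    # strictly negative in every case

print(expected_log_ratio(p0, p0))          # 0.0 at the true parameter
```

By the SLLN, the averaged log-likelihood ratio converges to these negative values, so the true parameter eventually dominates every fixed wrong candidate, which is exactly the content of the theorem.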
Theorem 3.4.5: Let (A0)–(A4) hold. Then, with probability tending to 1, the likelihood equations
$$\frac{\partial}{\partial\theta}\, l(\theta; x) = 0, \quad \text{i.e.} \quad \sum_{j=1}^n \frac{\dot f(X_j;\theta)}{f(X_j;\theta)} = 0,$$
have a solution $\hat\theta_n$ with $\hat\theta_n \to \theta_0$ in probability for $n \to \infty$.

Proof: Let $\delta$ be small enough that (according to (A3)) $(\theta_0 - \delta,\, \theta_0 + \delta) \subset \Theta_0$, and let
$$S_n := \{x \mid l(\theta_0; x) > l(\theta_0 - \delta; x) \text{ and } l(\theta_0; x) > l(\theta_0 + \delta; x)\}.$$
According to Theorem 3.4.4, $P_{\theta_0}(S_n) \to 1$ for $n \to \infty$. For each $x \in S_n$ there is hence a $\hat\theta_n$ with $\theta_0 - \delta < \hat\theta_n < \theta_0 + \delta$ at which $l(\theta; x)$ takes a local maximum, and therefore $\dot l(\hat\theta_n) = 0$. This entails that for each small enough $\delta$ there exists a sequence $\hat\theta_n = \hat\theta_n(\delta)$ of solutions such that $P_{\theta_0}(|\hat\theta_n - \theta_0| < \delta) \to 1$ for $n \to \infty$.
It remains to show that such a sequence exists which does not depend on $\delta$. Let $\theta_n^*$ be the solution closest to $\theta_0$. (It exists since, because of the continuity of $\dot l(\theta)$, the limit of a sequence of solutions is itself a solution.) Then it naturally holds that $P_{\theta_0}(|\theta_n^* - \theta_0| < \delta) \to 1$ for all $\delta > 0$. ∎

Remark: If the solutions are not unique, then the above theorem does not yield a consistent sequence of estimators: $\theta_0$ is unknown, and the data do not tell you which root to choose.

In order to show asymptotic efficiency in the univariate case, further regularity conditions are needed:
(A5) $\Theta \subset \mathbb{R}$ is an open interval.
(A6) For $x \in A$ the density $f(x;\theta)$ is three times continuously differentiable with respect to $\theta$.
(A7) The integral $\int f(x;\theta)\,\mu(dx)$ can be differentiated three times with respect to $\theta$ under the integral sign.
(A8) For the Fisher information, $0 < I(\theta) < \infty$ holds.
(A9) For every $\theta_0 \in \Theta$ there exist a $\delta > 0$ and a function $M(x)$ (both may depend on $\theta_0$) such that
$$\left|\frac{\partial^3}{\partial\theta^3}\log f(x;\theta)\right| \le M(x) \quad \text{for all } x \in A,\ \theta_0 - \delta < \theta < \theta_0 + \delta, \qquad \text{with } E_{\theta_0}[M(X)] < \infty.$$

Theorem 3.4.6: Let the conditions (A1), (A2), (A5)–(A9) hold. Then for each consistent sequence $\hat\theta_n$ of solutions of the likelihood equations,
$$\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{L} \mathcal{N}(0,\, I(\theta_0)^{-1}).$$

Proof: For every fixed $x \in A$, a Taylor series expansion of $\dot l(\hat\theta_n)$ around $\theta_0$ yields
$$0 = \frac{\partial}{\partial\theta}\log f(x;\hat\theta_n) = \frac{\partial}{\partial\theta}\log f(x;\theta_0) + (\hat\theta_n - \theta_0)\, \frac{\partial^2}{\partial\theta^2}\log f(x;\theta_0) + \frac{1}{2}(\hat\theta_n - \theta_0)^2\, \frac{\partial^3}{\partial\theta^3}\log f(x;\theta_n^*),$$
where $\theta_n^*$ lies between $\theta_0$ and $\hat\theta_n$. With obvious abbreviations this is equal to
$$0 = \dot l(\hat\theta_n) = \dot l(\theta_0) + (\hat\theta_n - \theta_0)\, \ddot l(\theta_0) + \frac{1}{2}(\hat\theta_n - \theta_0)^2\, l^{(3)}(\theta_n^*)$$
or
$$(\hat\theta_n - \theta_0)\left[-\ddot l(\theta_0) - \frac{1}{2}(\hat\theta_n - \theta_0)\, l^{(3)}(\theta_n^*)\right] = \dot l(\theta_0),$$
and for the expression $[\dots] \ne 0$ we obtain
$$\sqrt{n}\,(\hat\theta_n - \theta_0) = \frac{n^{-1/2}\, \dot l(\theta_0)}{-n^{-1}\, \ddot l(\theta_0) - \frac{1}{2n}(\hat\theta_n - \theta_0)\, l^{(3)}(\theta_n^*)}. \qquad (3.4)$$
In Theorem 3.4.5 we have already shown that $(\hat\theta_n - \theta_0)$ converges to zero in probability for $n \to \infty$. We will now show that
(1) $n^{-1/2}\, \dot l(\theta_0)$ converges weakly to a $\mathcal{N}(0, I(\theta_0))$,
(2) $-n^{-1}\, \ddot l(\theta_0)$ converges to $I(\theta_0) > 0$ a.s., resp. in probability,
(3) $\frac{1}{n}\, l^{(3)}(\theta_n^*)$ is stochastically bounded.

(1): $n^{-1/2}\, \dot l(\theta_0) = \sqrt{n}\cdot \frac{1}{n}\sum_{i=1}^n \frac{\partial}{\partial\theta}\log f(X_i;\theta_0) =: \sqrt{n}\, B_n$, where according to the SLLN $B_n$ converges a.s. to
$$B_0 = E_{\theta_0}\!\left[\frac{\partial}{\partial\theta}\log f(X;\theta_0)\right] = 0.$$
According to the CLT, $\sqrt{n}\,[B_n - 0]$ converges in distribution to a normal distribution with expected value zero and variance
$$E\!\left[\left(\frac{\partial}{\partial\theta}\log f(X;\theta_0)\right)^2\right] = I(\theta_0),$$
where $I(\theta_0) > 0$ according to (A8).
(2): Since with $l = \log f(x;\theta)$ we have $\dot l = \dfrac{\dot f}{f}$ and $\ddot l = \dfrac{\ddot f}{f} - \dfrac{(\dot f)^2}{f^2}$, it holds that
$$-n^{-1}\, \ddot l(\theta_0) = \frac{1}{n}\sum_{i=1}^n \left[\frac{\dot f^2(X_i;\theta_0)}{f^2(X_i;\theta_0)} - \frac{\ddot f(X_i;\theta_0)}{f(X_i;\theta_0)}\right].$$
According to the SLLN this term converges (a.s., and hence also in probability) to $I(\theta_0)$, since
$$E_{\theta_0}\!\left[\frac{\dot f^2}{f^2} - \frac{\ddot f}{f}\right] = \int \frac{\dot f^2}{f}\,d\mu - \underbrace{\int \ddot f\,d\mu}_{=0} = \int \frac{\dot f^2}{f}\,d\mu = E_{\theta_0}\!\left[-\frac{\partial^2}{\partial\theta^2}\log f(X;\theta_0)\right] = I(\theta_0).$$
(3): Finally, $\frac{1}{n}\, l^{(3)}(\theta_n^*) = \frac{1}{n}\sum_{i=1}^n \frac{\partial^3}{\partial\theta^3}\log f(X_i;\theta_n^*)$, and with (A9) we get
$$\left|\frac{1}{n}\, l^{(3)}(\theta_n^*)\right| \le \frac{1}{n}\,[M(X_1) + \dots + M(X_n)].$$
The right-hand side converges to $E_{\theta_0}[M(X)] < \infty$ according to (A9). Since $(\hat\theta_n - \theta_0)$ converges to zero in probability according to Theorem 3.4.5, the second term in the denominator of (3.4) converges to zero as well.
Putting (1) to (3) together, we have shown that $\sqrt{n}\,(\hat\theta_n - \theta_0)$ converges weakly to a $\mathcal{N}(0,\, I(\theta_0)^{-1})$. ∎

Remarks:
(1) A sequence of estimators which fulfils the conditions of Theorem 3.4.6 is called an efficient likelihood estimator.
(2) (A6), (A7) entail for all $\theta \in \Theta_0$:
(i) $E\!\left[\dfrac{\partial}{\partial\theta}\log f(X;\theta)\right] = 0$, and
(ii) $E\!\left[-\dfrac{\partial^2}{\partial\theta^2}\log f(X;\theta)\right] = E\!\left[\left(\dfrac{\partial}{\partial\theta}\log f(X;\theta)\right)^2\right] = I(\theta)$.

Corollary 3.4.7: Let the conditions of Theorem 3.4.6 hold. If the likelihood equations have a unique solution for all $x$ and $n$, resp. if the probability of multiple roots goes to zero for $n \to \infty$, then the MLE is asymptotically efficient.

Some final remarks:
(1) In general, the likelihood equations (3.3) cannot be solved explicitly. In this case the roots can be found only by numerical procedures (with the attendant problems of existence, uniqueness, and convergence of solutions for the algorithms used).
(2) MLEs need strong prerequisites (conditions). Under certain conditions, consistency and asymptotic normality still hold even if the distributional assumptions do not exactly coincide with reality. But in this case asymptotic efficiency is lost: already small deviations between reality and model assumptions can lead to a considerable loss of efficiency.
(3) Consistency and asymptotic normality may hold even if some regularity conditions of the above theorems are violated.

For the multivariate case $\Theta \subset \mathbb{R}^k$, a result like Theorem 3.4.6 can be obtained in a similar way if the conditions (A5), ... are reformulated accordingly.
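The asymptotic normality of Theorem 3.4.6 can be illustrated by simulation (an illustrative sketch, not from the text; the distribution, seed, and sample sizes are made up). For iid $\mathrm{Exp}(\lambda)$ data the likelihood equation $n/\lambda - \sum_i x_i = 0$ has the unique root $\hat\lambda = 1/\bar X$, and $I(\lambda) = 1/\lambda^2$, so $\sqrt{n}\,(\hat\lambda_n - \lambda_0)$ should be approximately $\mathcal{N}(0, \lambda_0^2)$:

```python
import random
import statistics

random.seed(1)
lam0, n, reps = 2.0, 400, 2000

z = []
for _ in range(reps):
    xs = [random.expovariate(lam0) for _ in range(n)]
    mle = 1.0 / statistics.fmean(xs)      # root of the likelihood equation n/lam - sum(x) = 0
    z.append(n ** 0.5 * (mle - lam0))

# Theorem 3.4.6: sqrt(n)(mle - lam0) ~ N(0, 1/I(lam0)) with I(lam) = 1/lam^2,
# so the limit variance here is lam0^2 = 4.
print(round(statistics.fmean(z), 2))      # close to 0, up to simulation noise
print(round(statistics.pvariance(z), 2))  # close to 4, up to simulation noise
```

With $\lambda_0 = 2$ the limit variance is $\lambda_0^2 = 4$; the printed empirical mean and variance should be close to $0$ and $4$ up to simulation noise (plus a small finite-$n$ bias of $\hat\lambda$, which vanishes at rate $1/n$).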
Chapter 1: A Brief Review of Maximum Likelihood, GMM, and Numerical Tools Joan Llull Microeconometrics IDEA PhD Program Maximum Likelihood Chapter 1. A Brief Review of Maximum Likelihood, GMM, and Numerical
More informationMathematical statistics
October 4 th, 2018 Lecture 12: Information Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter
More informationSubmitted to the Brazilian Journal of Probability and Statistics
Submitted to the Brazilian Journal of Probability and Statistics Multivariate normal approximation of the maximum likelihood estimator via the delta method Andreas Anastasiou a and Robert E. Gaunt b a
More informationECE 275A Homework 6 Solutions
ECE 275A Homework 6 Solutions. The notation used in the solutions for the concentration (hyper) ellipsoid problems is defined in the lecture supplement on concentration ellipsoids. Note that θ T Σ θ =
More informationThe properties of L p -GMM estimators
The properties of L p -GMM estimators Robert de Jong and Chirok Han Michigan State University February 2000 Abstract This paper considers Generalized Method of Moment-type estimators for which a criterion
More informationLecture 3 September 1
STAT 383C: Statistical Modeling I Fall 2016 Lecture 3 September 1 Lecturer: Purnamrita Sarkar Scribe: Giorgio Paulon, Carlos Zanini Disclaimer: These scribe notes have been slightly proofread and may have
More informationSTAT215: Solutions for Homework 2
STAT25: Solutions for Homework 2 Due: Wednesday, Feb 4. (0 pt) Suppose we take one observation, X, from the discrete distribution, x 2 0 2 Pr(X x θ) ( θ)/4 θ/2 /2 (3 θ)/2 θ/4, 0 θ Find an unbiased estimator
More informationGraduate Econometrics I: Unbiased Estimation
Graduate Econometrics I: Unbiased Estimation Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Unbiased Estimation
More information1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).
Estimation February 3, 206 Debdeep Pati General problem Model: {P θ : θ Θ}. Observe X P θ, θ Θ unknown. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Examples: θ = (µ,
More informationSection 8.2. Asymptotic normality
30 Section 8.2. Asymptotic normality We assume that X n =(X 1,...,X n ), where the X i s are i.i.d. with common density p(x; θ 0 ) P= {p(x; θ) :θ Θ}. We assume that θ 0 is identified in the sense that
More informationf(y θ) = g(t (y) θ)h(y)
EXAM3, FINAL REVIEW (and a review for some of the QUAL problems): No notes will be allowed, but you may bring a calculator. Memorize the pmf or pdf f, E(Y ) and V(Y ) for the following RVs: 1) beta(δ,
More informationTesting Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata
Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function
More informationStatistics. Lecture 2 August 7, 2000 Frank Porter Caltech. The Fundamentals; Point Estimation. Maximum Likelihood, Least Squares and All That
Statistics Lecture 2 August 7, 2000 Frank Porter Caltech The plan for these lectures: The Fundamentals; Point Estimation Maximum Likelihood, Least Squares and All That What is a Confidence Interval? Interval
More informationStatistical Inference
Statistical Inference Robert L. Wolpert Institute of Statistics and Decision Sciences Duke University, Durham, NC, USA. Asymptotic Inference in Exponential Families Let X j be a sequence of independent,
More informationHypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes
Neyman-Pearson paradigm. Suppose that a researcher is interested in whether the new drug works. The process of determining whether the outcome of the experiment points to yes or no is called hypothesis
More informationChapter 7. Hypothesis Testing
Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function
More informationST5215: Advanced Statistical Theory
Department of Statistics & Applied Probability Wednesday, October 5, 2011 Lecture 13: Basic elements and notions in decision theory Basic elements X : a sample from a population P P Decision: an action
More informationA Few Notes on Fisher Information (WIP)
A Few Notes on Fisher Information (WIP) David Meyer dmm@{-4-5.net,uoregon.edu} Last update: April 30, 208 Definitions There are so many interesting things about Fisher Information and its theoretical properties
More informationStat 411 Lecture Notes 03 Likelihood and Maximum Likelihood Estimation
Stat 411 Lecture Notes 03 Likelihood and Maximum Likelihood Estimation Ryan Martin www.math.uic.edu/~rgmartin Version: August 19, 2013 1 Introduction Previously we have discussed various properties of
More informationStatistics - Lecture One. Outline. Charlotte Wickham 1. Basic ideas about estimation
Statistics - Lecture One Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Outline 1. Basic ideas about estimation 2. Method of Moments 3. Maximum Likelihood 4. Confidence
More informationPrinciples of Statistics
Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 81 Paper 4, Section II 28K Let g : R R be an unknown function, twice continuously differentiable with g (x) M for
More informationZ-estimators (generalized method of moments)
Z-estimators (generalized method of moments) Consider the estimation of an unknown parameter θ in a set, based on data x = (x,...,x n ) R n. Each function h(x, ) on defines a Z-estimator θ n = θ n (x,...,x
More informationGeneralized Linear Models. Kurt Hornik
Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general
More informationStat 5102 Lecture Slides Deck 3. Charles J. Geyer School of Statistics University of Minnesota
Stat 5102 Lecture Slides Deck 3 Charles J. Geyer School of Statistics University of Minnesota 1 Likelihood Inference We have learned one very general method of estimation: method of moments. the Now we
More informationParameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn!
Parameter estimation! and! forecasting! Cristiano Porciani! AIfA, Uni-Bonn! Questions?! C. Porciani! Estimation & forecasting! 2! Cosmological parameters! A branch of modern cosmological research focuses
More informationLecture 8: Information Theory and Statistics
Lecture 8: Information Theory and Statistics Part II: Hypothesis Testing and I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 23, 2015 1 / 50 I-Hsiang
More informationProbability on a Riemannian Manifold
Probability on a Riemannian Manifold Jennifer Pajda-De La O December 2, 2015 1 Introduction We discuss how we can construct probability theory on a Riemannian manifold. We make comparisons to this and
More informationFinal Examination Statistics 200C. T. Ferguson June 11, 2009
Final Examination Statistics 00C T. Ferguson June, 009. (a) Define: X n converges in probability to X. (b) Define: X m converges in quadratic mean to X. (c) Show that if X n converges in quadratic mean
More informationST5215: Advanced Statistical Theory
Department of Statistics & Applied Probability Wednesday, October 19, 2011 Lecture 17: UMVUE and the first method of derivation Estimable parameters Let ϑ be a parameter in the family P. If there exists
More information1. (Regular) Exponential Family
1. (Regular) Exponential Family The density function of a regular exponential family is: [ ] Example. Poisson(θ) [ ] Example. Normal. (both unknown). ) [ ] [ ] [ ] [ ] 2. Theorem (Exponential family &
More informationCentral Limit Theorem ( 5.3)
Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately
More informationEconometrics I, Estimation
Econometrics I, Estimation Department of Economics Stanford University September, 2008 Part I Parameter, Estimator, Estimate A parametric is a feature of the population. An estimator is a function of the
More informationSpring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =
Spring 2012 Math 541A Exam 1 1. (a) Let Z i be independent N(0, 1), i = 1, 2,, n. Are Z = 1 n n Z i and S 2 Z = 1 n 1 n (Z i Z) 2 independent? Prove your claim. (b) Let X 1, X 2,, X n be independent identically
More informationReview and continuation from last week Properties of MLEs
Review and continuation from last week Properties of MLEs As we have mentioned, MLEs have a nice intuitive property, and as we have seen, they have a certain equivariance property. We will see later that
More information1 Likelihood. 1.1 Likelihood function. Likelihood & Maximum Likelihood Estimators
Likelihood & Maximum Likelihood Estimators February 26, 2018 Debdeep Pati 1 Likelihood Likelihood is surely one of the most important concepts in statistical theory. We have seen the role it plays in sufficiency,
More informationSection 8: Asymptotic Properties of the MLE
2 Section 8: Asymptotic Properties of the MLE In this part of the course, we will consider the asymptotic properties of the maximum likelihood estimator. In particular, we will study issues of consistency,
More informationParametric Inference
Parametric Inference Moulinath Banerjee University of Michigan April 14, 2004 1 General Discussion The object of statistical inference is to glean information about an underlying population based on a
More informationSTA 732: Inference. Notes 10. Parameter Estimation from a Decision Theoretic Angle. Other resources
STA 732: Inference Notes 10. Parameter Estimation from a Decision Theoretic Angle Other resources 1 Statistical rules, loss and risk We saw that a major focus of classical statistics is comparing various
More informationAn exponential family of distributions is a parametric statistical model having densities with respect to some positive measure λ of the form.
Stat 8112 Lecture Notes Asymptotics of Exponential Families Charles J. Geyer January 23, 2013 1 Exponential Families An exponential family of distributions is a parametric statistical model having densities
More informationGraduate Econometrics I: Maximum Likelihood I
Graduate Econometrics I: Maximum Likelihood I Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood
More informationExpectation Maximization (EM) Algorithm. Each has it s own probability of seeing H on any one flip. Let. p 1 = P ( H on Coin 1 )
Expectation Maximization (EM Algorithm Motivating Example: Have two coins: Coin 1 and Coin 2 Each has it s own probability of seeing H on any one flip. Let p 1 = P ( H on Coin 1 p 2 = P ( H on Coin 2 Select
More informationi=1 h n (ˆθ n ) = 0. (2)
Stat 8112 Lecture Notes Unbiased Estimating Equations Charles J. Geyer April 29, 2012 1 Introduction In this handout we generalize the notion of maximum likelihood estimation to solution of unbiased estimating
More informationAnswers to the 8th problem set. f(x θ = θ 0 ) L(θ 0 )
Answers to the 8th problem set The likelihood ratio with which we worked in this problem set is: Λ(x) = f(x θ = θ 1 ) L(θ 1 ) =. f(x θ = θ 0 ) L(θ 0 ) With a lower-case x, this defines a function. With
More information5601 Notes: The Sandwich Estimator
560 Notes: The Sandwich Estimator Charles J. Geyer December 6, 2003 Contents Maximum Likelihood Estimation 2. Likelihood for One Observation................... 2.2 Likelihood for Many IID Observations...............
More informationsimple if it completely specifies the density of x
3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely
More informationTheory of Statistics.
Theory of Statistics. Homework V February 5, 00. MT 8.7.c When σ is known, ˆµ = X is an unbiased estimator for µ. If you can show that its variance attains the Cramer-Rao lower bound, then no other unbiased
More informationElements of statistics (MATH0487-1)
Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium November 12, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -
More informationSOLUTION FOR HOMEWORK 7, STAT p(x σ) = (1/[2πσ 2 ] 1/2 )e (x µ)2 /2σ 2.
SOLUTION FOR HOMEWORK 7, STAT 6332 1. We have (for a general case) Denote p (x) p(x σ)/ σ. Then p(x σ) (1/[2πσ 2 ] 1/2 )e (x µ)2 /2σ 2. p (x σ) p(x σ) 1 (x µ)2 +. σ σ 3 Then E{ p (x σ) p(x σ) } σ 2 2σ
More informationF & B Approaches to a simple model
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys
More informationInformation in a Two-Stage Adaptive Optimal Design
Information in a Two-Stage Adaptive Optimal Design Department of Statistics, University of Missouri Designed Experiments: Recent Advances in Methods and Applications DEMA 2011 Isaac Newton Institute for
More informationSTATS 200: Introduction to Statistical Inference. Lecture 29: Course review
STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout
More informationGraduate Econometrics I: Maximum Likelihood II
Graduate Econometrics I: Maximum Likelihood II Yves Dominicy Université libre de Bruxelles Solvay Brussels School of Economics and Management ECARES Yves Dominicy Graduate Econometrics I: Maximum Likelihood
More information