Section 10: Role of influence functions in characterizing large sample efficiency

1. Recall that large sample efficiency (of the MLE) can refer only to a class of regular estimators.
2. To show this, we will first give a characterization of the class of regular estimators through their influence functions.
3. To prove this characterization, we will use the concept of contiguity.

Section 10.1 Asymptotically linear estimators

Let $X^n = (X_1, X_2, \ldots, X_n)$, where the $X_i$'s are i.i.d. $p(x; \theta_0) \in \mathcal{P} = \{p(x; \theta) : \theta \in \Theta\}$. Suppose that we are interested in estimating $\gamma(\theta)$, where $\gamma(\cdot) : \Theta \to R^k$ and $\Theta \subset R^q$. Most estimators that we will consider are asymptotically linear. That is, there exists a random vector $\varphi(X)$ such that
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi(X_i) + o_P(1)$$
with $E_{\theta_0}[\varphi(X)] = 0$ and $E_{\theta_0}[\varphi(X)\varphi(X)^\top]$ finite and non-singular.
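As a quick sanity check of this definition, here is a minimal simulation sketch (the exponential-rate model and all variable names are my own illustration, not part of the notes): the MLE $\hat{\gamma} = 1/\bar{X}$ of an exponential rate $\gamma_0$ is asymptotically linear with influence function $\varphi(x) = \gamma_0 - \gamma_0^2 x$, so the remainder term should be $o_P(1)$.

```python
# A minimal sketch (model chosen purely for illustration): check that
# sqrt(n)(gamma_hat - gamma_0) - n^{-1/2} sum_i phi(X_i) shrinks to zero
# for the exponential-rate MLE, whose influence function is
# phi(x) = gamma_0 - gamma_0^2 * x.
import numpy as np

rng = np.random.default_rng(0)
gamma0 = 2.0  # true rate; X_i ~ Exponential(rate = gamma0)

for n in [100, 1_000, 10_000]:
    remainders = np.empty(2_000)
    for r in range(remainders.size):
        x = rng.exponential(scale=1.0 / gamma0, size=n)
        gamma_hat = 1.0 / x.mean()            # MLE of the rate
        phi = gamma0 - gamma0**2 * x          # influence function values
        remainders[r] = np.sqrt(n) * (gamma_hat - gamma0) - phi.sum() / np.sqrt(n)
    print(n, np.mean(np.abs(remainders)))     # decreases toward 0 with n
```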

The function $\varphi(X)$ is referred to as an influence function. The phrase influence function was used by Hampel (JASA, 1974) and is motivated by the fact that, to first order, $\varphi(x)$ is the influence of a single observation on the estimator $\hat{\gamma}(X^n)$. Consider $\hat{\gamma}(x_1, \ldots, x_n, x)$ as a function of $x$. Since
$$\sqrt{n+1}(\hat{\gamma}(x_1, \ldots, x_n, x) - \gamma_0) \approx \frac{1}{\sqrt{n+1}} \left( \sum_{i=1}^n \varphi(x_i) + \varphi(x) \right),$$
we see that
$$\hat{\gamma}(x_1, \ldots, x_n, x) \approx \gamma_0 + \frac{1}{n+1} \sum_{i=1}^n \varphi(x_i) + \frac{1}{n+1} \varphi(x),$$
so the contribution of the observation $x$ to the estimator is $\varphi(x)/(n+1)$.

The asymptotic properties of an asymptotically linear estimator $\hat{\gamma}(X^n)$ can be summarized by considering only its influence function. Since $\varphi(X)$ has mean zero, the CLT tells us that
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi(X_i) \xrightarrow{D} N(0, E_{\theta_0}[\varphi(X)\varphi(X)^\top]).$$
By Slutsky's theorem, we know that
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) \xrightarrow{D} N(0, E_{\theta_0}[\varphi(X)\varphi(X)^\top]).$$
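Continuing the exponential-rate sketch above (still my own illustration), this conclusion can be checked by comparing the Monte Carlo variance of $\sqrt{n}(\hat{\gamma} - \gamma_0)$ with $E_{\theta_0}[\varphi(X)^2] = \gamma_0^4 \, \mathrm{Var}(X) = \gamma_0^2$:

```python
# Sketch: the Monte Carlo variance of sqrt(n)(gamma_hat - gamma_0) should be
# close to E[phi(X)^2] = gamma_0^2 (= 4 here).
import numpy as np

rng = np.random.default_rng(1)
gamma0, n, reps = 2.0, 2_000, 2_000
xbar = rng.exponential(1.0 / gamma0, (reps, n)).mean(axis=1)
draws = np.sqrt(n) * (1.0 / xbar - gamma0)
print(draws.var(), gamma0**2)   # both close to 4
```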

Influence Function for GMM Estimators

From the previous section, we know that
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) = -\{\hat{D}(\hat{\gamma}(X^n); X^n)^\top \hat{W}_n \hat{D}(\hat{\gamma}(X^n); X^n)\}^{-1} \hat{D}(\hat{\gamma}(X^n); X^n)^\top \hat{W}_n \frac{1}{\sqrt{n}} \sum_{i=1}^n g(X_i; \gamma_0) + o_P(1)$$
$$= -\{D^\top W D\}^{-1} D^\top W \frac{1}{\sqrt{n}} \sum_{i=1}^n g(X_i; \gamma_0) + o_P(1).$$
Hence, the influence function is
$$\varphi(x) = -\{D^\top W D\}^{-1} D^\top W g(x; \gamma_0).$$
When the dimension of $g$ is the same as that of $\gamma$ and $D$ and $W$ are of full rank, the influence function reduces to
$$\varphi(x) = -D^{-1} g(x; \gamma_0).$$
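The following sketch (an over-identified toy example of my own, with weight matrix $W = I$) checks this linearization numerically: the rate $\gamma$ of an exponential is estimated from the two moment conditions $g(x; \gamma) = (x - 1/\gamma, \ x^2 - 2/\gamma^2)^\top$, for which $D = (1/\gamma^2, \ 4/\gamma^3)^\top$.

```python
# A hedged sketch (model and moments are my own illustration): over-identified
# GMM for an exponential rate gamma with W = I; check that
# sqrt(n)(gamma_hat - gamma_0) matches the linearization with
# phi(x) = -(D'WD)^{-1} D'W g(x; gamma_0).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
gamma0, n = 2.0, 20_000
x = rng.exponential(1.0 / gamma0, n)

def gbar(gamma):
    return np.array([x.mean() - 1.0 / gamma, (x**2).mean() - 2.0 / gamma**2])

res = minimize_scalar(lambda g: gbar(g) @ gbar(g), bounds=(0.1, 10.0), method="bounded")
gamma_hat = res.x

D = np.array([1.0 / gamma0**2, 4.0 / gamma0**3])             # E[dg/dgamma]
g0 = np.vstack([x - 1.0 / gamma0, x**2 - 2.0 / gamma0**2])   # 2 x n moment values
phi = -(D @ g0) / (D @ D)                                    # influence values, W = I
print(np.sqrt(n) * (gamma_hat - gamma0), phi.sum() / np.sqrt(n))  # approximately equal
```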

Example: GEE

The GEE is the solution to
$$\sum_{i=1}^n A(X_i; \gamma)(Y_i - \mu(X_i; \gamma)) = 0.$$
As a special case of a GMM estimator, we showed that
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) = \frac{1}{\sqrt{n}} \sum_{i=1}^n E_{\theta_0}[A(X; \gamma_0) M(X; \gamma_0)]^{-1} A(X_i; \gamma_0)(Y_i - \mu(X_i; \gamma_0)) + o_P(1).$$
Therefore, the influence function is
$$\varphi(y, x) = E_{\theta_0}[A(X; \gamma_0) M(X; \gamma_0)]^{-1} A(x; \gamma_0)(y - \mu(x; \gamma_0)).$$
The influence function associated with the optimal GEE is
$$\varphi(y, x) = E_{\theta_0}[M(X; \gamma_0)^\top \mathrm{Var}_{\theta_0}[Y \mid X]^{-1} M(X; \gamma_0)]^{-1} M(x; \gamma_0)^\top \mathrm{Var}_{\theta_0}[Y \mid x]^{-1} (y - \mu(x; \gamma_0)).$$
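As a check (a simple linear-model GEE of my own construction, not from the notes): with $\mu(x; \gamma) = \gamma x$ and $A(x) = x$, the GEE is least squares, $M(x; \gamma) = x$, and the influence function above reduces to $\varphi(y, x) = E[X^2]^{-1} x (y - \gamma_0 x)$.

```python
# Sketch: verify the GEE linearization for mu(x; gamma) = gamma * x, A(x) = x,
# where phi(y, x) = E[X^2]^{-1} x (y - gamma_0 x) and E[X^2] = 1 here.
import numpy as np

rng = np.random.default_rng(3)
gamma0, n, reps = 1.5, 2_000, 2_000
z = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    y = gamma0 * x + rng.normal(size=n) * (1.0 + 0.5 * np.abs(x))  # heteroskedastic errors
    gamma_hat = (x * y).sum() / (x * x).sum()   # GEE / least squares solution
    phi = x * (y - gamma0 * x)                  # influence values (E[X^2] = 1)
    z[r] = np.sqrt(n) * (gamma_hat - gamma0) - phi.sum() / np.sqrt(n)
print(np.mean(np.abs(z)))   # small: the linearization holds
```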


Lemma: An asymptotically linear estimator must have a unique (a.s.) influence function.

Proof: Suppose the influence function is not unique. That is, there exists another influence function $\varphi^*(x)$ such that $E_{\theta_0}[\varphi^*(X)] = 0$ and
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi^*(X_i) + o_P(1).$$
Since
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi(X_i) + o_P(1),$$
we know that
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n \{\varphi(X_i) - \varphi^*(X_i)\} = o_P(1).$$

By the CLT, we know that
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n \{\varphi(X_i) - \varphi^*(X_i)\} \xrightarrow{D} N(0, \mathrm{Var}_{\theta_0}[\varphi(X) - \varphi^*(X)]).$$
In order for this limiting distribution to be $o_P(1)$, we must have $\mathrm{Var}_{\theta_0}[\varphi(X) - \varphi^*(X)] = 0$, or $\varphi(X) - \varphi^*(X) = 0$ a.e.

In parametric models (i.e., $\Theta$ is finite-dimensional), if an estimator is regular and asymptotically linear (RAL), then we can establish some useful results regarding the geometry of the influence function, as well as precisely define efficiency. We assume that $\gamma(\theta)$ is continuously differentiable, at least in a neighborhood of the truth $\theta_0$. Let $\gamma_0 = \gamma(\theta_0)$.

Recall the definition of a regular estimator (Section 8).

Definition: Consider a local data generating process (LDGP) where, for each $n$, the data are distributed according to $\theta_n$, with $\sqrt{n}(\theta_n - \theta_0) \to \tau$. An estimator $\hat{\gamma}(X^n)$ is said to be regular if, for each $\theta_0$, $\sqrt{n}(\hat{\gamma}(X^n) - \gamma(\theta_n))$ has a limiting distribution that does not depend on the LDGP. So, if
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) \xrightarrow{D(\theta_0)} N(0, \Sigma_0),$$
then
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma(\theta_n)) \xrightarrow{D(\theta_n)} N(0, \Sigma_0).$$
We say that convergence is uniform in a shrinking neighborhood of the true parameter value. Regularity rules out superefficient estimators (see the example due to Hodges).
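To see what regularity rules out, here is a simulation sketch of Hodges' estimator (a standard example; the simulation details are my own): under the LDGP $\theta_n = \tau/\sqrt{n}$, the limiting behavior of $\sqrt{n}(\hat{\theta}_H - \theta_n)$ depends on $\tau$, so the estimator is not regular.

```python
# Sketch of Hodges' superefficient estimator: for X_i ~ N(theta, 1), take
# theta_hat = Xbar if |Xbar| > n^{-1/4}, else 0.  Under theta_n = tau/sqrt(n),
# the distribution of sqrt(n)(theta_hat - theta_n) drifts with tau.
import numpy as np

rng = np.random.default_rng(4)
n, reps = 10_000, 20_000
for tau in [0.0, 1.0, 3.0]:
    theta_n = tau / np.sqrt(n)
    xbar = theta_n + rng.normal(size=reps) / np.sqrt(n)   # distribution of Xbar
    hodges = np.where(np.abs(xbar) > n**-0.25, xbar, 0.0)
    z = np.sqrt(n) * (hodges - theta_n)
    print(tau, z.mean(), z.var())   # limit depends on tau: not regular
```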

Section 10.2 Contiguity

Most density functions in parametric models satisfying the kind of regularity conditions we have imposed have the property that the sequence of distributions $P_n(X^n; \theta_n)$ is contiguous to the sequence of distributions $P_n(X^n; \theta_0)$, where $\sqrt{n}(\theta_n - \theta_0) \to \tau$. More formally, consider the following definition:

Definition: Consider a sequence of probability measures $\{P_n, Q_n\}$ on the sample spaces $\mathcal{X}_n$ with respect to common dominating measures $\mu_n$, defined on the $\sigma$-algebras $\mathcal{F}_n$. Let $\{p_n, q_n\}$ be the associated sequence of densities. If for any sequence of events $\{A_n\}$ with $A_n \in \mathcal{F}_n$,
$$P_n(A_n) \to 0 \implies Q_n(A_n) \to 0,$$
then we say that the densities $\{q_n\}$ are contiguous to the densities $\{p_n\}$.

This is especially useful since contiguity implies that any sequence of random variables converging to zero in $P_n$-probability also converges to zero in $Q_n$-probability. To see this, suppose that $Z_n \xrightarrow{P_n} 0$. This means that for all $\epsilon > 0$, $P_n[|Z_n| > \epsilon] \to 0$ as $n \to \infty$. Contiguity implies that $Q_n[|Z_n| > \epsilon] \to 0$ as $n \to \infty$. Thus $Z_n \xrightarrow{Q_n} 0$.

A useful sufficient condition for identifying when a sequence of probability densities is contiguous is given by LeCam's first lemma.

LeCam's First Lemma

First, we give some preliminary notation and remind you of the results of the Neyman-Pearson lemma (see Section 8.3 of Casella and Berger) for finding most powerful tests. For a fixed sample size $n$, consider $p_n(x^n)$ to be the fixed null hypothesis and $q_n(x^n)$ to be the fixed alternative. The Neyman-Pearson lemma states the following:

For any level $0 \le \alpha_n \le 1$, the most powerful (randomized) test is given by the test function $\phi_n$ defined as
$$\phi_n = \begin{cases} 0 & q_n < k_n p_n \quad \text{(do not reject the null)} \\ \xi_n & q_n = k_n p_n \quad \text{(reject the null with probability } \xi_n) \\ 1 & q_n > k_n p_n \quad \text{(reject the null)} \end{cases}$$
Such a $\phi_n$ can always be defined so that the level is equal to $\alpha_n$, i.e.,
$$\alpha_n = \int \phi_n \, dP_n.$$
In particular, for any sequence of events $\{A_n\} \in \mathcal{F}_n$, a corresponding sequence of test functions can be constructed so that
$$\alpha_n = P_n(A_n) = \int \phi_n \, dP_n.$$
Clearly, another sequence of level-$\alpha_n$ test functions is given by $\phi_n^*(x^n) = I(x^n \in A_n)$, since $\alpha_n = P_n(A_n) = \int \phi_n^* \, dP_n$.

However, the N-P lemma tells us that the most powerful test is given by the likelihood ratio test $\phi_n(x^n)$. Therefore,
$$\text{Power of } \phi_n = \int \phi_n \, dQ_n \ge \int \phi_n^* \, dQ_n = \text{Power of } \phi_n^* = Q_n[A_n].$$
This implies that to show contiguity of $\{q_n\}$ to $\{p_n\}$, it suffices to show that for the test functions $\phi_n$ defined above,
$$\int \phi_n \, dP_n = P_n[A_n] \to 0 \implies \int \phi_n \, dQ_n \to 0.$$

Next we introduce a sequence of likelihood ratio random variables:
$$L_n(x^n) = \frac{q_n(x^n)}{p_n(x^n)}.$$
More specifically, we define
$$L_n(x^n) = \begin{cases} q_n(x^n)/p_n(x^n) & p_n(x^n) > 0 \\ 1 & p_n(x^n) = q_n(x^n) = 0 \\ \infty & p_n(x^n) = 0, \ q_n(x^n) > 0 \end{cases}$$
Let $F_n$ be the distribution of $L_n$ under $P_n$, i.e., $F_n(u) = P_n[L_n(X^n) \le u]$.

Lemma 10.1 (LeCam's First Lemma): Assume that $F_n$ converges to a distribution function $F$ such that $\int_0^\infty u \, dF(u) = 1$. Then the densities $\{q_n\}$ are contiguous to the densities $\{p_n\}$.

Proof: Take a sequence of test functions $\phi_n$ such that $\int \phi_n \, dP_n \to 0$. For any $y > 0$, note that
$$\begin{aligned}
\int \phi_n \, dQ_n &= \int_{L_n \le y} \phi_n \, dQ_n + \int_{L_n > y} \phi_n \, dQ_n \\
&\le \int_{L_n \le y} \phi_n \frac{q_n}{p_n} p_n \, d\mu_n + \int_{L_n > y} dQ_n \\
&= \int_{L_n \le y} \phi_n L_n \, dP_n + \int_{L_n > y} dQ_n \\
&\le y \int \phi_n \, dP_n + 1 - \int_{L_n \le y} dQ_n \\
&\le y \int \phi_n \, dP_n + 1 - \int_{L_n \le y} L_n \, dP_n \\
&= y \int \phi_n \, dP_n + 1 - \int I(L_n \le y) L_n \, dP_n \\
&= y \int \phi_n \, dP_n + 1 - E_{P_n}[I(L_n \le y) L_n] \\
&= y \int \phi_n \, dP_n + 1 - \int_0^y u \, dF_n(u),
\end{aligned}$$
where the last inequality uses $\int_{L_n \le y} dQ_n \ge \int_{L_n \le y} L_n \, dP_n$.

Now, choose $y$ (a continuity point of $F$) such that $\int_y^\infty u \, dF(u) < \epsilon/4$. Since $F_n \to F$, we know that $\int_0^y u \, dF_n(u) \to \int_0^y u \, dF(u)$. Therefore, since $\int_0^\infty u \, dF(u) = 1$, there is some $N_0$ such that for all $n > N_0$, $1 - \int_0^y u \, dF_n(u) < \epsilon/2$ (use the triangle inequality). Furthermore, since $\int \phi_n \, dP_n \to 0$, we know that there exists $N_1$ so that for all $n > N_1$, $y \int \phi_n \, dP_n < \epsilon/2$. Therefore, $\int \phi_n \, dQ_n < \epsilon$ for $n > \max(N_0, N_1)$. Thus, we have shown that $\int \phi_n \, dQ_n \to 0$, and hence the contiguity of the sequence $\{q_n\}$ to the sequence $\{p_n\}$.

Corollary 10.2: If $\log L_n \xrightarrow{D(P_n)} N(-\sigma^2/2, \sigma^2)$, then the densities $\{q_n\}$ are contiguous to the densities $\{p_n\}$.

Proof: We see that $L_n$ converges in distribution to a log-normal random variable. Verify that the expectation of a log-normal random variable whose logarithm has mean $-\sigma^2/2$ and variance $\sigma^2$ is equal to 1: if $Z \sim N(-\sigma^2/2, \sigma^2)$, then $E[e^Z] = \exp(-\sigma^2/2 + \sigma^2/2) = 1$, so Lemma 10.1 applies.

We can now apply Corollary 10.2 to our i.i.d. situation. We know that $p_n(x^n) = \prod_{i=1}^n p(x_i; \theta_0)$ and $q_n(x^n) = \prod_{i=1}^n p(x_i; \theta_n)$, where $\sqrt{n}(\theta_n - \theta_0) \to \tau$. Now,
$$\log L_n(X^n) = \log \left\{ \frac{q_n(X^n)}{p_n(X^n)} \right\} = \sum_{i=1}^n \{\log p(X_i; \theta_n) - \log p(X_i; \theta_0)\}.$$
By a mean value expansion, we see that
$$\sum_{i=1}^n \log p(X_i; \theta_n) = \sum_{i=1}^n \log p(X_i; \theta_0) + (\theta_n - \theta_0)^\top \sum_{i=1}^n \frac{\partial \log p(X_i; \theta_0)}{\partial \theta} + \frac{1}{2} (\theta_n - \theta_0)^\top \sum_{i=1}^n \frac{\partial^2 \log p(X_i; \theta_n^*)}{\partial \theta \, \partial \theta^\top} (\theta_n - \theta_0).$$

This implies that
$$\log L_n(X^n) = \left\{ \frac{1}{\sqrt{n}} \sum_{i=1}^n \psi(X_i; \theta_0) \right\}^\top \sqrt{n}(\theta_n - \theta_0) + \frac{1}{2} \sqrt{n}(\theta_n - \theta_0)^\top \left\{ \frac{1}{n} \sum_{i=1}^n \frac{\partial^2 \log p(X_i; \theta_n^*)}{\partial \theta \, \partial \theta^\top} \right\} \sqrt{n}(\theta_n - \theta_0)$$
$$\xrightarrow{D} N\left( -\tfrac{1}{2} \tau^\top I(\theta_0) \tau, \ \tau^\top I(\theta_0) \tau \right).$$
This satisfies the corollary with $\sigma^2 = \tau^\top I(\theta_0) \tau$, thus proving that the densities $\{\prod_{i=1}^n p(x_i; \theta_n)\}$ are contiguous to $\{\prod_{i=1}^n p(x_i; \theta_0)\}$.
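A quick simulation sketch (assuming the $N(\theta, 1)$ model; my own check, not from the notes): with $\theta_n = \theta_0 + \tau/\sqrt{n}$, the log likelihood ratio should be approximately $N(-\tau^2 I(\theta_0)/2, \ \tau^2 I(\theta_0))$ under $\theta_0$, where $I(\theta_0) = 1$.

```python
# Sketch: log L_n for N(theta, 1) under the LDGP theta_n = theta_0 + tau/sqrt(n)
# should have mean near -tau^2/2 and variance near tau^2 under theta_0.
import numpy as np

rng = np.random.default_rng(5)
theta0, tau, n, reps = 0.0, 1.5, 1_000, 5_000
theta_n = theta0 + tau / np.sqrt(n)
x = theta0 + rng.normal(size=(reps, n))     # data generated under theta_0
logLn = (-0.5 * (x - theta_n) ** 2 + 0.5 * (x - theta0) ** 2).sum(axis=1)
print(logLn.mean(), -tau**2 / 2)   # both near -1.125
print(logLn.var(), tau**2)         # both near 2.25
```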

Section 10.3 Characterization of regular asymptotically linear estimators

Theorem 10.3: Suppose that $\hat{\gamma}(X^n)$ is an asymptotically linear estimator with influence function $\varphi(X)$, and $E_\theta[\varphi(X)\varphi(X)^\top]$ exists and is continuous in a neighborhood of $\theta_0$. Then $\hat{\gamma}(X^n)$ is regular if and only if
$$\frac{\partial \gamma(\theta_0)}{\partial \theta^\top} = E_{\theta_0}[\varphi(X) \psi(X; \theta_0)^\top].$$

Proof: By the definition of an influence function, we know that
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi(X_i) + o_{P(\theta_0)}(1).$$

That is,
$$P_{\theta_0}\left[ \left\| \sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) - \frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi(X_i) \right\| > \epsilon \right] \to 0.$$
By contiguity, we also know that
$$P_{\theta_n}\left[ \left\| \sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) - \frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi(X_i) \right\| > \epsilon \right] \to 0,$$
or $\left[ \sqrt{n}(\hat{\gamma}(X^n) - \gamma_0) - \frac{1}{\sqrt{n}} \sum_{i=1}^n \varphi(X_i) \right] \xrightarrow{P(\theta_n)} 0$. Adding and subtracting similar terms, we find that
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_n) - \frac{1}{\sqrt{n}} \sum_{i=1}^n (\varphi(X_i) - E_{\theta_n}[\varphi(X)]) + \sqrt{n}(\gamma_n - \gamma_0) - \sqrt{n}\, E_{\theta_n}[\varphi(X)] \xrightarrow{P(\theta_n)} 0,$$

or
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_n) = \underbrace{\frac{1}{\sqrt{n}} \sum_{i=1}^n (\varphi(X_i) - E_{\theta_n}[\varphi(X)])}_{(1)} \ - \ \underbrace{\sqrt{n}(\gamma_n - \gamma_0)}_{(2)} \ + \ \underbrace{\sqrt{n}\, E_{\theta_n}[\varphi(X)]}_{(3)} \ + \ o_{P(\theta_n)}(1).$$

We can invoke the CLT for triangular arrays to show that (1) converges in distribution to a $N(0, E_{\theta_0}[\varphi(X)\varphi(X)^\top])$ random vector. For (2), we start by using the mean value theorem to show that
$$\gamma_n = \gamma(\theta_n) = \gamma_0 + \frac{\partial \gamma(\theta_n^*)}{\partial \theta^\top} (\theta_n - \theta_0).$$
This implies that $\sqrt{n}(\gamma_n - \gamma_0) = \frac{\partial \gamma(\theta_n^*)}{\partial \theta^\top} \sqrt{n}(\theta_n - \theta_0) \to \frac{\partial \gamma(\theta_0)}{\partial \theta^\top} \tau$. For (3), we know that
$$\begin{aligned}
\sqrt{n}\, E_{\theta_n}[\varphi(X)] &= \sqrt{n} \int \varphi(x)\, p(x; \theta_n) \, d\mu(x) \\
&= \sqrt{n} \int \varphi(x) \left\{ p(x; \theta_0) + \frac{\partial p(x; \theta_n^*)}{\partial \theta^\top} (\theta_n - \theta_0) \right\} d\mu(x) \\
&= \sqrt{n} \left\{ \int \varphi(x)\, p(x; \theta_0) \, d\mu(x) + \int \varphi(x) \frac{\partial p(x; \theta_n^*)/\partial \theta^\top}{p(x; \theta_0)}\, p(x; \theta_0)\, (\theta_n - \theta_0) \, d\mu(x) \right\} \\
&= \int \varphi(x) \frac{\partial p(x; \theta_n^*)/\partial \theta^\top}{p(x; \theta_0)}\, p(x; \theta_0)\, \sqrt{n}(\theta_n - \theta_0) \, d\mu(x),
\end{aligned}$$
since $\int \varphi(x)\, p(x; \theta_0)\, d\mu(x) = 0$.

Thus,
$$\sqrt{n}\, E_{\theta_n}[\varphi(X)] \to \int \varphi(x) \psi(x; \theta_0)^\top \tau \, p(x; \theta_0) \, d\mu(x) = E_{\theta_0}[\varphi(X) \psi(X; \theta_0)^\top]\, \tau.$$
Putting all of these results together, we find that
$$\sqrt{n}(\hat{\gamma}(X^n) - \gamma_n) \xrightarrow{D(\theta_n)} N\left( E_{\theta_0}[\varphi(X)\psi(X;\theta_0)^\top]\tau - \frac{\partial \gamma(\theta_0)}{\partial \theta^\top}\tau, \ E_{\theta_0}[\varphi(X)\varphi(X)^\top] \right).$$
In order for $\hat{\gamma}(X^n)$ to be regular, the limit must not depend on $\tau$, so we must have
$$E_{\theta_0}[\varphi(X)\psi(X;\theta_0)^\top] - \frac{\partial \gamma(\theta_0)}{\partial \theta^\top} = 0,$$
which gives the desired result. The other direction is obviously true.

Section 10.4 Geometry and Efficiency of influence functions among regular asymptotically linear estimators

Suppose that we parameterize our problem so that $\theta = (\gamma^\top, \lambda^\top)^\top$. In this case,
$$\frac{\partial \gamma(\theta_0)}{\partial \theta^\top} = [\, I \ \ 0 \,] = \left[\, E_{\theta_0}[\varphi(X)\psi_\gamma(X;\theta_0)^\top] \ \ E_{\theta_0}[\varphi(X)\psi_\lambda(X;\theta_0)^\top] \,\right].$$
Consider the Hilbert space $\mathcal{H}$ of $k$-dimensional functions of $X$ with mean zero and finite second moments, equipped with inner product $\langle h_1, h_2 \rangle = E_{\theta_0}[h_1(X)^\top h_2(X)]$. The nuisance tangent space is the linear subspace
$$\Lambda = \{B \psi_\lambda(X; \theta_0) : B \text{ an arbitrary } k \times (q - k) \text{ matrix of real numbers}\}.$$
The theorem tells us that the influence function $\varphi(X)$ of a RAL estimator $\hat{\gamma}(X^n)$ must be orthogonal to every element in $\Lambda$. That is, $\varphi(X) \in \Lambda^\perp = \{h(X) \in \mathcal{H} : E_{\theta_0}[h(X)^\top B \psi_\lambda(X; \theta_0)] = 0 \text{ for all } B\}$. In addition, $E_{\theta_0}[\varphi(X)\psi_\gamma(X;\theta_0)^\top] = I$.
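A numeric illustration (a normal-model example of my own, not from the notes): for $X \sim N(\gamma, \lambda)$, the sample mean is RAL for $\gamma$ with $\varphi(x) = x - \gamma$; the two conditions above then read $E_{\theta_0}[\varphi(X)\psi_\gamma(X;\theta_0)] = 1$ and $E_{\theta_0}[\varphi(X)\psi_\lambda(X;\theta_0)] = 0$.

```python
# Sketch: check the orthogonality conditions for the sample mean in the
# N(gamma, lambda) model, where psi_gamma = (x - gamma)/lambda and
# psi_lambda = ((x - gamma)^2 - lambda) / (2 lambda^2).
import numpy as np

rng = np.random.default_rng(6)
gamma0, lam0 = 1.0, 4.0
x = rng.normal(gamma0, np.sqrt(lam0), 1_000_000)

phi = x - gamma0                                          # influence function
psi_gamma = (x - gamma0) / lam0                           # score for the mean
psi_lambda = ((x - gamma0) ** 2 - lam0) / (2 * lam0**2)   # score for the variance
print(np.mean(phi * psi_gamma))   # approximately 1
print(np.mean(phi * psi_lambda))  # approximately 0
```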

Can we show how to take elements from $\Lambda^\perp$ and construct an estimating equation whose solution is regular and asymptotically linear? As we will see, the random vectors in $\Lambda^\perp$ depend on $\theta_0$. Consider any particular element in $\Lambda^\perp$, say $h(X; \theta_0)$. Now, we know that
$$E_{\gamma_0, \lambda_0}[h(X; \gamma_0, \lambda_0)] = 0$$
whatever be $\gamma_0 \in \Gamma$ and $\lambda_0$ in the parameter space for $\lambda$. For given $\gamma$, let $\hat{\lambda}_n(\gamma)$ be a profile estimator of $\lambda$. Assume that $\hat{\lambda}_n(\gamma)$ is a continuously differentiable function of $\gamma$ in a neighborhood of $\gamma_0$ and that $\partial \hat{\lambda}_n(\gamma_0)/\partial \gamma^\top$ converges in probability to $d(\theta_0)$. Also assume that, for some $\epsilon > 0$, $n^{1/4+\epsilon}(\hat{\lambda}_n(\gamma_0) - \lambda_0)$ is bounded in probability.

For example, suppose that for fixed $\gamma$, $\hat{\lambda}_n(\gamma)$ solves $\sum_{i=1}^n m(X_i, \gamma, \lambda) = 0$, where $m(x; \gamma, \lambda)$ is an estimating function of the same dimension as $\lambda$ and $E_{\theta_0}[m(X; \gamma_0, \lambda_0)] = 0$. It can be shown that $\sqrt{n}(\hat{\lambda}_n(\gamma_0) - \lambda_0)$ is bounded in probability. By the implicit function theorem, we know that
$$0 = \frac{\partial}{\partial \gamma^\top} \left\{ \sum_{i=1}^n m(X_i, \gamma_0, \hat{\lambda}_n(\gamma_0)) \right\} = \sum_{i=1}^n \frac{\partial m(X_i, \gamma_0, \hat{\lambda}_n(\gamma_0))}{\partial \gamma^\top} + \sum_{i=1}^n \frac{\partial m(X_i, \gamma_0, \hat{\lambda}_n(\gamma_0))}{\partial \lambda^\top} \frac{\partial \hat{\lambda}_n(\gamma_0)}{\partial \gamma^\top}.$$
This implies that
$$\frac{\partial \hat{\lambda}_n(\gamma_0)}{\partial \gamma^\top} = -\left[ \sum_{i=1}^n \frac{\partial m(X_i, \gamma_0, \hat{\lambda}_n(\gamma_0))}{\partial \lambda^\top} \right]^{-1} \sum_{i=1}^n \frac{\partial m(X_i, \gamma_0, \hat{\lambda}_n(\gamma_0))}{\partial \gamma^\top} \xrightarrow{P} -E_{\theta_0}\left[ \frac{\partial m(X; \gamma_0, \lambda_0)}{\partial \lambda^\top} \right]^{-1} E_{\theta_0}\left[ \frac{\partial m(X; \gamma_0, \lambda_0)}{\partial \gamma^\top} \right] \equiv d(\theta_0).$$

Now, estimate $\gamma_0$ as the solution, $\hat{\gamma}_n$, to
$$\sum_{i=1}^n h(X_i; \gamma, \hat{\lambda}_n(\gamma)) = 0.$$
Under regularity conditions, it can be shown that $\hat{\gamma}_n$ is a consistent estimator of $\gamma_0$. What is the influence function for $\hat{\gamma}_n$? By mean value expansions, we know that
$$0 = \frac{1}{\sqrt{n}} \sum_{i=1}^n h(X_i; \hat{\gamma}_n, \hat{\lambda}_n(\hat{\gamma}_n)) = \frac{1}{\sqrt{n}} \sum_{i=1}^n h(X_i; \gamma_0, \hat{\lambda}_n(\gamma_0)) + \left[ \frac{1}{n} \sum_{i=1}^n \frac{\partial h(X_i; \gamma_n^*, \hat{\lambda}_n(\gamma_n^*))}{\partial \gamma^\top} + \frac{1}{n} \sum_{i=1}^n \frac{\partial h(X_i; \gamma_n^*, \hat{\lambda}_n(\gamma_n^*))}{\partial \lambda^\top} \frac{\partial \hat{\lambda}_n(\gamma_n^*)}{\partial \gamma^\top} \right] \sqrt{n}(\hat{\gamma}_n - \gamma_0),

so that
$$\sqrt{n}(\hat{\gamma}_n - \gamma_0) = -\left[ \frac{1}{n} \sum_{i=1}^n \frac{\partial h(X_i; \gamma_n^*, \hat{\lambda}_n(\gamma_n^*))}{\partial \gamma^\top} + \frac{1}{n} \sum_{i=1}^n \frac{\partial h(X_i; \gamma_n^*, \hat{\lambda}_n(\gamma_n^*))}{\partial \lambda^\top} \frac{\partial \hat{\lambda}_n(\gamma_n^*)}{\partial \gamma^\top} \right]^{-1} \frac{1}{\sqrt{n}} \sum_{i=1}^n h(X_i; \gamma_0, \hat{\lambda}_n(\gamma_0)). \quad (4)$$
Now, for each component of $h(X; \gamma_0, \hat{\lambda}_n(\gamma_0))$, we know that
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n h_j(X_i; \gamma_0, \hat{\lambda}_n(\gamma_0)) = \frac{1}{\sqrt{n}} \sum_{i=1}^n h_j(X_i; \gamma_0, \lambda_0) + \left\{ \frac{1}{n^{3/4+\epsilon}} \sum_{i=1}^n \frac{\partial h_j(X_i; \gamma_0, \lambda_0)}{\partial \lambda^\top} \right\} n^{1/4+\epsilon} (\hat{\lambda}_n(\gamma_0) - \lambda_0)$$
$$+ \frac{1}{2}\, n^{1/4}(\hat{\lambda}_n(\gamma_0) - \lambda_0)^\top \left\{ \frac{1}{n} \sum_{i=1}^n \frac{\partial^2 h_j(X_i; \gamma_0, \lambda_n^*)}{\partial \lambda \, \partial \lambda^\top} \right\} n^{1/4}(\hat{\lambda}_n(\gamma_0) - \lambda_0).$$
Both remainder terms are $o_P(1)$: the first because $\frac{1}{n^{3/4+\epsilon}} \sum_i \partial h_j/\partial \lambda^\top = O_P(n^{-1/4-\epsilon})$ (its summands have mean zero since $h \in \Lambda^\perp$), and the second because $n^{1/4}(\hat{\lambda}_n(\gamma_0) - \lambda_0) = o_P(1)$. Thus,
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n h_j(X_i; \gamma_0, \hat{\lambda}_n(\gamma_0)) = \frac{1}{\sqrt{n}} \sum_{i=1}^n h_j(X_i; \gamma_0, \lambda_0) + o_P(1). \quad (5)$$

Plugging (5) into (4), we get
$$\sqrt{n}(\hat{\gamma}_n - \gamma_0) = -\left[ \frac{1}{n} \sum_{i=1}^n \frac{\partial h(X_i; \gamma_n^*, \hat{\lambda}_n(\gamma_n^*))}{\partial \gamma^\top} + \frac{1}{n} \sum_{i=1}^n \frac{\partial h(X_i; \gamma_n^*, \hat{\lambda}_n(\gamma_n^*))}{\partial \lambda^\top} \frac{\partial \hat{\lambda}_n(\gamma_n^*)}{\partial \gamma^\top} \right]^{-1} \left\{ \frac{1}{\sqrt{n}} \sum_{i=1}^n h(X_i; \gamma_0, \lambda_0) + o_P(1) \right\}. \quad (6)$$
Since we can show that
$$\frac{1}{n} \sum_{i=1}^n \frac{\partial h(X_i; \gamma_n^*, \hat{\lambda}_n(\gamma_n^*))}{\partial \gamma^\top} \xrightarrow{P} E_{\theta_0}\left[ \frac{\partial h(X; \gamma_0, \lambda_0)}{\partial \gamma^\top} \right] = -E_{\theta_0}[h(X; \theta_0) \psi_\gamma(X; \theta_0)^\top],$$
$$\frac{1}{n} \sum_{i=1}^n \frac{\partial h(X_i; \gamma_n^*, \hat{\lambda}_n(\gamma_n^*))}{\partial \lambda^\top} \xrightarrow{P} E_{\theta_0}\left[ \frac{\partial h(X; \gamma_0, \lambda_0)}{\partial \lambda^\top} \right] = -E_{\theta_0}[h(X; \theta_0) \psi_\lambda(X; \theta_0)^\top] = 0,$$
$$\frac{\partial \hat{\lambda}_n(\gamma_n^*)}{\partial \gamma^\top} \xrightarrow{P} d(\theta_0),$$

and
$$\frac{1}{\sqrt{n}} \sum_{i=1}^n h(X_i; \gamma_0, \lambda_0) + o_P(1) \xrightarrow{D} N(0, E_{\theta_0}[h(X; \gamma_0, \lambda_0) h(X; \gamma_0, \lambda_0)^\top]),$$
we can conclude that
$$\sqrt{n}(\hat{\gamma}_n - \gamma_0) = -\frac{1}{\sqrt{n}} \sum_{i=1}^n E_{\theta_0}\left[ \frac{\partial h(X; \gamma_0, \lambda_0)}{\partial \gamma^\top} \right]^{-1} h(X_i; \gamma_0, \lambda_0) + o_P(1).$$

Thus,
$$\sqrt{n}(\hat{\gamma}_n - \gamma_0) \xrightarrow{D} N\left( 0, \ E_{\theta_0}\left[ \frac{\partial h(X; \gamma_0, \lambda_0)}{\partial \gamma^\top} \right]^{-1} E_{\theta_0}[h(X; \gamma_0, \lambda_0) h(X; \gamma_0, \lambda_0)^\top]\, E_{\theta_0}\left[ \frac{\partial h(X; \gamma_0, \lambda_0)}{\partial \gamma^\top} \right]^{-\top} \right).$$
So, the influence function for $\hat{\gamma}_n$ is $-E_{\theta_0}[\partial h(X; \gamma_0, \lambda_0)/\partial \gamma^\top]^{-1} h(x; \gamma_0, \lambda_0)$, which is the influence function that we would have derived if we had pretended that $\lambda_0$ were known.

Algorithm

1. Select an element from the orthogonal complement of the nuisance tangent space.
2. Find a (profile) estimator for the nuisance parameter.
3. The influence function for the resulting estimator will be the same as if the nuisance parameter had been known (see the simulation sketch below).
4. This procedure carries over to the case where the nuisance parameter is infinite-dimensional!
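Here is a simulation sketch of the algorithm (the two-group common-mean model is my own illustration, not from the notes): take $h(y, x; \gamma, \lambda) = (y - \gamma)/\lambda_x$ with group variances $\lambda = (v_0, v_1)$ as the nuisance; plugging in profiled variance estimates should give the same asymptotic variance as using the true variances.

```python
# Sketch: estimate a common mean gamma with group-specific nuisance variances.
# The profiled and oracle estimators should have the same asymptotic variance,
# near 1/E[1/v_X] = 1.8 here.
import numpy as np

rng = np.random.default_rng(7)
gamma0, v = 0.0, np.array([1.0, 9.0])
n, reps = 2_000, 3_000
est_profiled, est_oracle = np.empty(reps), np.empty(reps)

for r in range(reps):
    grp = rng.integers(0, 2, n)
    y = gamma0 + rng.normal(size=n) * np.sqrt(v[grp])
    # oracle: weights from the true variances
    w = 1.0 / v[grp]
    est_oracle[r] = (w * y).sum() / w.sum()
    # profiled: alternate between gamma and the nuisance variances
    g = y.mean()
    for _ in range(10):
        lam = np.array([np.mean((y[grp == j] - g) ** 2) for j in (0, 1)])
        wn = 1.0 / lam[grp]
        g = (wn * y).sum() / wn.sum()
    est_profiled[r] = g

print(n * est_oracle.var(), n * est_profiled.var())  # both near 1.8
```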

Efficiency

A geometric interpretation can be useful in helping us define the most efficient estimator among the class of regular, asymptotically linear (RAL) estimators. Since the asymptotic variance of a RAL estimator is the variance of its influence function, we can now look for the most efficient estimator by finding the influence function with the smallest variance. If we have two influence functions, say $\varphi_1(X)$ and $\varphi_2(X)$, we say that $\mathrm{Var}_{\theta_0}[\varphi_1(X)] \le \mathrm{Var}_{\theta_0}[\varphi_2(X)]$ if and only if
$$\mathrm{Var}_{\theta_0}[a^\top \varphi_1(X)] \le \mathrm{Var}_{\theta_0}[a^\top \varphi_2(X)] \quad \text{for all } a \in R^k.$$

Or equivalently,
$$a^\top E_{\theta_0}[\varphi_1(X)\varphi_1(X)^\top] a \le a^\top E_{\theta_0}[\varphi_2(X)\varphi_2(X)^\top] a,$$
which means that $E_{\theta_0}[\varphi_2(X)\varphi_2(X)^\top] - E_{\theta_0}[\varphi_1(X)\varphi_1(X)^\top]$ is positive semi-definite.

Efficient Score and Efficient Influence Function

Definition: The efficient score for $\gamma$ is defined as the residual from the projection of $\psi_\gamma(X; \theta_0)$ onto $\Lambda$. That is,
$$\psi_\gamma^{\mathrm{eff}}(X; \theta_0) = \psi_\gamma(X; \theta_0) - \Pi[\psi_\gamma(X; \theta_0) \mid \Lambda],$$
where
$$\Pi[\psi_\gamma(X; \theta_0) \mid \Lambda] = E_{\theta_0}[\psi_\gamma(X;\theta_0) \psi_\lambda(X;\theta_0)^\top]\, E_{\theta_0}[\psi_\lambda(X;\theta_0) \psi_\lambda(X;\theta_0)^\top]^{-1} \psi_\lambda(X; \theta_0).$$
By construction, $\psi_\gamma^{\mathrm{eff}}(X; \theta_0) \in \Lambda^\perp$. By normalizing it so that its inner product with $\psi_\gamma(X; \theta_0)$ is the identity, we then define the efficient influence function
$$\varphi^{\mathrm{eff}}(X) = E_{\theta_0}[\psi_\gamma^{\mathrm{eff}}(X;\theta_0) \psi_\gamma(X;\theta_0)^\top]^{-1} \psi_\gamma^{\mathrm{eff}}(X; \theta_0) = E_{\theta_0}[\psi_\gamma^{\mathrm{eff}}(X;\theta_0) \psi_\gamma^{\mathrm{eff}}(X;\theta_0)^\top]^{-1} \psi_\gamma^{\mathrm{eff}}(X; \theta_0).$$
We are now in a position to prove that $\varphi^{\mathrm{eff}}(X)$ is the most efficient influence function.
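A Monte Carlo sketch (a Gamma-model example of my own, using scipy's digamma/polygamma): compute the efficient score for the shape parameter with the rate as nuisance, and check that the variance of $\varphi^{\mathrm{eff}}$ matches $[I_{\gamma\gamma} - I_{\gamma\lambda} I_{\lambda\lambda}^{-1} I_{\lambda\gamma}]^{-1}$.

```python
# Sketch: efficient score for the Gamma shape g0 (rate l0 as nuisance).
# psi_g = log(l0) - digamma(g0) + log(x), psi_l = g0/l0 - x, and the
# information blocks are I_gg = trigamma(g0), I_gl = -1/l0, I_ll = g0/l0^2.
import numpy as np
from scipy.special import digamma, polygamma

rng = np.random.default_rng(8)
g0, l0 = 3.0, 2.0
x = rng.gamma(g0, 1.0 / l0, 2_000_000)

psi_g = np.log(l0) - digamma(g0) + np.log(x)   # score for the shape
psi_l = g0 / l0 - x                            # score for the rate

b = np.mean(psi_g * psi_l) / np.mean(psi_l**2)
psi_eff = psi_g - b * psi_l                    # efficient score (residual)
phi_eff = psi_eff / np.mean(psi_eff * psi_g)   # efficient influence function

I_gg, I_gl, I_ll = polygamma(1, g0), -1.0 / l0, g0 / l0**2
print(np.mean(phi_eff**2), 1.0 / (I_gg - I_gl**2 / I_ll))   # approximately equal
```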

Claim: All influence functions can be written as $\varphi^{\mathrm{eff}}(X) + l(X)$, where $l(X) \in \Lambda^\perp$ and $l(X) \perp \{B \varphi^{\mathrm{eff}}(X) : \text{for all } B\}$.

Proof: If $\varphi(X)$ is an influence function, then $\varphi(X) \in \Lambda^\perp$. In addition, $\varphi^{\mathrm{eff}}(X) \in \Lambda^\perp$. Therefore, $l(X) = \varphi(X) - \varphi^{\mathrm{eff}}(X) \in \Lambda^\perp$. Since $l(X) \in \Lambda^\perp$, we know that
$$E_{\theta_0}[l(X) \psi_\gamma^{\mathrm{eff}}(X;\theta_0)^\top] = E_{\theta_0}[l(X) \psi_\gamma(X;\theta_0)^\top] = E_{\theta_0}[(\varphi(X) - \varphi^{\mathrm{eff}}(X)) \psi_\gamma(X;\theta_0)^\top] = 0,$$
where the last equality holds because $\varphi$ and $\varphi^{\mathrm{eff}}$ are both influence functions, so each has inner product with $\psi_\gamma$ equal to the identity. Since the space spanned by $\psi_\gamma^{\mathrm{eff}}(X;\theta_0)$ is equal to $\{B \varphi^{\mathrm{eff}}(X) : \text{for all } B\}$, we have that $l(X) \perp \{B \varphi^{\mathrm{eff}}(X) : \text{for all } B\}$. In other words,
$$l(X) \in \Lambda^\perp \cap \{B \varphi^{\mathrm{eff}}(X) : \text{for all } B\}^\perp = \left[ \Lambda \oplus \{B \varphi^{\mathrm{eff}}(X) : \text{for all } B\} \right]^\perp.$$

From this, it is now easy to show that $\varphi^{\mathrm{eff}}(X)$ is the influence function with the smallest variance. Note that
$$E_{\theta_0}[\varphi(X)\varphi(X)^\top] = E_{\theta_0}[(\varphi^{\mathrm{eff}}(X) + l(X))(\varphi^{\mathrm{eff}}(X) + l(X))^\top] = E_{\theta_0}[\varphi^{\mathrm{eff}}(X)\varphi^{\mathrm{eff}}(X)^\top] + E_{\theta_0}[l(X) l(X)^\top].$$
Hence, $E_{\theta_0}[\varphi(X)\varphi(X)^\top] - E_{\theta_0}[\varphi^{\mathrm{eff}}(X)\varphi^{\mathrm{eff}}(X)^\top]$ is positive semi-definite. This implies that $\varphi^{\mathrm{eff}}(X)$ is the most efficient influence function. The variance of the efficient influence function is $E_{\theta_0}[\psi_\gamma^{\mathrm{eff}}(X;\theta_0) \psi_\gamma^{\mathrm{eff}}(X;\theta_0)^\top]^{-1}$, the inverse of the variance of the efficient score. We then get the well-known result that the minimum variance for the most efficient regular estimator is
$$\left[ I_{\gamma\gamma}(\theta_0) - I_{\gamma\lambda}(\theta_0)\, I_{\lambda\lambda}(\theta_0)^{-1} I_{\gamma\lambda}(\theta_0)^\top \right]^{-1}.$$
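This efficiency bound is just the $(\gamma, \gamma)$ block of $I(\theta_0)^{-1}$, by the partitioned-matrix inverse; a quick check with generic numbers (not from the notes):

```python
# Sketch: the (gamma, gamma) block of I(theta_0)^{-1} equals
# [I_gg - I_gl I_ll^{-1} I_lg]^{-1}, the efficiency bound above.
import numpy as np

rng = np.random.default_rng(9)
k, q = 2, 5
A = rng.normal(size=(q, q))
I = A @ A.T + q * np.eye(q)             # a positive-definite "information" matrix
I_gg, I_gl, I_lg, I_ll = I[:k, :k], I[:k, k:], I[k:, :k], I[k:, k:]

bound = np.linalg.inv(I_gg - I_gl @ np.linalg.inv(I_ll) @ I_lg)
print(np.allclose(np.linalg.inv(I)[:k, :k], bound))   # True
```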
