Lecture 5

Let us give one more example of MLE.

Example 3. The uniform distribution $U[0, \theta]$ on the interval $[0, \theta]$ has p.d.f.
$$
f(x\,|\,\theta) =
\begin{cases}
\frac{1}{\theta}, & 0 \le x \le \theta, \\
0, & \text{otherwise.}
\end{cases}
$$
The likelihood function is
$$
\varphi(\theta) = \prod_{i=1}^{n} f(X_i\,|\,\theta)
= \frac{1}{\theta^n}\, I(X_1, \ldots, X_n \in [0, \theta])
= \frac{1}{\theta^n}\, I(\max(X_1, \ldots, X_n) \le \theta).
$$
Here the indicator function $I(A)$ equals $1$ if $A$ happens and $0$ otherwise. What we wrote is that the product of the p.d.f.s $f(X_i\,|\,\theta)$ is equal to $0$ if at least one of the factors is $0$, and this happens if at least one of the $X_i$s falls outside of the interval $[0, \theta]$, which is the same as the maximum among them exceeding $\theta$. In other words,
$$
\varphi(\theta) = 0 \quad \text{if } \theta < \max(X_1, \ldots, X_n),
$$
and
$$
\varphi(\theta) = \frac{1}{\theta^n} \quad \text{if } \theta \ge \max(X_1, \ldots, X_n).
$$
Therefore, looking at figure 5.1 we see that $\hat\theta = \max(X_1, \ldots, X_n)$ is the MLE.

5.1 Consistency of MLE.

Why does the MLE $\hat\theta$ converge to the unknown parameter $\theta_0$? This is not immediately obvious, and in this section we will give a sketch of why this happens.
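Before turning to the general argument, the uniform example invites a quick numerical check. Below is a minimal simulation sketch (Python with only the standard library; the true parameter $\theta_0 = 2$ and the sample sizes are arbitrary choices, not from the notes): the MLE is the sample maximum, it never exceeds $\theta_0$, and its error shrinks as $n$ grows.

```python
import random

random.seed(0)
theta0 = 2.0  # true parameter of U[0, theta0] (arbitrary choice)

def mle_uniform(n):
    """MLE for U[0, theta0] from n observations: the sample maximum.

    The likelihood is theta^{-n} for theta >= max(X_i) and 0 otherwise,
    so it is maximized by the sample maximum.
    """
    sample = [random.uniform(0, theta0) for _ in range(n)]
    return max(sample)

# The MLE always sits just below theta0, and the gap shrinks with n.
errors = {n: theta0 - mle_uniform(n) for n in (10, 1000, 100000)}
print(errors)  # the error for n = 100000 is tiny
```

Since $\hat\theta/\theta_0$ has a Beta$(n,1)$ distribution, $\mathbb{E}(\theta_0 - \hat\theta) = \theta_0/(n+1)$: the error decays at rate $1/n$, faster than the usual $1/\sqrt{n}$, which is a known peculiarity of the uniform family.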

[Figure 5.1: the likelihood $\varphi(\theta)$, maximized over $\theta \ge \max(X_1, \ldots, X_n)$.]

First of all, the MLE $\hat\theta$ is a maximizer of
$$
L_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \log f(X_i\,|\,\theta),
$$
which is just the log-likelihood function normalized by $\frac{1}{n}$ (of course, this does not affect the maximization). $L_n(\theta)$ depends on the data. Let us consider the function $l(X\,|\,\theta) = \log f(X\,|\,\theta)$ and define
$$
L(\theta) = \mathbb{E}_{\theta_0} l(X\,|\,\theta),
$$
where we recall that $\theta_0$ is the true unknown parameter of the sample $X_1, \ldots, X_n$. By the law of large numbers, for any $\theta$,
$$
L_n(\theta) \to \mathbb{E}_{\theta_0} l(X\,|\,\theta) = L(\theta).
$$
Note that $L(\theta)$ does not depend on the sample; it depends only on $\theta$. We will need the following

Lemma. We have, for any $\theta$,
$$
L(\theta) \le L(\theta_0).
$$
Moreover, the inequality is strict, $L(\theta) < L(\theta_0)$, unless
$$
\mathbb{P}_{\theta_0}\bigl(f(X\,|\,\theta) = f(X\,|\,\theta_0)\bigr) = 1,
$$
which means that $\theta = \theta_0$.
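The Lemma can be made concrete for a family where $L(\theta)$ has a closed form. For the exponential density $f(x\,|\,\theta) = \theta e^{-\theta x}$ (an illustration of our choosing, not the uniform example above) we get $l(x\,|\,\theta) = \log\theta - \theta x$ and, since $\mathbb{E}_{\theta_0} X = 1/\theta_0$, $L(\theta) = \log\theta - \theta/\theta_0$. A small sketch (Python; $\theta_0 = 1.5$ is an arbitrary choice) evaluates $L$ on a grid and confirms that it peaks at the true parameter:

```python
import math

# Exponential family f(x|t) = t * exp(-t x): l(x|t) = log(t) - t*x, and
# since E_{theta0} X = 1/theta0,
#     L(t) = E_{theta0} l(X|t) = log(t) - t/theta0.
theta0 = 1.5  # "true" parameter, an arbitrary choice

def L(t):
    return math.log(t) - t / theta0

grid = [0.05 * k for k in range(1, 201)]  # t ranges over (0, 10]
best = max(grid, key=L)

print(best)  # the grid maximizer sits at (or next to) theta0 = 1.5
```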

Proof. Let us consider the difference
$$
L(\theta) - L(\theta_0)
= \mathbb{E}_{\theta_0}\bigl(\log f(X\,|\,\theta) - \log f(X\,|\,\theta_0)\bigr)
= \mathbb{E}_{\theta_0} \log \frac{f(X\,|\,\theta)}{f(X\,|\,\theta_0)}.
$$

[Figure 5.2: the graphs of $t - 1$ and $\log t$; the line $t - 1$ lies above $\log t$ and touches it at $t = 1$.]

Since $t - 1$ is an upper bound on $\log t$ (see figure 5.2) we can write
$$
\mathbb{E}_{\theta_0} \log \frac{f(X\,|\,\theta)}{f(X\,|\,\theta_0)}
\le \mathbb{E}_{\theta_0}\Bigl(\frac{f(X\,|\,\theta)}{f(X\,|\,\theta_0)} - 1\Bigr)
= \int \Bigl(\frac{f(x\,|\,\theta)}{f(x\,|\,\theta_0)} - 1\Bigr) f(x\,|\,\theta_0)\,dx
= \int f(x\,|\,\theta)\,dx - \int f(x\,|\,\theta_0)\,dx = 1 - 1 = 0.
$$
Both integrals are equal to $1$ because we are integrating probability density functions. This proves that $L(\theta) - L(\theta_0) \le 0$. The second statement of the Lemma is also clear.

We will use this Lemma to sketch the consistency of the MLE.

Theorem. Under some regularity conditions on the family of distributions, the MLE $\hat\theta$ is consistent, i.e. $\hat\theta \to \theta_0$ as $n \to \infty$.

The statement of this Theorem is not very precise, but rather than proving a rigorous mathematical statement our goal here is to illustrate the main idea. Mathematically inclined students are welcome to come up with a precise statement.

Proof.

We have the following facts:

1. $\hat\theta$ is the maximizer of $L_n(\theta)$ (by definition).
2. $\theta_0$ is the maximizer of $L(\theta)$ (by the Lemma).
3. For each $\theta$, $L_n(\theta) \to L(\theta)$ by the LLN.

This situation is illustrated in figure 5.3. Therefore, since the two functions $L_n$ and $L$ are getting closer, their points of maximum should also get closer, which exactly means that $\hat\theta \to \theta_0$.

[Figure 5.3: $L_n(\theta)$ approaches $L(\theta)$, so the maximizer $\hat\theta$ (the MLE) approaches the maximizer $\theta_0$.]

5.2 Asymptotic normality of MLE. Fisher information.

We want to show the asymptotic normality of the MLE, i.e. that
$$
\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{d} N(0, \sigma^2_{\mathrm{MLE}})
$$
for some $\sigma^2_{\mathrm{MLE}}$. Let us recall that above we defined the function $l(X\,|\,\theta) = \log f(X\,|\,\theta)$. To simplify the notation we will denote by $l'(X\,|\,\theta)$, $l''(X\,|\,\theta)$, etc. the derivatives of $l(X\,|\,\theta)$ with respect to $\theta$.
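As a concrete running example of this notation (a family of our choosing, not one treated in the notes), take the exponential density $f(x\,|\,\theta) = \theta e^{-\theta x}$ for $x \ge 0$. Then
$$
l(x\,|\,\theta) = \log\theta - \theta x,
\qquad
l'(x\,|\,\theta) = \frac{1}{\theta} - x,
\qquad
l''(x\,|\,\theta) = -\frac{1}{\theta^2},
$$
and, since $\mathbb{E}_{\theta} X = 1/\theta$, the score $l'(X\,|\,\theta)$ has mean zero under $\mathbb{P}_\theta$, which anticipates the first identity in the Lemma below.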

Definition (Fisher information). The Fisher information of a random variable $X$ with distribution $\mathbb{P}_{\theta_0}$ from the family $\{\mathbb{P}_\theta : \theta \in \Theta\}$ is defined by
$$
I(\theta_0) = \mathbb{E}_{\theta_0}\bigl(l'(X\,|\,\theta_0)\bigr)^2
= \mathbb{E}_{\theta_0}\Bigl(\frac{\partial}{\partial\theta} \log f(X\,|\,\theta_0)\Bigr)^2.
$$
The next lemma gives another, often convenient, way to compute the Fisher information.

Lemma. We have
$$
\mathbb{E}_{\theta_0} l'(X\,|\,\theta_0) = 0
\quad \text{and} \quad
-\mathbb{E}_{\theta_0} \frac{\partial^2}{\partial\theta^2} \log f(X\,|\,\theta_0) = I(\theta_0).
$$

Proof. First of all, we have
$$
l'(X\,|\,\theta) = \bigl(\log f(X\,|\,\theta)\bigr)' = \frac{f'(X\,|\,\theta)}{f(X\,|\,\theta)}
$$
and
$$
l''(X\,|\,\theta) = \bigl(\log f(X\,|\,\theta)\bigr)'' = \frac{f''(X\,|\,\theta)}{f(X\,|\,\theta)} - \frac{\bigl(f'(X\,|\,\theta)\bigr)^2}{f^2(X\,|\,\theta)}.
$$
Also, since the p.d.f. integrates to $1$,
$$
\int f(x\,|\,\theta)\,dx = 1,
$$
if we take derivatives of this equation with respect to $\theta$ (and interchange derivative and integral, which can usually be done) we will get
$$
\int \frac{\partial}{\partial\theta} f(x\,|\,\theta)\,dx = 0
\quad \text{and} \quad
\int \frac{\partial^2}{\partial\theta^2} f(x\,|\,\theta)\,dx = \int f''(x\,|\,\theta)\,dx = 0.
$$
The first statement of the Lemma follows immediately:
$$
\mathbb{E}_{\theta_0} l'(X\,|\,\theta_0) = \int \frac{f'(x\,|\,\theta_0)}{f(x\,|\,\theta_0)}\, f(x\,|\,\theta_0)\,dx = \int f'(x\,|\,\theta_0)\,dx = 0.
$$
To finish the proof we write the following computation:
$$
\mathbb{E}_{\theta_0} l''(X\,|\,\theta_0)
= \mathbb{E}_{\theta_0} \frac{\partial^2}{\partial\theta^2} \log f(X\,|\,\theta_0)
= \int \Bigl(\frac{\partial^2}{\partial\theta^2} \log f(x\,|\,\theta_0)\Bigr) f(x\,|\,\theta_0)\,dx
$$
$$
= \int \Bigl(\frac{f''(x\,|\,\theta_0)}{f(x\,|\,\theta_0)} - \Bigl(\frac{f'(x\,|\,\theta_0)}{f(x\,|\,\theta_0)}\Bigr)^2\Bigr) f(x\,|\,\theta_0)\,dx
= \int f''(x\,|\,\theta_0)\,dx - \mathbb{E}_{\theta_0}\bigl(l'(X\,|\,\theta_0)\bigr)^2
= 0 - I(\theta_0) = -I(\theta_0).
$$
We are now ready to prove the main result of this section.
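The two identities of the Lemma can be checked by Monte Carlo for a family whose score is known in closed form. For the exponential density $f(x\,|\,\theta) = \theta e^{-\theta x}$ we have $l'(x\,|\,\theta) = 1/\theta - x$ and $l''(x\,|\,\theta) = -1/\theta^2$, so $I(\theta) = 1/\theta^2$. A rough numerical sketch (Python standard library; $\theta_0 = 2$ and the sample size are arbitrary choices):

```python
import random

random.seed(2)
theta0 = 2.0   # true parameter of f(x|t) = t * exp(-t x) (arbitrary choice)
n = 200000     # Monte Carlo sample size

# For this family l(x|t) = log(t) - t*x, so the score is
# l'(x|t) = 1/t - x, and l''(x|t) = -1/t**2 (non-random here).
sample = [random.expovariate(theta0) for _ in range(n)]

score_mean = sum(1 / theta0 - x for x in sample) / n
score_sq_mean = sum((1 / theta0 - x) ** 2 for x in sample) / n
neg_l2 = 1 / theta0 ** 2   # -E l''(X|theta0), here a constant

print(score_mean)     # first identity:  E l'(X|theta0) = 0
print(score_sq_mean)  # second identity: E (l')^2 = -E l'' = 1/theta0^2 = 0.25
```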

Theorem (Asymptotic normality of the MLE). We have
$$
\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{d} N\Bigl(0, \frac{1}{I(\theta_0)}\Bigr).
$$

Proof. Since the MLE $\hat\theta$ is a maximizer of $L_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \log f(X_i\,|\,\theta)$, we have
$$
L_n'(\hat\theta) = 0.
$$
Let us use the Mean Value Theorem,
$$
\frac{f(a) - f(b)}{a - b} = f'(c)
\quad \text{or} \quad
f(a) = f(b) + f'(c)(a - b) \quad \text{for some } c \in [a, b],
$$
with $f = L_n'$, $a = \hat\theta$ and $b = \theta_0$. Then we can write
$$
0 = L_n'(\hat\theta) = L_n'(\theta_0) + L_n''(\hat\theta_1)(\hat\theta - \theta_0)
$$
for some $\hat\theta_1 \in [\hat\theta, \theta_0]$. From here we get that
$$
\hat\theta - \theta_0 = -\frac{L_n'(\theta_0)}{L_n''(\hat\theta_1)}
\quad \text{and} \quad
\sqrt{n}\,(\hat\theta - \theta_0) = -\frac{\sqrt{n}\, L_n'(\theta_0)}{L_n''(\hat\theta_1)}. \tag{5.1}
$$
Since by the Lemma in the previous section $\theta_0$ is the maximizer of $L(\theta)$, we have
$$
L'(\theta_0) = \mathbb{E}_{\theta_0} l'(X\,|\,\theta_0) = 0. \tag{5.2}
$$
Therefore, the numerator in (5.1) satisfies
$$
\sqrt{n}\, L_n'(\theta_0)
= \sqrt{n}\,\Bigl(\frac{1}{n} \sum_{i=1}^{n} l'(X_i\,|\,\theta_0) - 0\Bigr)
= \sqrt{n}\,\Bigl(\frac{1}{n} \sum_{i=1}^{n} l'(X_i\,|\,\theta_0) - \mathbb{E}_{\theta_0} l'(X_1\,|\,\theta_0)\Bigr)
\xrightarrow{d} N\bigl(0, \operatorname{Var}_{\theta_0}(l'(X_1\,|\,\theta_0))\bigr), \tag{5.3}
$$
i.e. it converges in distribution by the Central Limit Theorem.

Next, let us consider the denominator in (5.1). First of all, we have that for all $\theta$,
$$
L_n''(\theta) = \frac{1}{n} \sum_{i=1}^{n} l''(X_i\,|\,\theta) \to \mathbb{E}_{\theta_0} l''(X_1\,|\,\theta) \quad \text{by LLN.} \tag{5.4}
$$
Also, since $\hat\theta_1 \in [\hat\theta, \theta_0]$ and, by the consistency result of the previous section, $\hat\theta \to \theta_0$, we have $\hat\theta_1 \to \theta_0$. Using this together with (5.4) we get
$$
L_n''(\hat\theta_1) \to \mathbb{E}_{\theta_0} l''(X_1\,|\,\theta_0) = -I(\theta_0) \quad \text{by the Lemma above.}
$$

Combining this with (5.3) we get
$$
-\frac{\sqrt{n}\, L_n'(\theta_0)}{L_n''(\hat\theta_1)}
\xrightarrow{d} N\Bigl(0, \frac{\operatorname{Var}_{\theta_0}(l'(X_1\,|\,\theta_0))}{(I(\theta_0))^2}\Bigr).
$$
Finally, the variance is
$$
\operatorname{Var}_{\theta_0}(l'(X_1\,|\,\theta_0))
= \mathbb{E}_{\theta_0}\bigl(l'(X\,|\,\theta_0)\bigr)^2 - \bigl(\mathbb{E}_{\theta_0} l'(X\,|\,\theta_0)\bigr)^2
= I(\theta_0) - 0 = I(\theta_0),
$$
where in the last equality we used the definition of Fisher information and (5.2). Therefore,
$$
\sqrt{n}\,(\hat\theta - \theta_0) \xrightarrow{d} N\Bigl(0, \frac{I(\theta_0)}{(I(\theta_0))^2}\Bigr) = N\Bigl(0, \frac{1}{I(\theta_0)}\Bigr),
$$
which completes the proof.
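The theorem can be watched in action by simulation. For the exponential family $f(x\,|\,\theta) = \theta e^{-\theta x}$ the MLE from a sample is $\hat\theta = 1/\bar{X}$ and $I(\theta_0) = 1/\theta_0^2$, so $\sqrt{n}\,(\hat\theta - \theta_0)$ should be approximately $N(0, \theta_0^2)$. A rough Monte Carlo sketch (Python standard library; $\theta_0 = 2$, $n = 500$ and $2000$ replications are arbitrary choices):

```python
import math
import random
import statistics

random.seed(3)
theta0 = 2.0       # true parameter (arbitrary); I(theta0) = 1/theta0**2
n, reps = 500, 2000

# For f(x|t) = t*exp(-t x) the MLE from a sample is theta_hat = 1/mean(X).
zs = []
for _ in range(reps):
    xbar = sum(random.expovariate(theta0) for _ in range(n)) / n
    zs.append(math.sqrt(n) * (1 / xbar - theta0))

print(statistics.mean(zs))       # sample mean of the normalized errors (near 0)
print(statistics.pvariance(zs))  # sample variance (near 1/I(theta0) = theta0**2 = 4)
```

The empirical variance of the normalized errors matches $1/I(\theta_0) = \theta_0^2 = 4$ up to Monte Carlo noise, exactly as the theorem predicts.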