Lecture Note 8: Point Estimators and Point Estimation Methods. MIT 14.30, Spring 2006. Herman Bennett


Given a parameter with unknown value, the goal of point estimation is to use a sample to compute a number that represents in some sense a good guess for the true value of the parameter.

19 Definitions

19.1 Parameter

A pmf/pdf can be equivalently written as f_X(x) or f_X(x|θ), where θ represents the constants that fully define the distribution. For example, if X is a normally distributed RV, the constants µ and σ will fully define the distribution. These constants are called parameters and are generally denoted by the Greek letter θ.¹

Example 19.1. Normal distribution: f(x|θ) = f(x|µ, σ), 2 parameters: θ_1 = µ and θ_2 = σ. Binomial distribution: f(x|θ) = f(x|n, p), 2 parameters: θ_1 = n and θ_2 = p. Poisson distribution: f(x|θ) = f(x|λ), 1 parameter: θ = λ. Gamma distribution: f(x|θ) = f(x|α, β), 2 parameters: θ_1 = α and θ_2 = β.

19.2 (Point) Estimator

A (point) estimator of θ, denoted by θ̂, is a statistic (a function of the random sample):

θ̂ = r(X_1, X_2, ..., X_n).   (63)

Caution: These notes are not necessarily self-explanatory notes. They are to be used as a complement to (and not as a substitute for) the lectures.

¹ Another way of saying this: parameters are constants that index a family of distributions.

Note that θ̂ does not depend directly on θ, but only indirectly through the random process generating each X_i. A (point) estimate of θ is a realization of the estimator θ̂ (i.e., a function of the realization of the random sample):

θ̂ = r(x_1, x_2, ..., x_n).   (64)

Example 19.2. Assume we have a random sample X_1, ..., X_10 from a normal distribution N(µ, σ²) and we want to have an estimate of the parameter µ (which is unknown). There is an infinite number of estimators of µ that we could construct. In fact, any function of the random sample could classify as an estimator of µ, for example:

θ̂ = r(X_1, X_2, ..., X_10) = X_10, or (X_10 + X_1)/2, or X̄_10 = (1/10) Σ_{i=1}^{10} X_i, or 1.5X_2, etc.

Example 19.3. Assume a random sample X_1, ..., X_n from a U[0, θ] distribution, where θ is unknown. Propose 3 different estimators.
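As a minimal sketch of Example 19.2: any function of the sample realization is an estimate, however sensible or silly. The data below are made up for illustration.

```python
import random

random.seed(0)
mu, sigma = 5.0, 2.0
x = [random.gauss(mu, sigma) for _ in range(10)]  # realization of X_1, ..., X_10

# A few of the many possible estimators of mu from Example 19.2:
est_last   = x[9]                 # just the last observation, X_10
est_avg2   = (x[9] + x[0]) / 2    # average of first and last, (X_10 + X_1)/2
est_mean   = sum(x) / len(x)      # the sample mean, (1/10) * sum of X_i
est_scaled = 1.5 * x[1]           # 1.5 times the second observation
```

Each of these is a statistic r(X_1, ..., X_10); the sections that follow give criteria for ranking them.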

20 Evaluating (Point) Estimators

Since there are many possible estimators, we need to define some characteristics in order to evaluate and rank them.

20.1 Unbiasedness

An estimator θ̂ is said to be an unbiased estimator of the parameter θ if, for every possible value of θ,

E(θ̂) = θ.   (65)

If θ̂ is not unbiased, it is said to be a biased estimator, where the difference E(θ̂) − θ is called the bias of θ̂.

If the random sample X_1, ..., X_n is iid with E(X_i) = θ, then the sample mean estimator

θ̂ = X̄_n = (1/n) Σ_{i=1}^{n} X_i   (66)

is an unbiased estimator of the population mean: E(X̄_n) = θ (see Example 18.1 in Lecture Note 7).

If the random sample X_1, ..., X_n is iid with E(X_i) = µ and Var(X_i) = θ, then the sample variance estimator

θ̂ = S² = (1/(n−1)) Σ_{i=1}^{n} (X_i − X̄_n)²   (67)

is an unbiased estimator of the population variance: E(S²) = θ (see Example 18.1 in Lecture Note 7).

Example 20.1. Let X_i ~ U[0, θ]. Assume a random sample of size n and define an estimator of θ as

θ̂ = (2/n) Σ_{i=1}^{n} X_i.

Is θ̂ biased?
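A Monte Carlo sketch of Example 20.1 (reading the estimator as twice the sample mean): since E(X_i) = θ/2 for X_i ~ U[0, θ], we expect E(θ̂) = θ. The parameter values below are made up.

```python
import random

random.seed(1)
theta, n, reps = 4.0, 20, 20000

# Simulate many samples and average the resulting estimates of theta.
# If theta_hat = (2/n) * sum(X_i) is unbiased, the average should be near theta.
estimates = []
for _ in range(reps):
    sample = [random.uniform(0, theta) for _ in range(n)]
    estimates.append(2 * sum(sample) / n)

mean_estimate = sum(estimates) / reps   # should be close to theta = 4.0
```

This only illustrates the definition; the analytical answer follows in one line from E(X̄_n) = θ/2.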

20.2 Efficiency

Let θ̂_1 and θ̂_2 be two unbiased estimators of θ. θ̂_1 is said to be more efficient than θ̂_2 if, for a given sample size n,

Var(θ̂_1) < Var(θ̂_2),   (68)

where Var(θ̂_i) is the variance of the estimator.

Let θ̂_1 be an unbiased estimator of θ. θ̂_1 is said to be efficient, or a minimum variance unbiased estimator, if for any unbiased estimator of θ, θ̂_k,

Var(θ̂_1) ≤ Var(θ̂_k).   (69)

Do not confuse the variance of the estimator θ̂, Var(θ̂), with the sample variance estimator S², which is an unbiased estimator of the population variance σ² (!).

Example 20.2. How would you compare the efficiency of the estimators in Example 19.2? Which of these estimators is unbiased?

20.3 Mean Squared Error

Why restrict ourselves to the class of unbiased estimators? The mean squared error (MSE) specifies for each estimator θ̂ a trade-off between bias and efficiency:

MSE(θ̂) = E[(θ̂ − θ)²] = Var(θ̂) + (bias(θ̂))².   (70)

θ̂ is the minimum mean squared error estimator of θ if, among all possible estimators of θ, it has the smallest MSE for a given sample size n.
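A simulation sketch of the efficiency comparison in Example 20.2: the sample mean and the single observation X_1 are both unbiased for µ, but the sample mean has much smaller variance (σ²/n versus σ²). All numbers below are made up.

```python
import random

random.seed(2)
mu, sigma, n, reps = 0.0, 1.0, 25, 10000

# Draw many samples; record two unbiased estimators of mu for each.
means, firsts = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(sum(sample) / n)   # sample mean: Var = sigma^2 / n = 0.04
    firsts.append(sample[0])        # first observation: Var = sigma^2 = 1

def var(xs):
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

var_mean, var_first = var(means), var(firsts)
# var_mean should be roughly 25 times smaller than var_first.
```

Since both estimators are unbiased, equation (70) reduces to MSE = Var here, so the sample mean also has the smaller MSE.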

Example 20.3. Graph the pdf of two estimators such that the bias of the first estimator is less of a problem than inefficiency (and vice versa for the other estimator).

20.4 Asymptotic Criteria

20.4.1 Consistency

Let θ̂_n be an estimator of θ. θ̂_n is said to be consistent if θ̂_n →p θ (Law of Large Numbers; Lecture Note 7).

Example 20.4. Assume a random sample of size n from a population f(x), where E(X) = µ (unknown). Which of the following estimators is consistent?

µ̂_1 = (1/5) Σ_{i=1}^{5} X_i    µ̂_2 = (1/n) Σ_{i=1}^{n} X_i    µ̂_3 = (1/5) Σ_{i=1}^{n} X_i

MSE → 0 as n → ∞  ⟹  consistency.

20.4.2 Asymptotic Efficiency

Let θ̂_1 be an estimator of θ. θ̂_1 is said to be asymptotically efficient if it satisfies the definition of an efficient estimator, equation (69), as n → ∞.
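A simulation sketch of the consistency idea in Example 20.4: an estimator that uses all n observations with the right scaling improves as n grows, while one that only ever looks at the first 5 observations does not. The distribution and parameter values are made up.

```python
import random

random.seed(3)
mu = 2.0

def mean_abs_error(estimator, n, reps=4000):
    """Average |estimate - mu| over many simulated samples of size n."""
    errs = []
    for _ in range(reps):
        sample = [random.gauss(mu, 1.0) for _ in range(n)]
        errs.append(abs(estimator(sample) - mu))
    return sum(errs) / reps

sample_mean = lambda s: sum(s) / len(s)   # uses all n observations
first_five  = lambda s: sum(s[:5]) / 5    # ignores everything past X_5

err_small = mean_abs_error(sample_mean, 10)    # n = 10
err_large = mean_abs_error(sample_mean, 1000)  # n = 1000: error shrinks
err_fixed = mean_abs_error(first_five, 1000)   # stuck: extra data unused
```

The shrinking error of the sample mean reflects θ̂_n →p θ; the first-five estimator is unbiased but not consistent.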

21 Point Estimation Methods

The following are two standard methods used to construct (point) estimators.

21.1 Method of Moments (MM)

Let X_1, X_2, ..., X_n be a random sample from a population with pmf/pdf f(x|θ_1, ..., θ_k), where θ_1, ..., θ_k are unknown parameters. One way to estimate these parameters is by equating the first k population moments to the corresponding k sample moments. The resulting k estimators are called method of moments (MM) estimators of the parameters θ_1, ..., θ_k.² The procedure is summarized as follows (sample moment = population moment, theoretical):

(1/n) Σ_{i=1}^{n} X_i   = E(X_i)      (first moment)
(1/n) Σ_{i=1}^{n} X_i²  = E(X_i²)     (second moment)
(1/n) Σ_{i=1}^{n} X_i³  = E(X_i³)     (third moment)
...
(1/n) Σ_{i=1}^{n} X_i^k = E(X_i^k)    (k-th moment)

This is a system of k equations in k unknowns, which yields θ̂_1, ..., θ̂_k. Note that i) E(X_i^j) = g_j(θ_1, ..., θ_k), and ii) the realization of each sample moment is a scalar.

² This estimation method was introduced by Karl Pearson in 1894.
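A minimal sketch of the procedure for a one-parameter case, using the U[0, θ] family from Example 20.1 (the data below are made up):

```python
# Method of moments for U[0, theta]: E(X_i) = theta/2, so equating the first
# sample moment to the first population moment gives
#   (1/n) * sum(x_i) = theta / 2   =>   theta_hat = 2 * x_bar.
def mm_uniform_theta(xs):
    x_bar = sum(xs) / len(xs)   # first sample moment (a scalar)
    return 2 * x_bar            # solve x_bar = theta/2 for theta

theta_hat = mm_uniform_theta([1.2, 3.9, 0.4, 2.5, 3.0])  # hypothetical data
```

With k = 1 there is one equation and one unknown; Examples 21.1 and 21.2 below require k = 2.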

Example 21.1. Assume a random sample of size n from a N(µ, σ²) population, where µ and σ² are unknown parameters. Find the MM estimator of both parameters.

Example 21.2. Assume a random sample of size n from a gamma distribution: f(x) = (1/(Γ(α)β^α)) x^{α−1} e^{−x/β} for 0 < x < ∞. Assume also that the random sample realization is characterized by (1/n) Σ_{i=1}^{n} x_i = 7.29 and (1/n) Σ_{i=1}^{n} x_i² = 85.59. Find the MM estimators of α and β (parameters). Remember that E(X_i) = αβ and Var(X_i) = αβ².
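The system in Example 21.2 can be solved in closed form; a short sketch using the given sample moments:

```python
# MM equations for the gamma distribution, with m1 = (1/n) sum x_i and
# m2 = (1/n) sum x_i^2:
#   m1 = E(X)   = alpha * beta
#   m2 = E(X^2) = Var(X) + E(X)^2 = alpha * beta^2 + (alpha * beta)^2
# Subtracting m1^2 from m2 isolates alpha * beta^2, so
#   beta_hat  = (m2 - m1^2) / m1,   alpha_hat = m1 / beta_hat.
m1, m2 = 7.29, 85.59

beta_hat = (m2 - m1 ** 2) / m1    # about 4.451
alpha_hat = m1 / beta_hat         # about 1.638
```

Note that m2 − m1² is just the (uncentered-moment form of the) sample variance, so the method matches the sample mean and variance to αβ and αβ².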

21.2 Maximum Likelihood Estimation (MLE)

Let X_1, X_2, ..., X_n be a random sample from a population with pmf/pdf f(x|θ_1, ..., θ_k), where θ_1, ..., θ_k are unknown parameters. Another way to estimate these parameters is to find the values θ̂_1, ..., θ̂_k that maximize the likelihood that the observed sample was generated by f(x|θ̂_1, ..., θ̂_k). The joint pmf/pdf of the random sample, f(x_1, x_2, ..., x_n|θ_1, ..., θ_k), is called the likelihood function, and it is denoted by L(θ|x):

L(θ|x) = L(θ_1, ..., θ_k|x_1, ..., x_n) = f(x_1, ..., x_n|θ_1, ..., θ_k)       (generally)
                                        = Π_{i=1}^{n} f(x_i|θ_1, ..., θ_k)    (random sample, iid),

where θ and x are vectors such that θ = (θ_1, ..., θ_k) and x = (x_1, ..., x_n).

For a given sample vector x, denote by θ̂_MLE(x) the parameter value of θ at which L(θ|x) attains its maximum. Then θ̂_MLE(x) is the maximum likelihood estimator (MLE) of the (unknown) parameters θ_1, ..., θ_k.³

Intuition: Discrete case.

³ This estimation method was introduced by R.A. Fisher in 1912.
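A small sketch of the likelihood function for an iid Bernoulli(p) sample (hypothetical coin-flip data). The grid search is only to show where L(p|x) peaks, not a practical estimation method:

```python
# For iid Bernoulli(p) data, L(p|x) = prod over i of p^x_i * (1-p)^(1-x_i).
x = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]   # hypothetical data: 7 heads in 10 flips

def likelihood(p, xs):
    L = 1.0
    for xi in xs:
        L *= p ** xi * (1 - p) ** (1 - xi)
    return L

# Evaluate L on a grid of candidate p values; the peak is at the
# sample proportion 7/10, which is the MLE for this model.
grid = [i / 100 for i in range(1, 100)]
p_best = max(grid, key=lambda p: likelihood(p, x))   # lands at 0.70
```

This is the discrete-case intuition: among all candidate parameter values, the MLE makes the observed sample most probable.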

Does a global maximum exist? Can we find it? Is it unique? ... Back to Calculus 101:

FOC: ∂L(θ|x)/∂θ_i = 0, i = 1, ..., k (for a well-behaved function).

You also need to check that it is really a maximum and not a minimum (2nd derivative). Many times it is easier to look for the maximum of ln L(θ|x) (same maximizer, since ln is a monotonic transformation ... back to Calculus 101).

Invariance property of MLE: the MLE of τ(θ) is τ(θ̂_MLE).

For large samples, MLE yields an excellent estimator of θ (consistent and asymptotically efficient). No wonder it is widely used. But ... i) numerical sensitivity (robustness); ii) not necessarily unbiased; iii) might be hard to compute.
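A sketch of the FOC at work for an iid Exponential(λ) sample (hypothetical data; this family is used here only because its FOC has a clean closed form):

```python
import math

# For iid Exponential(lambda) data, ln L(lambda|x) = n*ln(lambda) - lambda*sum(x_i).
# FOC: d(ln L)/d(lambda) = n/lambda - sum(x_i) = 0  =>  lambda_hat = 1 / x_bar.
x = [0.8, 1.3, 0.2, 2.1, 0.6]   # hypothetical data; sum is 5.0
n, s = len(x), sum(x)
lam_hat = n / s                 # closed-form solution of the FOC: 1/x_bar = 1.0

def log_lik(lam):
    return n * math.log(lam) - lam * s

# Second-order check: -n/lambda^2 < 0 everywhere, so the FOC gives a maximum.
# Numerically, ln L is lower just to either side of lam_hat:
check = log_lik(lam_hat) > max(log_lik(lam_hat * 0.9), log_lik(lam_hat * 1.1))
```

Working with ln L rather than L turns the product of densities into a sum, which is why the log transformation is so convenient.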

Example 21.3. Assume a random sample from a N(µ, σ²) population, where the parameters are unknown. Find the ML estimator of both parameters.

Example 21.4. Assume a random sample from a U(0, θ) population, where θ is unknown. Find θ̂_MLE.
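The standard closed-form answers to Examples 21.3 and 21.4 can be checked by simulation; a sketch with made-up parameter values:

```python
import random

random.seed(4)
mu, sigma, theta, n = 3.0, 1.5, 6.0, 500

# Standard ML results for these two models:
#   Normal:  mu_hat = x_bar,  sigma2_hat = (1/n) * sum (x_i - x_bar)^2
#            (note the 1/n, not 1/(n-1): the MLE of sigma^2 is biased).
#   U(0, theta):  L(theta) = theta^(-n) for theta >= max(x_i), which is
#            decreasing in theta, so theta_hat = max(x_i) (a corner solution,
#            not found by the FOC).
xs_norm = [random.gauss(mu, sigma) for _ in range(n)]
xs_unif = [random.uniform(0, theta) for _ in range(n)]

mu_hat = sum(xs_norm) / n
sigma2_hat = sum((v - mu_hat) ** 2 for v in xs_norm) / n
theta_hat = max(xs_unif)   # always <= theta, hence slightly biased downward
```

The uniform case illustrates point iii) above: the likelihood is not differentiable at its maximizer, so calculus alone does not find θ̂_MLE.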