
Department of Mathematics
Ma 3/103: Introduction to Probability and Statistics    KC Border    Winter 2017

Lecture 19: Estimation II

Relevant textbook passages: Larsen–Marx [1]: Sections 5.2–5.7

19.1 The method of moments

Let X_1, ..., X_n be independent and identically distributed with density f(x; θ_1, ..., θ_m). Then the kth sample moment is (1/n) Σ_{i=1}^n x_i^k. The distribution's kth moment is ∫ x^k f(x; θ_1, ..., θ_m) dx. (Larsen–Marx [1]: pp. 293–296.) Solving for the (θ̂_1, ..., θ̂_m) that equate the first m moments is called the method of moments:

    ∫ x^k f(x; θ̂_1, ..., θ̂_m) dx = (1/n) Σ_{i=1}^n x_i^k    (k = 1, ..., m).

19.1.1 Example (Method of moments and the Gamma distribution)

Recall that the Gamma(r, λ) distribution (r > 0, λ > 0) has density given by

    f(t) = (λ^r / Γ(r)) t^{r−1} e^{−λt}    (t > 0).

The parameter r is the shape parameter, and λ is the rate (inverse scale) parameter. The mean and variance of a Gamma(r, λ) random variable are given by

    E X = r/λ,    Var X = r/λ².

It is difficult to derive closed-form expressions for the MLE of a Gamma, because the gamma function Γ(r) does not have a closed-form expression. But it is straightforward to derive the method of moments estimators for r and λ. Using the fact that for any random variable X, we have E(X²) = Var X + (E X)² (see Section 6.10), given a sample x_1, ..., x_n of independent draws from a Gamma, we just need to solve the two equations

    x̄ = (1/n) Σ_i x_i = r/λ,    (1/n) Σ_i x_i² = r/λ² + (r/λ)² = r(r + 1)/λ².

The solution is gotten by solving the first for r = λx̄ and substituting that into the second to get

    (1/n) Σ_i x_i² = (λx̄)(1 + λx̄)/λ² = x̄/λ + x̄²,

(Larsen–Marx [1]: Example 5.2.5, pp. 294–295.)

so

    λ̂ = x̄ / ((1/n) Σ_i x_i² − x̄²),    and    r̂ = λ̂ x̄.

19.1.2 Example (Method of moments and the Normal distribution)

The Normal(μ, σ²) has mean μ and variance σ², so the method of moments estimators solve

    μ̂ = x̄,    (1/n) Σ_i x_i² = σ̂² + μ̂².

Solving gives

    μ̂ = x̄,    σ̂² = (1/n) Σ_i x_i² − x̄² = (1/n) Σ_i (x_i − x̄)².

The moments estimators for μ and σ² are the same as the maximum likelihood estimators.

19.2 Other ways to generate estimators

Most other general methods for finding estimators involve some sort of maximization or minimization. For instance, there are minimum χ² estimators, which frequently have nice properties. Mosteller's [3] analysis of the World Series considers minimum χ² estimation in addition to MLE. I'll describe this kind of estimation later on, when we discuss χ² tests.

Most general methods for generating estimators involve choosing a measure of either similarity or distance between the observed data and the data that might have been generated by the dgp with a given parameter. There are deep reasons why such estimators have good properties, but that's a topic for a more advanced course.

19.3 Digression: The quantiles z_α

Statisticians have adopted the following special notation. Let Z be a Standard Normal random variable, with cumulative distribution function denoted Φ. (Larsen–Marx [1]: p. 307.) For 0 < α < 1, define z_α by

    P(Z > z_α) = α,    or equivalently    P(Z ≤ z_α) = 1 − α.

Then

    z_α = Φ⁻¹(1 − α).

This is something you can look up with R's or Mathematica's built-in quantile functions. (Remember the quantile function is Φ⁻¹.) By symmetry,

    P(Z < −z_α) = α    and    P(|Z| > z_α) = 2α,

so

    P(−z_α ≤ Z ≤ z_α) = 1 − 2α.

The last equality is often expressed as

    P(−z_{α/2} ≤ Z ≤ z_{α/2}) = 1 − α.

Here are some commonly used values of α and the corresponding z_α to two decimal places.

    α        z_α     1 − 2α
    0.1      1.28    0.80
    0.05     1.64    0.90
    0.025    1.96    0.95
    0.01     2.33    0.98
    0.005    2.58    0.99

[Figure: the standard normal density, with the two tails beyond ±1.96 shaded.] The shaded area is the probability of the event (|Z| > 1.96), which is equal to 0.05. Values outside the interval (−1.96, 1.96) are often regarded as unlikely to have occurred by chance.

19.4 Confidence intervals for Normal means if σ is known

So far we have looked at point estimates, and barely made a dent in the subject. (Erich L. Lehmann's classic Theory of Point Estimation [2] runs to about 500 pages.) But it is time to move on. Interval estimates are closely related to hypothesis testing (coming up soon) and are sometimes more useful than point estimates.

Go back to the Normal estimation case. The maximum likelihood estimator μ̂_MLE of the mean μ is just the sample mean x̄ = (1/n) Σ_i x_i, but how good is that estimate? If X_1, ..., X_n are independent and identically distributed N(μ, σ²), then

    μ̂_MLE = (X_1 + ⋯ + X_n)/n ~ N(μ, σ²/n),

so by standardizing μ̂ we have

    (μ̂ − μ)/(σ/√n) ~ N(0, 1).

We have just seen that z_0.025 = 1.96.
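As a numerical check on the z_α table, here is a short Python sketch using the standard library's normal quantile function (the notes suggest R or Mathematica; `statistics.NormalDist` is just one more option):

```python
from statistics import NormalDist

# z_alpha = Phi^{-1}(1 - alpha), computed with the standard library's
# normal quantile (inverse CDF) function.
Phi_inv = NormalDist().inv_cdf

for alpha in (0.1, 0.05, 0.025, 0.01, 0.005):
    z = Phi_inv(1 - alpha)
    print(f"alpha = {alpha:<6} z_alpha = {z:.2f}  1 - 2*alpha = {1 - 2 * alpha:.2f}")
```

Rounding to two decimal places recovers the table, in particular z_0.025 ≈ 1.96.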

Therefore

    P(−1.96 ≤ (μ̂ − μ)/(σ/√n) ≤ 1.96) = 0.95.

But this event is also equal to the event

    (μ̂ − 1.96σ/√n ≤ μ ≤ μ̂ + 1.96σ/√n).

So another way to interpret this is

    P(μ ∈ [μ̂ − 1.96σ/√n, μ̂ + 1.96σ/√n]) = 95%,

even though μ is not random. The interval

    I = [μ̂ − 1.96σ/√n, μ̂ + 1.96σ/√n]

is called a 95% confidence interval for μ. More generally we have the following.

To get a 1 − α confidence interval for μ when σ is known, set

    I = [μ̂ − z_{α/2}σ/√n, μ̂ + z_{α/2}σ/√n].    (1)

Then P(μ ∈ I) = 1 − α.

19.4.1 Interpreting confidence intervals

Remember that μ is not random; rather, the interval I(X) = [μ̂ − 1.96σ/√n, μ̂ + 1.96σ/√n] is random, since it is based on the random μ̂. But once I calculate I, μ either belongs to I or it doesn't, so what am I to make of the 95% probability? I think the way to think about it is this: No matter what the values of μ and σ are, following the procedure "draw a sample X from the distribution N(μ, σ²), and use (1) to calculate the interval I(X)," the interval I(X) will then have a 95% probability of containing μ. This is not the same as saying, "I used (1) to calculate the interval I, so no matter what the values of μ and σ are, the interval I has a 95% probability of containing μ." It is the procedure, not the interval per se, that gives us the confidence.

Figure 19.1 shows the result of using this procedure 100 times to construct a symmetric 95% confidence interval for μ, based on (pseudo-)random samples of size 5 drawn from a standard normal distribution. Note that in this instance, 5 of the 100 intervals missed the true mean 0.

19.4.2 Hold on

But wait! The confidence interval given by (1) depends on σ. What if we don't know σ? We can use σ̂ to estimate σ to get a confidence interval. The catch is that

    (μ̂ − μ)/(σ̂/√n)

is not a Standard Normal random variable. Instead it has a Student t distribution. We will discuss this later in Lecture 21, Sections 21.6 and 21.7.
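The recipe in (1) is short enough to write out in code. A minimal Python sketch, assuming σ is known and the data are given as a list (the function name and the sample values are illustrative, not from the notes):

```python
from math import sqrt
from statistics import NormalDist

def normal_mean_ci(xs, sigma, alpha=0.05):
    """1 - alpha confidence interval for mu when sigma is known, per (1)."""
    n = len(xs)
    mu_hat = sum(xs) / n                     # MLE of mu: the sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    half = z * sigma / sqrt(n)
    return mu_hat - half, mu_hat + half

# Hypothetical measurements with sigma known to be 2:
lo, hi = normal_mean_ci([4.1, 5.3, 3.8, 4.9, 5.0], sigma=2.0)
print(f"95% CI: [{lo:.3f}, {hi:.3f}]")  # -> 95% CI: [2.867, 6.373]
```

The interval is centered at the sample mean 4.62 with half-width 1.96 · 2/√5 ≈ 1.753.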

Figure 19.1. Here are one hundred 95% confidence intervals for the mean from a Monte Carlo simulation of a sample of size 5 independent standard normals. The intervals that do not include the true mean 0 are shown in red.
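The experiment behind Figure 19.1 is easy to replicate; here is a sketch in plain Python (the seed is an arbitrary choice, not from the notes, so the miss count will vary around the expected 5):

```python
import random
from math import sqrt

random.seed(1917)  # arbitrary seed; a different seed gives a different miss count
n, trials, z = 5, 100, 1.96
misses = 0
for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(n)]    # sample of size 5 from N(0, 1)
    mu_hat = sum(xs) / n
    half = z * 1.0 / sqrt(n)                       # sigma = 1 is known here
    if not (mu_hat - half <= 0 <= mu_hat + half):  # did the interval miss mu = 0?
        misses += 1
print(misses, "of", trials, "intervals missed the true mean 0")
```

Each interval has half-width 1.96/√5 ≈ 0.88, and on average 5 of the 100 intervals should miss.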

You might ask, when might I know σ but not know μ? Maybe in a case like this: I can imagine that the variance in a measurement of weight using a balance beam scale depends on the friction in the balance bearing. I can also imagine that the mean measurement of a sample's mass depends on the sample's actual mass. I might have a lot of experience with this particular kind of scale, so that I know the variance σ², but the mean of the measurement depends on which sample I am weighing. To get a good estimate of the weight, I might make several measurements,¹ and I could then use this procedure to generate a confidence interval. (I just made this up, and it sounds plausible, but do any of you chemists or engineers have any real information on such scales?)

19.5 Considerations in constructing confidence intervals

There are two more points worth noting.

Suppose we know μ, and we want to choose an interval I so that the standard normal random variable

    Z = (μ̂ − μ)/(σ/√n)

lies in I with probability 1 − α. Any interval [a, b] satisfying

    ∫_a^b (1/√(2π)) e^{−z²/2} dz = 1 − α

has this property. Because of the symmetry of the normal distribution, the symmetric interval [−z_{α/2}, z_{α/2}] is the shortest such interval. Because of the properties of the standard normal distribution, the length of the interval [μ̂ − z_{α/2}σ/√n, μ̂ + z_{α/2}σ/√n] does not depend on μ.

For distributions that are not symmetric, you may want to construct asymmetric confidence intervals. I can think of at least two principles you could use.

1. Choose the shortest interval [a, b] containing your point MLE θ̂ that has P_θ̂([a, b]) = 1 − α. This would be the interval where the likelihood (= density) is highest. Since θ̂ maximizes the likelihood, we know it will be in the interval. Oops. How do we know that an interval is the shortest set? Maybe we would be better off taking two short intervals instead of one long one. For unimodal (single-peaked) densities, this won't happen.

2.
The other principle you might consider is to choose an interval [a, b] so that P(θ < a) = P(θ > b) = α/2, bearing in mind the above interpretation of the probability.

In the normal case, these two principles are not in conflict, and the procedure for constructing the interval described above is consistent with both.

Bibliography

[1] R. J. Larsen and M. L. Marx. 2012. An introduction to mathematical statistics and its applications, 5th ed. Boston: Prentice Hall.

[2] E. L. Lehmann. 1983. Theory of point estimation. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley and Sons.

[3] F. Mosteller. 1952. The World Series competition. Journal of the American Statistical Association 47(259):355–380. http://www.jstor.org/stable/2281309

¹ My grandfather was a carpenter, so I am quite familiar with the old saw, "Measure twice, cut once." (Sorry, I couldn't help myself.)

v. 2017.02.13::17.42    KC Border