Lecture 19. Confidence Interval

December 5, 2011

The phrase confidence interval can refer to a random interval, called an interval estimator, that covers the true value $\theta_0$ of a parameter of interest with a prespecified probability. The prespecified probability (usually 90%, 95% or 99%) is called the coverage probability or the confidence level. The phrase can also refer to a realization of that random interval. Theoretical econometricians usually use the first meaning, whereas applied econometricians usually use the second. In this lecture, we do not explicitly distinguish the two.

It is worth noting that, when the second meaning is used, it does not make sense to say that the true value lies in the confidence interval with such and such probability. For example, it does not make sense to say that $\theta_0$ lies in $[-0.01, 0.01]$ with $100(1-\alpha)\%$ probability. This is because, in classical statistics, $\theta_0$ is not considered random, and the realization of a random interval is also nonrandom. The probability that a nonrandom interval covers a nonrandom true value is either zero or one and cannot be anything in between. Thus, it is more conventional to say that the confidence interval of confidence level $100(1-\alpha)\%$ is $[-0.01, 0.01]$.

Construct a confidence interval

To construct a confidence interval, one typically starts with an estimator and builds an interval around it. The idea is that, if the estimator is good, then it is likely to be close to the true value. Thus an interval around the estimator is likely to cover the true value.

Example 1. Suppose we would like to construct a confidence interval for the expectation, $\mu$, of $X$ using a random $n$-sample $\{X_1, \dots, X_n\}$. We start with an estimator, say the sample mean $\bar{X}_n$, and construct an interval of the shape $[\bar{X}_n - c_n, \bar{X}_n + c_n]$ for some positive (possibly random) $c_n$. The $c_n$ is chosen to make the interval have a prespecified coverage probability either exactly:

$$\Pr_\mu(\bar{X}_n - c_n \le \mu \le \bar{X}_n + c_n) = 1 - \alpha; \tag{1}$$

or asymptotically:

$$\lim_{n \to \infty} \Pr_\mu(\bar{X}_n - c_n \le \mu \le \bar{X}_n + c_n) = 1 - \alpha. \tag{2}$$

Case 1. $X \sim N(\mu, \sigma^2)$ with known $\sigma^2$. Then $\bar{X}_n \sim N(\mu, \sigma^2/n)$, and hence $\sqrt{n}(\bar{X}_n - \mu)/\sigma \sim N(0, 1)$. Using this fact, we can determine $c_n$ as follows:

$$\begin{aligned}
\Pr_\mu(\bar{X}_n - c_n \le \mu \le \bar{X}_n + c_n) &= \Pr_\mu(|\bar{X}_n - \mu| \le c_n) \\
&= \Pr_\mu(|\sqrt{n}(\bar{X}_n - \mu)/\sigma| \le \sqrt{n} c_n/\sigma) \\
&= 1 - 2(1 - \Phi(\sqrt{n} c_n/\sigma)) \\
&= 2\Phi(\sqrt{n} c_n/\sigma) - 1.
\end{aligned} \tag{3}$$

The second to last step uses the fact that the density of the standard normal distribution is symmetric around zero, so that $\Phi(x) = 1 - \Phi(-x)$. Setting the coverage probability to $1 - \alpha$, we have

$$\Phi(\sqrt{n} c_n/\sigma) = 1 - \alpha/2. \tag{4}$$

Use $z_\alpha$ to denote the $(1 - \alpha)$ quantile of $N(0, 1)$. Then the appropriate $c_n$ is $\sigma z_{\alpha/2}/\sqrt{n}$, and the confidence interval for $\mu$ is

$$[\bar{X}_n - \sigma z_{\alpha/2}/\sqrt{n},\ \bar{X}_n + \sigma z_{\alpha/2}/\sqrt{n}]. \tag{5}$$

Remark. This is a symmetric confidence interval in that it is symmetric around the estimator $\bar{X}_n$. It is also an equal-tailed confidence interval in that the probability that the whole interval falls on one side of $\mu$ equals the probability that it falls on the other side of $\mu$:

$$\Pr_\mu(\bar{X}_n - \sigma z_{\alpha/2}/\sqrt{n} > \mu) = \Pr_\mu(\bar{X}_n + \sigma z_{\alpha/2}/\sqrt{n} < \mu) = \alpha/2. \tag{6}$$

The symmetric confidence interval and the equal-tailed confidence interval coincide in this case because the distribution of the estimator $\bar{X}_n$ is symmetric around the true value $\mu$.

Why does one want to restrict the confidence interval to be symmetric or equal-tailed? Why not consider confidence intervals of the shape, for example, $[\bar{X}_n - c_n, \bar{X}_n + 10c_n]$? It is as easy to find a $c_n$ that makes this confidence interval have the prespecified coverage probability as above. The problem is that this interval will be longer than the symmetric one: $c_n$ will be smaller than $\sigma z_{\alpha/2}/\sqrt{n}$, but $10c_n$ will be much bigger than $\sigma z_{\alpha/2}/\sqrt{n}$.
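The known-variance interval (5) can be sketched in a few lines of Python. This is only an illustration under the assumptions of Case 1 (normal data, known $\sigma$); the function name `z_interval` and the data are hypothetical and not part of the lecture. The standard normal quantile $z_{\alpha/2}$ comes from `statistics.NormalDist` in the Python standard library.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(sample, sigma, alpha=0.05):
    """Two-sided interval (5) for mu, assuming X ~ N(mu, sigma^2) with sigma known."""
    n = len(sample)
    x_bar = fmean(sample)                    # sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}: the (1 - alpha/2) quantile of N(0, 1)
    c_n = sigma * z / sqrt(n)                # half-width of the interval
    return x_bar - c_n, x_bar + c_n


# Hypothetical data, for illustration only.
data = [4.9, 5.3, 5.1, 4.7, 5.0, 5.4, 4.8, 5.2]
lower, upper = z_interval(data, sigma=0.25, alpha=0.05)
print(f"95% confidence interval for mu: [{lower:.3f}, {upper:.3f}]")
```

Lowering the confidence level (a larger `alpha`) shrinks $z_{\alpha/2}$ and therefore shortens the interval, while a larger sample shortens it at rate $1/\sqrt{n}$.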

Longer interval estimators are typically considered less informative. There is one special case, though, which is the one-sided confidence interval. The one-sided confidence interval is of the shape $(-\infty, \bar{X}_n + c_n]$ or $[\bar{X}_n - c_n, \infty)$. These intervals are of infinite length, but they are frequently used when one side of the parameter space (the space of $\mu$) is more interesting than the other. For example, when it is of interest to conclude that we have 95% confidence that $\mu$ is above (some number), one can use $[\bar{X}_n - c_n, \infty)$ instead of the two-sided intervals. At the same confidence level, this interval gives the largest lower bound that one can insert in the conclusion. (Exercise: find $c_n$ for the confidence interval $[\bar{X}_n - c_n, \bar{X}_n + 10c_n]$ and for the confidence interval $[\bar{X}_n - c_n, \infty)$ to have 95% coverage probability.)

Case 2. In the above case, we assume that $\sigma$ is known. If $\sigma$ is not known, then the confidence interval above is not feasible. In order to find $c_n$, in the second step of (3), we should not divide by $\sigma$. Instead, we can divide by the sample standard deviation $S_X$, where $S_X^2 = (n-1)^{-1} \sum_{i=1}^n (X_i - \bar{X}_n)^2$. Then

$$\Pr_\mu(\bar{X}_n - c_n \le \mu \le \bar{X}_n + c_n) = \Pr_\mu(\sqrt{n}|\bar{X}_n - \mu|/S_X \le \sqrt{n} c_n/S_X). \tag{7}$$

The random variable $\sqrt{n}(\bar{X}_n - \mu)/S_X$ is not $N(0, 1)$, but it has the Student t-distribution, which is shown next. As above, we know that $\sqrt{n}(\bar{X}_n - \mu)/\sigma \sim N(0, 1)$. The random variable $(n-1)S_X^2/\sigma^2 \sim \chi^2(n-1)$ because:

$$(n-1)S_X^2/\sigma^2 = \sum_{i=1}^n (X_i - \bar{X}_n)^2/\sigma^2 = \begin{pmatrix} \sigma^{-1}(X_1 - \mu) \\ \vdots \\ \sigma^{-1}(X_n - \mu) \end{pmatrix}' (I_n - n^{-1} 1_n 1_n') \begin{pmatrix} \sigma^{-1}(X_1 - \mu) \\ \vdots \\ \sigma^{-1}(X_n - \mu) \end{pmatrix},$$

where $I_n$ is the $n$-dimensional identity matrix and $1_n$ is an $n$-vector of ones. The matrix $I_n - n^{-1} 1_n 1_n'$ is idempotent, and the vector $(\sigma^{-1}(X_1 - \mu), \dots, \sigma^{-1}(X_n - \mu))' \sim N(0, I_n)$. The quadratic form of a standard normal vector with an idempotent weight matrix follows a chi-squared distribution with degrees of freedom equal to the trace of the weight matrix. Thus,

$$\begin{aligned}
(n-1)S_X^2/\sigma^2 &\sim \chi^2(\mathrm{trace}(I_n - n^{-1} 1_n 1_n')) \\
&= \chi^2(\mathrm{trace}(I_n) - n^{-1}\mathrm{trace}(1_n 1_n')) \\
&= \chi^2(\mathrm{trace}(I_n) - n^{-1}\mathrm{trace}(1_n' 1_n)) \\
&= \chi^2(n-1).
\end{aligned} \tag{8}$$

It can also be shown that $\sqrt{n}(\bar{X}_n - \mu)/\sigma$ is independent of $(n-1)S_X^2/\sigma^2$. Therefore,

$$\sqrt{n}(\bar{X}_n - \mu)/S_X = \frac{\sqrt{n}(\bar{X}_n - \mu)/\sigma}{\sqrt{((n-1)S_X^2/\sigma^2)/(n-1)}} = \frac{Z}{\sqrt{W/(n-1)}},$$

where $Z \sim N(0, 1)$, $W \sim \chi^2(n-1)$, and $Z$ is independent of $W$. By direct calculation of the density of $Z/\sqrt{W/(n-1)}$, one can show that it is the density of the Student t-distribution with $n-1$ degrees of freedom, $t(n-1)$. Thus, setting (7) to $1 - \alpha$ gives:

$$1 - \alpha = \Pr(|t(n-1)| \le \sqrt{n} c_n/S_X) = 2F_{t(n-1)}(\sqrt{n} c_n/S_X) - 1.$$

Thus, $c_n = t_{\alpha/2,n-1} S_X/\sqrt{n}$, where $t_{\alpha/2,n-1}$ is the $1 - \alpha/2$ quantile of $t(n-1)$. The confidence interval for $\mu$ is

$$[\bar{X}_n - t_{\alpha/2,n-1} S_X/\sqrt{n},\ \bar{X}_n + t_{\alpha/2,n-1} S_X/\sqrt{n}].$$

Case 3. The confidence intervals in the above two cases are exact confidence intervals, i.e. their coverage probabilities equal the prespecified confidence level $1 - \alpha$ (called the nominal confidence level) exactly for any sample size. This is possible only because the distribution of the estimator (and of its sample standard deviation) is known and tractable, which is so because we assume that $X$ has a normal distribution. Without the normality assumption, the distributions of $\bar{X}_n$ and $S_X$ are either unknown or known in principle but intractable. Then we need to use confidence intervals that do not have exact coverage probability (1), but have asymptotic coverage probability (2).

We have shown that (supposing $E[X^2] < \infty$ and $Var(X) > 0$) $\sqrt{n}(\bar{X}_n - \mu) \to_d N(0, \sigma^2)$ and $S_X \to_p \sigma$. By the continuous mapping theorem, $\sqrt{n}(\bar{X}_n - \mu)/S_X \to_d N(0, 1)$. If we set $\sqrt{n} c_n/S_X$ to a fixed number $c$, then

$$\lim_{n \to \infty} \Pr_\mu(\bar{X}_n - c_n \le \mu \le \bar{X}_n + c_n) = \lim_{n \to \infty} \Pr_\mu(|\sqrt{n}(\bar{X}_n - \mu)/S_X| \le c) = 2\Phi(c) - 1.$$

Setting it to $1 - \alpha$ gives $c = z_{\alpha/2}$. Therefore, we can set $c_n = z_{\alpha/2} S_X/\sqrt{n}$, and the random interval $[\bar{X}_n - c_n, \bar{X}_n + c_n]$ has coverage probability approximately $1 - \alpha$ in large samples.

Example 2. The large sample confidence interval can be constructed for any $\theta$ that has a $\sqrt{n}$-consistent and asymptotically normal estimator $\hat{\theta}_n$ in the same way as in Case 3 above. Suppose that $\sqrt{n}(\hat{\theta}_n - \theta) \to_d N(0, \sigma^2)$, $\sigma^2 > 0$, and that there is a consistent estimator $\hat{\sigma}_n$ for $\sigma$. Then a two-sided symmetric confidence interval for $\theta$ of nominal confidence level $1 - \alpha$ can be

$$[\hat{\theta}_n - z_{\alpha/2}\hat{\sigma}_n/\sqrt{n},\ \hat{\theta}_n + z_{\alpha/2}\hat{\sigma}_n/\sqrt{n}].$$

(Exercise: find the one-sided confidence interval of nominal confidence level $1 - \alpha$ for $\theta$ and show that it asymptotically has coverage probability $1 - \alpha$.)

Example 3 (Confidence interval for Variance). Confidence intervals do not have to be of the shape above. Suppose $X \sim N(\mu, \sigma^2)$ with unknown $\sigma^2$. Find a confidence interval for $\sigma^2$. Suppose that the interval is $[c_{1n}, c_{2n}]$. Then we want it to satisfy $\Pr(c_{1n} \le \sigma^2 \le c_{2n}) = 1 - \alpha$. In Case 2 above, we have found that $(n-1)S_X^2/\sigma^2 \sim \chi^2(n-1)$. This suggests the following manipulation of the above equation:

$$\begin{aligned}
1 - \alpha &= \Pr(c_{1n} \le \sigma^2 \le c_{2n}) \\
&= \Pr(c_{1n}/((n-1)S_X^2) \le \sigma^2/((n-1)S_X^2) \le c_{2n}/((n-1)S_X^2)) \\
&= \Pr((n-1)S_X^2/c_{2n} \le (n-1)S_X^2/\sigma^2 \le (n-1)S_X^2/c_{1n}) \\
&= F_{\chi^2(n-1)}((n-1)S_X^2/c_{1n}) - F_{\chi^2(n-1)}((n-1)S_X^2/c_{2n}).
\end{aligned}$$

Because $F_{\chi^2(n-1)}(\cdot)$, $S_X^2$ and $n$ are known, we can choose $c_{1n}$ and $c_{2n}$ to make the last expression equal to $1 - \alpha$. (Here $c_{1n}$ and $c_{2n}$ are chosen proportional to $S_X^2$, so that the two arguments of $F_{\chi^2(n-1)}$ are constants.) Clearly, there are many pairs $(c_{1n}, c_{2n})$ that satisfy the coverage probability requirement. Which one to use depends on what one is interested in. If one has no preference, a good choice is the equal-tailed confidence interval, which makes

$$\Pr(c_{2n} < \sigma^2) = \alpha/2 = \Pr(c_{1n} > \sigma^2),$$

or equivalently:

$$1 - F_{\chi^2(n-1)}((n-1)S_X^2/c_{1n}) = F_{\chi^2(n-1)}((n-1)S_X^2/c_{2n}) = \alpha/2,$$

or $(n-1)S_X^2/c_{1n} = \chi^2_{\alpha/2,n-1}$ and $(n-1)S_X^2/c_{2n} = \chi^2_{1-\alpha/2,n-1}$, where $\chi^2_{\alpha,n-1}$ is the $1 - \alpha$ quantile of $\chi^2(n-1)$. Consequently, $c_{1n} = (n-1)S_X^2/\chi^2_{\alpha/2,n-1}$ and $c_{2n} = (n-1)S_X^2/\chi^2_{1-\alpha/2,n-1}$.

(Think: how can one construct (large sample) confidence intervals for $\sigma^2$ without the normality assumption on $X$? Hint: we have derived the asymptotic distribution of $S_X^2$ in a previous lecture.)
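As a numerical sketch of the equal-tailed interval for $\sigma^2$ under normality, take the hypothetical values $n = 10$, $S_X^2 = 4$ and $\alpha = 0.05$ (these numbers are assumptions for illustration and do not come from the lecture). The chi-squared quantiles below are rounded table values for 9 degrees of freedom, written in the lecture's convention that $\chi^2_{\alpha,n-1}$ is the $1 - \alpha$ quantile.

```latex
\begin{align*}
(n-1)S_X^2 &= 9 \times 4 = 36, \\
\chi^2_{0.025,\,9} &\approx 19.023 \quad \text{(the 0.975 quantile of } \chi^2(9)\text{)}, \\
\chi^2_{0.975,\,9} &\approx 2.700 \quad \text{(the 0.025 quantile of } \chi^2(9)\text{)}, \\
c_{1n} &= 36 / 19.023 \approx 1.89, \\
c_{2n} &= 36 / 2.700 \approx 13.33.
\end{align*}
```

The resulting interval, roughly $[1.89, 13.33]$, is not symmetric around $S_X^2 = 4$, which reflects the skewness of the chi-squared distribution.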