EC2001 Econometrics 1
Dr. Jose Olmo
Room D309
J.Olmo@City.ac.uk

1 Revision of Statistical Inference

1.1 Sample, observations, population

A sample is a number of observations drawn from a population.

Population: either (a) a real collection of things, such as UK citizens aged 18 to 25, or (b) an abstract notion, such as all possible realizations of... etc.

Example.- Possible cases in two tosses of a coin.
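As a concrete illustration, the population in the coin-toss example can be enumerated with a trivial Python sketch:

```python
from itertools import product

# Population of all possible realizations of two coin tosses.
population = [''.join(p) for p in product('HT', repeat=2)]
print(population)   # ['HH', 'HT', 'TH', 'TT']
```

Any particular pair of tosses we actually observe is a sample drawn from this population.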
1.2 Differences between statistics, estimators and parameters

A parameter is a property of the population, not of the sample! Examples.- the average wage in the UK, life expectancy in Greenland, the standard deviation of life expectancy in Greenland...

A statistic is a function of the sample observations. Examples: the sample mean, the difference between the sample mean and the actual mean, the number of people over 90 in a sample of ten individuals from Greenland...

An estimator is a particular case of a statistic. The difference is that an estimator gives us a guess of the value of a population parameter, calculated from the information provided by the available data (the sample). Note that every estimator is a statistic, but not every statistic is an estimator!
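A minimal numerical sketch of this distinction, assuming a simulated wage population (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated "population": wages of 1,000,000 individuals.
population = rng.normal(loc=30000, scale=8000, size=1_000_000)

# Parameter: a property of the whole population (here, its mean).
mu = population.mean()

# Sample: a small number of observations drawn from the population.
sample = rng.choice(population, size=50, replace=False)

# Statistics / estimators: functions of the sample only.
x_bar = sample.mean()      # sample mean, an estimator of mu
s2 = sample.var(ddof=1)    # sample variance, an estimator of the population variance

print(f"parameter mu = {mu:.1f}, estimate x_bar = {x_bar:.1f}, sample variance = {s2:.1f}")
```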
2 Sampling Distributions for the sample mean and sample variance

The sample mean is determined by a sample of the population. The population mean is a parameter (a constant).

We will consider the estimators $\bar{x}_n = \frac{1}{n}\sum_{i=1}^{n} x_i$, denoting the sample mean, and $S_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x}_n)^2$, denoting the sample variance.

Consider an independent and identically distributed (iid) sample $x_1, \ldots, x_n$ drawn from a $N(\mu, \sigma^2)$ distribution. By the properties of the normal distribution,
$$\sqrt{n}\,\frac{\bar{x}_n - \mu}{\sigma} \sim N(0, 1), \qquad \frac{(n-1)S_n^2}{\sigma^2} \sim \chi^2_{n-1}.$$
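A quick simulation sketch of these two sampling distributions (the choices of $\mu$, $\sigma$ and $n$ are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, reps = 5.0, 2.0, 20, 10_000

z_stats, chi2_stats = [], []
for _ in range(reps):
    x = rng.normal(mu, sigma, size=n)
    z_stats.append(np.sqrt(n) * (x.mean() - mu) / sigma)       # should be N(0, 1)
    chi2_stats.append((n - 1) * x.var(ddof=1) / sigma**2)       # should be chi^2 with n-1 df

# Compare simulated quantiles with the theoretical ones.
print(np.quantile(z_stats, [0.025, 0.975]), stats.norm.ppf([0.025, 0.975]))
print(np.quantile(chi2_stats, [0.025, 0.975]), stats.chi2.ppf([0.025, 0.975], df=n - 1))
```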
3 Interval estimation for $\bar{x}_n$ and $S_n^2$

If the distribution of the estimator is known we can compute confidence intervals for the value of the parameter. We need to determine a significance level ($\alpha$).

Goal: construct a confidence interval such that the parameter is contained in it $100(1-\alpha)\%$ of the time.

In interval estimation we construct two functions $g_1(x_1, \ldots, x_n)$ and $g_2(x_1, \ldots, x_n)$ that depend on the sample observations such that
$$P\{g_1(x_1, \ldots, x_n) < \mu < g_2(x_1, \ldots, x_n)\} = 1 - \alpha.$$
We operate on this equation: multiplying through by $-1$ gives $P\{-g_1 > -\mu > -g_2\} = 1 - \alpha$. Doing further computations in the expression (adding $\bar{x}_n$ and scaling by $\sqrt{n}/\sigma$) we obtain
$$P\left\{\sqrt{n}\,\frac{\bar{x}_n - g_1}{\sigma} > \sqrt{n}\,\frac{\bar{x}_n - \mu}{\sigma} > \sqrt{n}\,\frac{\bar{x}_n - g_2}{\sigma}\right\} = 1 - \alpha.$$
Why do we do this? Let us write it from left to right:
$$P\left\{\sqrt{n}\,\frac{\bar{x}_n - g_2}{\sigma} < \sqrt{n}\,\frac{\bar{x}_n - \mu}{\sigma} < \sqrt{n}\,\frac{\bar{x}_n - g_1}{\sigma}\right\} = 1 - \alpha. \quad (1)$$

The statistic $\sqrt{n}\,\frac{\bar{x}_n - \mu}{\sigma}$ follows a standard normal distribution, that is, $N(0, 1)$. This is our goal: to find a standardized statistic. In this case the values of the distribution are tabulated and therefore known. In fact, we know from these tables the critical values $z_{\alpha/2}$ and $z_{1-\alpha/2}$. Example: for $\alpha = 0.05$, $z_{\alpha/2} = -1.96$ and $z_{1-\alpha/2} = 1.96$.
$$P\left\{z_{\alpha/2} < \sqrt{n}\,\frac{\bar{x}_n - \mu}{\sigma} < z_{1-\alpha/2}\right\} = 1 - \alpha. \quad (2)$$

Equating expressions (1) and (2) we have $z_{\alpha/2} = \sqrt{n}\,\frac{\bar{x}_n - g_2}{\sigma}$ and $z_{1-\alpha/2} = \sqrt{n}\,\frac{\bar{x}_n - g_1}{\sigma}$. Then $g_2 = \bar{x}_n - \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2}$ and $g_1 = \bar{x}_n - \frac{\sigma}{\sqrt{n}}\, z_{1-\alpha/2}$.
Furthermore, given that the normal distribution is symmetric, we have $z_{\alpha/2} = -z_{1-\alpha/2}$. Then $g_2 = \bar{x}_n - \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2}$ and $g_1 = \bar{x}_n + \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2}$.

The confidence interval at level $1 - \alpha$ is given by
$$\left[\,\bar{x}_n + \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2},\; \bar{x}_n - \frac{\sigma}{\sqrt{n}}\, z_{\alpha/2}\,\right].$$
Note that $z_{\alpha/2} < 0$, so the first endpoint is the lower one.

Exercise 1.- Find the confidence interval for $\sigma^2$.
Hint:
$$P\left\{\chi^2_{\alpha/2} < \frac{(n-1)S_n^2}{\sigma^2} < \chi^2_{1-\alpha/2}\right\} = 1 - \alpha.$$
Solution:
$$P\left\{\frac{(n-1)S_n^2}{\chi^2_{1-\alpha/2}} < \sigma^2 < \frac{(n-1)S_n^2}{\chi^2_{\alpha/2}}\right\} = 1 - \alpha.$$
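As an illustration, both intervals can be computed along these lines (a sketch with simulated data; the sample size and parameter values are arbitrary choices):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
mu, sigma, n, alpha = 5.0, 2.0, 25, 0.05

x = rng.normal(mu, sigma, size=n)
x_bar = x.mean()
s2 = x.var(ddof=1)

# CI for the mean with sigma known: [x_bar + (sigma/sqrt(n)) z_{alpha/2}, x_bar - (sigma/sqrt(n)) z_{alpha/2}]
z = stats.norm.ppf(alpha / 2)          # z_{alpha/2} < 0
ci_mean = (x_bar + sigma / np.sqrt(n) * z, x_bar - sigma / np.sqrt(n) * z)

# CI for the variance: [(n-1) S_n^2 / chi2_{1-alpha/2}, (n-1) S_n^2 / chi2_{alpha/2}]
chi_lo = stats.chi2.ppf(alpha / 2, df=n - 1)
chi_hi = stats.chi2.ppf(1 - alpha / 2, df=n - 1)
ci_var = ((n - 1) * s2 / chi_hi, (n - 1) * s2 / chi_lo)

print("95% CI for mu:     ", ci_mean)
print("95% CI for sigma^2:", ci_var)
```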
4 Testing of Hypotheses

An estimate gives us information about a parameter provided by the sample at hand. Our desire is to know the value of the parameter. We can use the value of the estimate and the uncertainty of the estimation to test different hypotheses.

1. Point hypothesis: $H_0: \mu = 0$ vs $H_1: \mu \neq 0$.
2. Interval hypothesis: $H_0: -1 \le \mu \le 1$ vs $H_1: \mu \notin [-1, 1]$.

A test is a procedure that answers the question of whether the observed difference between the sample value and the hypothesized population value is real or due to the uncertainty surrounding the estimation (sample variability). In order to see this we need the distribution function of a statistic relating the estimate and the parameter.
In the case of the mean, $t = \sqrt{n}\,\frac{\bar{x}_n - \mu}{S} \sim t_{\nu}$, where $S = \sqrt{S_n^2}$ and $\nu$ denotes the degrees of freedom of a Student's t-distribution. In this example, if the sample has $n$ observations, $\nu = n - 1$.

4.1 Criteria to reject $H_0$

The case of hypothesis testing is similar to confidence intervals (sometimes it is the same thing). We reject at a significance level ($\alpha$). This means we assign a probability $\alpha$ of being wrong in our conclusion. This $\alpha$ determines a critical level such that if the statistic exceeds this value we consider the null hypothesis to be false.
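A sketch of where such critical levels come from: simulating the statistic under the null, it should behave like a draw from $t_{n-1}$ (the parameter choices are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mu0, sigma, n, reps = 4.0, 2.0, 100, 10_000

# Simulate the t statistic under H0: mu = mu0; it should follow a t distribution with n-1 df.
t_sim = []
for _ in range(reps):
    x = rng.normal(mu0, sigma, size=n)
    t_sim.append(np.sqrt(n) * (x.mean() - mu0) / x.std(ddof=1))

# Simulated quantiles vs tabulated critical values of t_{99}.
print(np.quantile(t_sim, [0.025, 0.975]))
print(stats.t.ppf([0.025, 0.975], df=n - 1))
```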
Hypothesis tests can be one-sided or two-sided:

1. Two-sided: $H_0: \mu = 4$ vs $H_1: \mu \neq 4$.
2. One-sided: $H_0: \mu = 4$ vs $H_1: \mu < 4$, or $H_0: \mu = 4$ vs $H_1: \mu > 4$.

Example.- (a two-sided case) $H_0: \mu = 4$ vs $H_1: \mu \neq 4$, and consider $\alpha = 0.05$ and $n = 100$.

Critical level: $t_{99,\,1-\alpha/2}$, given that it is a two-sided test.

Rejection criteria: reject $H_0$ if $\sqrt{n}\,\frac{\bar{x}_n - 4}{S} < t_{99,\,\alpha/2}$ or $\sqrt{n}\,\frac{\bar{x}_n - 4}{S} > t_{99,\,1-\alpha/2}$.

We can alternatively use the absolute value to write the rejection criteria more compactly: $\left|\sqrt{n}\,\frac{\bar{x}_n - 4}{S}\right| > t_{99,\,1-\alpha/2}$.

An alternative viewpoint: measure the probability in the tail (p-values).
$$P\left\{t_{n-1} < \sqrt{n}\,\frac{\bar{x}_n - 4}{S}\right\} = \text{p-value} \quad \text{or} \quad P\left\{t_{n-1} > \sqrt{n}\,\frac{\bar{x}_n - 4}{S}\right\} = \text{p-value},$$
depending on the context (left or right tail). Rejection criteria: p-value $< \alpha/2$.

For the absolute value, $P\left\{|t_{n-1}| > \left|\sqrt{n}\,\frac{\bar{x}_n - 4}{S}\right|\right\} = \text{p-value}$. Rejection criteria: p-value $< \alpha$.

Exercise 2.- Do this example for (a) $\bar{x}_n = 4.23$ and $S = 2.5$, (b) $\bar{x}_n = 4.05$ and $S = 0.1$. Does this agree with what your intuition would say?
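Before turning to the one-sided case below, here is a short sketch of the two-sided criteria above, using simulated data rather than the exercise values:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(loc=4.3, scale=2.0, size=100)    # simulated sample
mu0, alpha, df = 4.0, 0.05, 99

t_stat = np.sqrt(len(x)) * (x.mean() - mu0) / x.std(ddof=1)

# Critical-value rule: reject H0 if |t| exceeds t_{99, 1-alpha/2}.
t_crit = stats.t.ppf(1 - alpha / 2, df)
reject_by_critical = abs(t_stat) > t_crit

# p-value rule: two-sided p-value P{|t_{99}| > |t_stat|}; reject H0 if it is below alpha.
p_value = 2 * stats.t.sf(abs(t_stat), df)
reject_by_pvalue = p_value < alpha

print(f"t = {t_stat:.3f}, t_crit = {t_crit:.3f}, p-value = {p_value:.4f}")
print(reject_by_critical, reject_by_pvalue)
```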
Example.- (a one-sided case) $H_0: \mu = 4$ vs $H_1: \mu < 4$, and consider $\alpha = 0.05$ and $n = 100$.

Critical level: $t_{99,\,\alpha}$, given that it is a one-sided (left) test.

Rejection criteria (for a left-sided test): $\sqrt{n}\,\frac{\bar{x}_n - 4}{S} < t_{99,\,\alpha}$.

An alternative viewpoint: measure the probability in the tail (p-values). For the left tail this is
$$P\left\{t_{n-1} < \sqrt{n}\,\frac{\bar{x}_n - 4}{S}\right\} = \text{p-value}.$$
Rejection criteria: p-value $< \alpha$ for a one-sided test.
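A similar sketch for the one-sided (left-tail) rule and its p-value, again with simulated data and the $\alpha = 0.05$, $n = 100$ set-up of the example:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
x = rng.normal(loc=3.9, scale=0.5, size=100)    # simulated sample
mu0, alpha, df = 4.0, 0.05, 99

t_stat = np.sqrt(len(x)) * (x.mean() - mu0) / x.std(ddof=1)

# Critical-value rule: reject H0: mu = 4 in favour of mu < 4 if t < t_{99, alpha}.
t_crit = stats.t.ppf(alpha, df)                 # t_{99, alpha} < 0
reject_by_critical = t_stat < t_crit

# p-value rule: left-tail probability P{t_{99} < t_stat}; reject if it is below alpha.
p_value = stats.t.cdf(t_stat, df)
reject_by_pvalue = p_value < alpha

print(f"t = {t_stat:.3f}, t_crit = {t_crit:.3f}, p-value = {p_value:.4f}")
print(reject_by_critical, reject_by_pvalue)
```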
Exercise 3.- Do this example for (a) $\bar{x}_n = 4.5$ and $S = 0.5$, (b) $\bar{x}_n = 3.985$ and $S = 0.05$. Does this agree with what your intuition would say?

4.2 Types of errors in hypothesis testing

We can commit two types of errors in hypothesis testing.

Type I error: rejecting the null hypothesis when it is true.
Type II error: accepting the null hypothesis when it is false. This is called the $\beta$ error.

Thus
$$\alpha = P\{\text{reject } H_0 \mid H_0 \text{ is true}\}, \qquad \beta = P\{\text{accept } H_0 \mid H_0 \text{ is false}\}.$$

Exercise 4.- Why do you think the type I error is also called the $\alpha$ error?
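A small simulation sketch of these two error probabilities, under illustrative parameter values:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, alpha, reps = 30, 0.05, 20_000
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # two-sided critical value

def reject_rate(true_mu, mu0=4.0, sigma=1.0):
    """Share of simulated samples in which H0: mu = mu0 is rejected."""
    rejections = 0
    for _ in range(reps):
        x = rng.normal(true_mu, sigma, size=n)
        t = np.sqrt(n) * (x.mean() - mu0) / x.std(ddof=1)
        rejections += abs(t) > t_crit
    return rejections / reps

type_i = reject_rate(true_mu=4.0)        # H0 true: rejection rate approximates alpha
type_ii = 1 - reject_rate(true_mu=4.5)   # H0 false: acceptance rate approximates beta
print(f"estimated alpha = {type_i:.3f}, estimated beta = {type_ii:.3f}")
```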