Week 2: Null hypothesis

An aeroplane seat designer wonders how wide to make the plane seats. He assumes a population average hip size of μ = 43.2 cm. Sample size n = 50.

Question: Is the assumption μ = 43.2 cm reasonable?

1. Write down all information given + specify 1 tail / 2 tail
   μ = 43.2 cm, n = 50, n − 1 = 49
   2 tail test, because he wants the seats neither too wide nor too narrow.
2. Set up the hypotheses
   H0: μ = 43.2 cm
   HA: μ ≠ 43.2 cm
3. Determine test statistic AND sampling distribution
   Population variance not known → t statistic: $t = \dfrac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{n-1} = t_{49}$
4. Calculate the test statistic value AND p value
   Use EViews to find the sample statistics (Histogram view), then use them to calculate the t value (here t = 0.5893). The p value comes from EViews.
   *** p value for a 2 tail test = total probability on both tails:
   IF t value is positive (> 0): 2 x probability on upper tail
   IF t value is negative (< 0): 2 x probability on lower tail
5. Determine significance level & critical values
   α = 0.05, so $P(t > t_{0.025,\,49}) = 0.025$
   $-t_{0.025,\,49} = -2.009$ and $t_{0.025,\,49} = 2.009$ (use t table)
   MUST write BOTH critical values.
6. Define decision rule
   1. Reject H0 if t value < −2.009 or t value > 2.009
   2. Reject H0 if p value < 0.05
7. Conclude & answer the question
   Since −2.009 < 0.5893 < 2.009, H0 is not rejected. There is not sufficient evidence that the designer's assumption of a population average hip size of 43.2 cm is unreasonable.
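The t value and the two tail decision rule in steps 3 to 6 can be sketched in a few lines of Python. This is only an illustration: the sample mean 43.58 is a hypothetical value (the notes read it from EViews and do not state it), s = 4.584 is the sample sd used later in these notes, and the function names are made up for this sketch.

```python
import math

def one_sample_t(sample_mean, mu0, sample_sd, n):
    """t statistic for H0: mu = mu0 when the population variance is unknown."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error

def reject_two_tail(t_value, critical_value):
    """Two tail rule: reject H0 if t is below -crit or above +crit."""
    return t_value < -critical_value or t_value > critical_value

# Hypothetical sample mean (assumption); mu0, s and n as in the notes.
t_value = one_sample_t(sample_mean=43.58, mu0=43.2, sample_sd=4.584, n=50)

# Critical value 2.009 taken from the t table as in step 5.
decision = reject_two_tail(t_value, critical_value=2.009)

print(f"t value = {t_value:.4f}, reject H0: {decision}")
```

With these inputs the t value lands between the two critical values, so H0 is not rejected, in line with the conclusion in step 7.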
How wide should the aeroplane seats be?

The designer wants to make the seats wide enough that 98% of the population can fit.
Hip size of 98% of the population → find the 98% quantile.
*** Quantile: the X value for a given probability of the lower tail.

3 assumptions:
1. The population of hip size X is normally distributed (means we can use the Z stat)
2. Population mean μ = 43.2
3. Population sd = sample sd: σ = s = 4.584

To find the 98% quantile:
1. Find the Z value of the 98% quantile: $P(Z < Z_0) = 0.98 \Rightarrow Z_0 = 2.054$
2. Use the formula of the Z value to find $X_0$: $Z_0 = \dfrac{X_0 - \mu}{\sigma} \Rightarrow X_0 = \mu + Z_0\,\sigma = 43.2 + 2.054 \times 4.584 \approx 52.6$
3. 98% of the population can fit into a 52.6 cm seat.
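The quantile steps above can be checked with Python's standard library. This is a minimal sketch under the same three assumptions (normal population, μ = 43.2, σ = 4.584); `NormalDist.inv_cdf` returns the value whose lower tail probability equals its argument.

```python
from statistics import NormalDist

mu = 43.2      # assumed population mean (cm)
sigma = 4.584  # assumed population sd, set equal to the sample sd

# Step 1: Z value with 98% of the standard normal below it (about 2.054).
z0 = NormalDist().inv_cdf(0.98)

# Step 2: convert the Z value back to the hip size scale.
x0 = mu + z0 * sigma

# Same quantile obtained directly from the N(mu, sigma) distribution.
x0_direct = NormalDist(mu, sigma).inv_cdf(0.98)

print(f"Z0 = {z0:.3f}, X0 = {x0:.1f} cm")
```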
Interval estimate for population standard deviation

How accurate is Assumption 3, σ = s = 4.584?

Finding the 95% interval estimate for the population sd σ:
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_{\text{upper}}}} \;\le\; \sigma \;\le\; \sqrt{\dfrac{(n-1)s^2}{\chi^2_{\text{lower}}}}$
*** Lower range limit → use the upper χ² value. Upper range limit → use the lower χ² value.
** MEMORISE the chi-square distribution: $\dfrac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$

What proportion of the population can fit into a 52.6 cm seat if σ = 5.712 (the upper limit of the interval) instead of 4.584?
Initially, when σ = 4.584, 98% of the population can fit into a 52.6 cm seat.
BUT as σ increases to 5.712, only 95% of the population can fit: $P(X < 52.6) = P\!\left(Z < \dfrac{52.6 - 43.2}{5.712}\right) = P(Z < 1.646) \approx 0.95$
The higher the sd, the more the data spreads out, and the lower the probability of the lower tail below a fixed seat width.
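As a rough check of the interval and of the 95% figure, the arithmetic can be sketched in Python. The two χ² values (70.222 and 31.555 for 49 degrees of freedom) are standard chi-square table values supplied here as an assumption, since the notes do not list them; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 50
s = 4.584  # sample sd from the notes

# Assumed chi-square table values for df = n - 1 = 49:
chi2_upper = 70.222  # upper tail probability 0.025
chi2_lower = 31.555  # upper tail probability 0.975

# Lower limit uses the UPPER chi-square value, upper limit uses the LOWER one.
sigma_low = math.sqrt((n - 1) * s**2 / chi2_upper)
sigma_high = math.sqrt((n - 1) * s**2 / chi2_lower)

# Share of the population fitting a 52.6 cm seat if sigma is at the upper limit.
share = NormalDist(43.2, sigma_high).cdf(52.6)

print(f"95% interval for sigma: ({sigma_low:.3f}, {sigma_high:.3f})")
print(f"Share fitting a 52.6 cm seat: {share:.3f}")
```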
Testing equality of 2 populations using independent samples

Hypothesis testing steps for comparing 2 populations:
1. Write down all info given + DEFINE all variables + specify 1 tail / 2 tail
2. Set up the hypotheses in terms of the difference (use this form): H0: μ1 − μ2 = 0, with HA stated on μ1 − μ2 according to the tail
3. Determine test statistics
4. Calculate the value of test statistics AND sampling distribution
5. Determine α & critical value
6. Define the decision rule
7. Conclude & answer the question

** Which test statistic to use (step 3):
- Equality of 2 pop means, INDEPENDENT samples → t stat / t stat with pooled variance
- Equality of 2 pop means, MATCHED PAIRS → test 1 pop mean $\mu_D$ with t stat
- Equality of 2 pop variances → F statistic
- Whether X is normally distributed → Jarque-Bera statistic

2 types of test statistics to test 2 pop means for INDEPENDENT samples:
1. UNEQUAL pop variances → t stat
   $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
   ** df (v), NO need to memorise: $v = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$
   The S.E. MUST be an addition!!! It is the total variability of both populations.
2. EQUAL pop variances → t stat with pooled variance
   $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_P^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
   Pooled variance $s_P^2$ = weighted average of sample variances 1 & 2:
   $s_P^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
   ** Degrees of freedom (v) in the t stat = $n_1 + n_2 - 2$
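The two independent sample t statistics can be compared side by side in a short Python sketch. All sample figures below are hypothetical and chosen only to keep the arithmetic easy to follow; the unequal variance degrees of freedom use the Welch-Satterthwaite formula, which is assumed here to be the "no need to memorise" formula the notes refer to.

```python
import math

def pooled_variance(n1, var1, n2, var2):
    """Weighted average of the two sample variances (weights are the dfs)."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

def t_pooled(mean1, mean2, n1, var1, n2, var2, diff0=0.0):
    """t stat assuming EQUAL population variances; df = n1 + n2 - 2."""
    sp2 = pooled_variance(n1, var1, n2, var2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2 - diff0) / se, n1 + n2 - 2

def t_unequal(mean1, mean2, n1, var1, n2, var2, diff0=0.0):
    """t stat assuming UNEQUAL population variances (Welch-Satterthwaite df)."""
    a, b = var1 / n1, var2 / n2
    se = math.sqrt(a + b)  # standard error is a SUM of the two variabilities
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return (mean1 - mean2 - diff0) / se, df

# Hypothetical sample statistics (not from the notes).
sample = dict(mean1=5.0, mean2=3.0, n1=10, var1=4.0, n2=12, var2=9.0)

sp2 = pooled_variance(10, 4.0, 12, 9.0)
t_eq, df_eq = t_pooled(**sample)
t_uneq, df_uneq = t_unequal(**sample)

print(f"pooled variance = {sp2}, t = {t_eq:.3f}, df = {df_eq}")
print(f"unequal variances: t = {t_uneq:.3f}, df = {df_uneq:.2f}")
```

The pooled version uses a single variance estimate and a whole-number df, while the unequal variance version keeps the two variances separate and usually gives a fractional df.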
Testing the equality of 2 pop means μ1 and μ2 using independent samples:
Do people who talk on phones while driving have slower reactions?

μ1 = average reaction time of all phone talking (PT) drivers
$\bar{X}_1$ = average reaction time of a sample of phone talking (PT) drivers
μ2 = average reaction time of all non phone talking (NPT) drivers
$\bar{X}_2$ = average reaction time of a sample of non phone talking (NPT) drivers

1. Write down all info given + DEFINE all variables AND specify 1 tail / 2 tail
   n1 = 125, n2 = 145
   1 tail test (UPPER), because slower reaction = longer time.
2. Set up the hypotheses
   H0: μ1 − μ2 = 0
   HA: μ1 − μ2 > 0
3. Determine test statistics AND sampling distribution
   Assume the pop variances σ² are equal → t statistic with pooled variance:
   $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_P^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim t_{268}$
   Degrees of freedom = 125 + 145 − 2 = 268
4. Calculate test statistic value AND p value
   When H0 is true, μ1 − μ2 = 0.
   t value = 7.44, p value = 0.00000000068 (EViews)
5. Determine significance level & critical value
   α = 0.05, $t_{0.05,\,268} = 1.651$
6. Define decision rule
   Reject H0 if t value > 1.651 OR if p value < 0.05
7. Conclude & answer the question
   Since 7.44 > 1.651 and 0.00000000068 < 0.05, H0 is rejected. There is sufficient evidence that phone talking drivers have slower reactions.
F distribution for testing the equality of 2 pop variances

F statistic: derived by comparing the chi-square values of the 2 pop variances, each divided by its own df.
F stat = ratio of sample variances 1 & 2: $F = \dfrac{s_1^2}{s_2^2}$
** Center = 1 (for Z stats & t stats the center = 0)
ASSUME independent samples.
Notation: F(prob of upper tail, v1, v2), where v1 = numerator df and v2 = denominator df

Question: Is it reasonable to assume that pop var 1 = pop var 2?

Testing the equality of 2 pop variances:
1. Write all info given + DEFINE all variables + specify 1 tail / 2 tail (2 tail test)
2. Set up the hypotheses
   H0: $\sigma_1^2 = \sigma_2^2$
   HA: $\sigma_1^2 \ne \sigma_2^2$
3. Determine test statistics
   Test equality of 2 pop variances → F statistic
4. Calculate test statistic value
   Finding the p value of the F statistic for a 2 tail test:
   If F value > 1: 2 x probability on upper tail
   If F value < 1: 2 x probability on lower tail
5. * Determine significance level & critical values: α = 0.05
   Finding the upper critical value:
   ** Use the F table that gives the probability of the upper tail.
   F(probability on upper tail, v1, v2) = F(0.025, 124, 144) = 1.41
   ** Finding the lower critical value (the distribution is skewed, so you cannot simply put a negative sign on the upper value). 2 steps:
   1. Write the notation F(probability on upper tail, v1, v2)
   2. Do 3 flips:
      a. Flip the F value → 1 / F value
      b. Flip the probability → probability of the lower tail
      c. Flip the degrees of freedom → v2, v1
   F(0.975, 124, 144) = 1 / F(0.025, 144, 124) = 0.70
6. Define decision rule
   Reject H0 if F value > 1.41 OR F value < 0.70
7. Conclude & answer the question
   Since 0.70 < 0.728 < 1.41, H0 is not rejected. There is not sufficient evidence that the pop variances differ, so it is reasonable to assume that they are equal.
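The "3 flips" rule and the two tail decision for the F test can be sketched as follows. The sample variances are hypothetical, picked only so that their ratio reproduces the F value 0.728 from the notes, and the table value F(0.025, 144, 124) ≈ 1.42 is an assumption (the notes give only its reciprocal, 0.70).

```python
def f_statistic(sample_var_1, sample_var_2):
    """F stat = ratio of sample variance 1 to sample variance 2."""
    return sample_var_1 / sample_var_2

def lower_critical_value(upper_critical_swapped_df):
    """Lower critical value = 1 / upper critical value with v1 and v2 swapped."""
    return 1.0 / upper_critical_swapped_df

def reject_two_tail(f_value, lower_crit, upper_crit):
    """Reject H0 if F falls below the lower or above the upper critical value."""
    return f_value < lower_crit or f_value > upper_crit

# Hypothetical sample variances giving F = 0.728 as in the notes.
f_value = f_statistic(7.28, 10.0)

upper_crit = 1.41                        # F(0.025, 124, 144), from the notes
lower_crit = lower_critical_value(1.42)  # assumed table value F(0.025, 144, 124)

decision = reject_two_tail(f_value, lower_crit, upper_crit)
print(f"F = {f_value:.3f}, critical values = ({lower_crit:.2f}, {upper_crit}), "
      f"reject H0: {decision}")
```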
Testing whether or not data observations are normally distributed

3 definitions:
1. Variance of a random variable X: $\sigma^2 = E\left[(X - \mu)^2\right]$
2. Skewness (SK): $SK = \dfrac{E\left[(X - \mu)^3\right]}{\sigma^3}$
   Tips: similar to variance, but a cubed (triple) difference instead of a squared difference, divided by pop sd^3.
   When random variable X is normally distributed, SK = 0.
   Finding sample skewness: average of the cubed differences between X and the mean, divided by sample sd^3.
3. Kurtosis (K): the sharpness of the peak of a distribution curve, $K = \dfrac{E\left[(X - \mu)^4\right]}{\sigma^4}$
   Tips: quadruple (4th power) difference, divided by pop sd^4.
   When random variable X is normally distributed, K = 3.
   Finding sample kurtosis: average of the 4th power differences between X and the mean, divided by sample sd^4.

To test whether random variable X is normally distributed:
1. Set up H0
   H0: X is normally distributed
2. Determine test statistics
   Jarque-Bera (JB) statistic: tests whether or not data observations are normally distributed.
   $JB = \dfrac{n}{6}\left[SK^2 + \dfrac{(K - 3)^2}{4}\right]$ (NO need to memorise)
   - Large JB = SK is far from 0 and/or K is far from 3 → X is NOT normally distributed. Only reject H0 if JB > a specific value, so it is ALWAYS an upper tail test!!
   - ** When X is normally distributed, $JB \sim \chi^2$ with df = 2. Only df & α are needed to find the critical value $\chi^2_{(2,\,\alpha)}$.
3. Calculate test statistic values
   Sample SK & K, then the JB value.
4. Determine α = 0.05 and the critical value
   Critical value of JB = $\chi^2_{(2,\,0.05)} = 5.99$
5. Define decision rule
   Reject H0 if JB > 5.99
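The JB calculation in steps 3 to 5 can be illustrated with plain Python. This is a sketch, not the EViews routine: it assumes the sd used in the sample skewness and kurtosis divides by n (software conventions differ on n versus n − 1), and the five data points are hypothetical, far too few for the χ² approximation and used only to show the arithmetic.

```python
def jarque_bera(data):
    """Return (skewness, kurtosis, JB) using moments that divide by n."""
    n = len(data)
    mean = sum(data) / n
    # Average squared, cubed and 4th power differences from the mean.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    sd = m2 ** 0.5
    skewness = m3 / sd**3
    kurtosis = m4 / sd**4
    jb = n / 6 * (skewness**2 + (kurtosis - 3) ** 2 / 4)
    return skewness, kurtosis, jb

# Hypothetical, perfectly symmetric data set (skewness is exactly 0).
data = [1, 2, 3, 4, 5]
sk, k, jb = jarque_bera(data)

# Upper tail decision rule with the chi-square critical value for df = 2.
reject_h0 = jb > 5.99

print(f"SK = {sk:.3f}, K = {k:.3f}, JB = {jb:.4f}, reject H0: {reject_h0}")
```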