Hypothesis Tests for One Population Mean

Chapter 9
Ala Abdelbaki

Objective
Objective: To estimate the value of one population mean.
Inferential statistics: using statistics in order to estimate parameters.
We will be working with two inference procedures: (1) Z-Tests and (2) T-Tests.

The Basic Idea
According to long-standing research, the average life-span of all green sea turtles is 80 years. Suppose you are conducting new research on green sea turtles. If you are looking for a breakthrough finding, ideally, what would you like to happen?
Claim: Average life-span is 80 years.
Your counter-claim: Average life-span is not 80 years.
Note: In hypothesis testing, either (1) the claim is true, or (2) the counter-claim is true!

The Basic Idea (continued)
The claim: Null Hypothesis
The counter-claim: Alternative Hypothesis
H0: μ = 80 years (claim or status quo)
Ha: μ < 80 or Ha: μ > 80 or Ha: μ ≠ 80 (suspicions)
There are two possibilities:
Reject H0 in favor of Ha (i.e. claim is not true)
Fail to reject H0, or accept H0 (i.e. claim is true)

Example: Z-Test
For years, the commissioner of the NBA has claimed that the average GPA is 3.0. We suspect the average GPA is actually less than 3.0 (we usually want this to be true). To test this claim, let's take an SRS of 64 players. We obtain a sample mean GPA of 2.0. Assume a population standard deviation of 1.0. Using our sample mean, let's test the commissioner's claim at the α = 0.05 level of significance.

Example: Z-Test (continued)
Null Hypothesis: H0: μ = 3.0 (claim or status quo)
Alternative Hypothesis: Ha: μ < 3.0 (our suspicions)
There are only two possible solutions:
If the sample mean is significantly less than 3.0, then reject the null hypothesis.
If the sample mean is greater than or close enough to 3.0, then fail to reject the null hypothesis.
In our example, will we be inclined to reject or fail to reject the null hypothesis? (Let's make an educated guess!)

Example: Z-Test (continued)
We need to standardize the following and compare: (1) the established mean (i.e. the claim) and (2) the sample mean.
(1) We treat 3.0 as a hypothesized population mean, so let the associated z = 0 (always assume the claim to be true).
(2) To standardize the sample mean (called the test statistic):
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{2.0 - 3.0}{1/\sqrt{64}} = -8 \]

Example: Z-Test (continued)
This indicates that the sample mean lies 8 standard deviations below the claim! So, we will be inclined to reject the null hypothesis, in favor of the alternative hypothesis.
To complete the problem, we compute the probability of obtaining z = -8 or less, that is, P(z ≤ -8). This value is called the p-value.
If p is small (i.e. p ≤ α), we reject H0.
If p is large (i.e. p > α), we fail to reject H0.
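As a rough check of the arithmetic in the NBA example, the following Python sketch recomputes the test statistic and the left-tailed p-value. The variable names and the helper normal_cdf are illustrative assumptions, not part of the slides; the numbers should agree with what a calculator's Z-Test reports.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """P(Z <= z) for a standard normal Z, written with erfc so tiny tails stay accurate."""
    return 0.5 * erfc(-z / sqrt(2))

# Values from the NBA example; the names are illustrative.
mu0, xbar, sigma, n, alpha = 3.0, 2.0, 1.0, 64, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))  # standardized sample mean (test statistic)
p_value = normal_cdf(z)               # left-tailed test: P(Z <= z)

decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.1f}, p = {p_value:.1e}, decision: {decision}")
```

The p-value is positive but far smaller than any usual α, which is why it shows up as 0 after rounding.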

Example: Z-Test (continued)
[1] We first conduct a Z-Test: Stat > Tests > Z-Test > Stats
[2] After entering the statistics, highlight the following: μ0: 3.0; σ = 1; x̄ = 2.0; n = 64; μ: < μ0
[3] Highlight Calculate and press Enter

Example: Z-Test (continued)
H0: μ = 3.0 (status quo)
Ha: μ < 3.0 (our suspicions)
Assumptions: (1) SRS; (2) Normal population or large sample; (3) Population standard deviation is given
Test statistic: z = -8.0
p ≈ 0
Conclusion: Since p ≤ 0.05, reject the null hypothesis in favor of the alternative hypothesis! At the 5% level of significance, the data provide sufficient evidence to conclude that the mean high school GPA among all NBA players is actually less than 3.0.

Example: Critical-Value (continued)
H0: μ = 3.0 (status quo)
Ha: μ < 3.0 (our suspicions)
Critical-Value Approach: Using either a z-table or the invNorm(0.05) command on your calculator, the critical value is -1.645 (since this is a left-tailed test).
Definition: The critical value separates the rejection region from the non-rejection region.
Conclusion: Since the test statistic z = -8.0 lies in the rejection region, we reject the null hypothesis in favor of the alternative hypothesis!
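The critical-value comparison can also be sketched in Python. This is only an illustration: it uses the standard library's NormalDist.inv_cdf in place of the calculator's invNorm, and the variable names are assumptions.

```python
from statistics import NormalDist

alpha = 0.05
z_stat = -8.0                          # test statistic from the NBA example
z_crit = NormalDist().inv_cdf(alpha)   # left-tail critical value (area alpha to its left)

# For a left-tailed test the rejection region is everything at or below z_crit.
in_rejection_region = z_stat <= z_crit
print(f"critical value = {z_crit:.3f}")
print("reject H0" if in_rejection_region else "fail to reject H0")
```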

Example: Critical-Value (continued)
Steps to hypothesis testing:
1. Formulate the two hypotheses, H0 versus Ha
2. Calculate the test statistic and the relevant p-value
3. Compare the p-value with the stated significance level (or compare the test statistic to the critical value)
4. State your conclusion in context

Some Preliminary Remarks
The researcher usually wants to reject the null hypothesis H0 in favor of the alternative hypothesis Ha.
There are two approaches to hypothesis testing: (1) the p-value approach or (2) the critical-value approach.
In the p-value approach: If p is small (i.e. p ≤ α), we reject H0. If p is large (i.e. p > α), we fail to reject H0.
In the critical-value approach: If the test statistic lies in the rejection region, then reject H0. If the test statistic does not lie in the rejection region, then fail to reject H0.

Some Definitions

Null and Alternative Hypotheses

Example: Z-Test
A company that produces snack foods uses a machine to package 454 g bags of pretzels. We assume that the net weights are normally distributed and that the population standard deviation of all such weights is 7.8 g. A simple random sample of 25 bags of pretzels has the net weights, in grams, displayed in the table below. Do the data provide sufficient evidence to conclude that the packaging machine is not working properly? Use alpha equal to 0.05.

Example: Z-Test (continued)
H0: μ = 454 g (machine is working properly)
Ha: μ ≠ 454 g (machine is not working properly)
Assumptions: The three assumptions are satisfied (what are they?)
Test statistic (this is the standardized sample mean):
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{450 - 454}{7.8/\sqrt{25}} = -2.56 \]
p = 0.0103
Conclusion: Since p ≤ 0.05, reject the null hypothesis in favor of the alternative hypothesis! At the 5% level of significance, the data provide sufficient evidence to conclude that the machine is NOT working properly.
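For a two-tailed test the p-value counts both tails. The short Python sketch below illustrates this for the pretzel example, using the sample mean of 450 g that appears in the test statistic; the variable names are assumptions.

```python
from math import erfc, sqrt

# Values from the pretzel example; the names are illustrative.
mu0, xbar, sigma, n, alpha = 454.0, 450.0, 7.8, 25, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))   # standardized sample mean
p_value = erfc(abs(z) / sqrt(2))       # two-tailed: 2 * P(Z >= |z|)

decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, p = {p_value:.4f}, decision: {decision}")
```

The identity used here is that 2 * P(Z >= |z|) equals erfc(|z| / sqrt(2)) for a standard normal Z.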

Example: Critical-Value (solution)
H0: μ = 454 g (machine is working properly)
Ha: μ ≠ 454 g (machine is not working properly)
Critical-Value Approach: Using either a z-table or the invNorm(0.025) command on your calculator, the critical values are -1.96 and 1.96 (this is a two-tailed test).
Conclusion: Since the test statistic z = -2.56 lies in the rejection region, we reject the null hypothesis in favor of the alternative hypothesis! At the 5% level of significance, the data provide sufficient evidence to conclude that the machine is not working properly.

Example: Critical-Value (solution)

Example: Z-Test
The U.S. Department of Agriculture reports that the mean cost of raising a child from birth to age 2 in a rural area is $8,390. You believe that it costs more than $8,390 to raise a child. You select a random sample of 900 children (each with age 2) and find the mean cost is $8,425. The population standard deviation is $1,540. At the 0.10 level of significance, is there enough evidence to conclude that the mean cost is greater than $8,390?

Example: Z-Test (continued)
H0: μ = $8,390 (claim)
Ha: μ > $8,390 (our suspicions)
Assumptions: The 3 assumptions are satisfied (what are they?)
Test statistic (this is the standardized sample mean):
\[ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{8425 - 8390}{1540/\sqrt{900}} = 0.682 \]
p = 0.248
Conclusion: Since p > 0.10, fail to reject the null hypothesis! At the 10% level of significance, we do not have enough evidence to reject the department's claim that the mean cost for raising a child from birth to age 2 is $8,390.

Example: Critical-Value (solution)
H0: μ = $8,390 (claim)
Ha: μ > $8,390 (our suspicions)
Critical-Value Approach: Using either a z-table or the invNorm command on your calculator (invNorm(0.10) returns -1.28, so flip the sign for the right tail), the critical value is 1.28 (this is a right-tailed test).
Conclusion: Since the test statistic z = 0.682 does NOT lie in the rejection region, we fail to reject the null hypothesis! At the 10% level of significance, we do not have enough evidence to reject the department's claim that the mean cost for raising a child from birth to age 2 is $8,390.
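For a right-tailed test, the critical value has area α to its right, so it is the inverse CDF at 1 - α. The sketch below works through the child-cost example both ways (critical value and p-value); it is an illustration with assumed variable names, using the standard library's NormalDist rather than a calculator.

```python
from math import sqrt
from statistics import NormalDist

# Values from the child-cost example; the names are illustrative.
mu0, xbar, sigma, n, alpha = 8390.0, 8425.0, 1540.0, 900, 0.10

std_normal = NormalDist()
z = (xbar - mu0) / (sigma / sqrt(n))       # test statistic
z_crit = std_normal.inv_cdf(1 - alpha)     # right-tail critical value
p_value = 1 - std_normal.cdf(z)            # right-tailed: P(Z >= z)

print(f"z = {z:.3f}, critical value = {z_crit:.2f}, p = {p_value:.3f}")
print("reject H0" if z >= z_crit else "fail to reject H0")
```

Both approaches give the same decision: z falls short of the critical value exactly when p exceeds α.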

Example: Z-Test
Suppose a battery manufacturer claims that the mean life of its AA batteries is 300 minutes. You suspect that this claim is not valid, and that the batteries do not actually last this long. To test the claim, a researcher takes a simple random sample of n = 100 batteries and measures the life of each. Suppose we obtain a sample mean of 294 minutes. Assume that the population standard deviation is 20 minutes. Does this indicate that the manufacturer's claim is wrong? Use a 0.05 level of significance.

Example: Z-Test (continued)
1. Which hypotheses are used to establish the claim?
(a) H0: μ = 294; Ha: μ < 294
(b) H0: μ = 300; Ha: μ < 294
(c) H0: μ = 300; Ha: μ < 300
(d) H0: μ = 300; Ha: μ ≠ 294
2. In this problem μ represents
(a) the mean life of all AA batteries
(b) the sample mean life of the 300 AA batteries
(c) the hypothetical value μ0
(d) parts (a) and (c)

Example: Z-Test (continued)
3. In this problem 300 is the numerical value of
(a) a parameter
(b) a statistic
(c) both a parameter and a statistic
(d) neither a parameter nor a statistic
4. In this problem 300 is the numerical value of
(a) a hypothetical value μ0
(b) the population mean
(c) the sample mean of a set of batteries
(d) none of the above

Example: Z-Test (solution)
H0: μ = 300 (status quo)
Ha: μ < 300 (our suspicions)
Assumptions: (1) SRS; (2) Normal population or large sample; (3) Population standard deviation is given
Test statistic: z = -3.0
p = 0.0013
Conclusion: Since p ≤ 0.05, reject the null hypothesis in favor of the alternative hypothesis! At the 5% level of significance, the data provide sufficient evidence to conclude that all AA batteries last less than 300 minutes, on average.

Example: Critical-Value (solution)
H0: μ = 300 (status quo)
Ha: μ < 300 (our suspicions)
Critical-Value Approach: Using either a z-table or the invNorm(0.05) command on your calculator, the critical value is -1.645 (this is a left-tailed test).
Conclusion: Since the test statistic z = -3.0 lies in the rejection region, we reject the null hypothesis! At the 5% level of significance, the data provide sufficient evidence to conclude that all AA batteries last less than 300 minutes, on average.
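The Z-test examples so far differ only in their numbers and in which tail is tested. One way to capture that is a small helper, sketched here in Python. The function z_test and its tail argument are hypothetical names introduced for illustration; they are not a calculator or library API.

```python
from math import erfc, sqrt

def z_test(xbar, mu0, sigma, n, tail):
    """Return (z, p) for a one-mean z-test; tail is 'left', 'right', or 'two'."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    if tail == "left":
        p = 0.5 * erfc(-z / sqrt(2))      # P(Z <= z)
    elif tail == "right":
        p = 0.5 * erfc(z / sqrt(2))       # P(Z >= z)
    elif tail == "two":
        p = erfc(abs(z) / sqrt(2))        # 2 * P(Z >= |z|)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p

# Battery example: Ha: mu < 300, so the test is left-tailed.
z, p = z_test(xbar=294, mu0=300, sigma=20, n=100, tail="left")
print(f"z = {z:.1f}, p = {p:.4f}")
print("reject H0" if p <= 0.05 else "fail to reject H0")
```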

Example: T-test
An industrial company claims that the mean pH level of water in NOVA Loudoun's pond is 6.8. However, we suspect that the mean pH level is greater than 6.8. To test this claim, you randomly select 36 water samples from the pond and measure the pH of each. The sample mean and sample standard deviation are 6.85 and 0.24, respectively. Is there enough evidence to reject the company's claim at the 0.08 level of significance?

Example: T-test (solution)
H0: μ = 6.8 (claim)
Ha: μ > 6.8 (our suspicions)
Assumptions: The 3 assumptions are satisfied (what are they?)
Test statistic (this is the standardized sample mean):
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{6.85 - 6.8}{0.24/\sqrt{36}} = 1.25 \]
p = 0.110, df = 35
Conclusion: Since p > 0.08, fail to reject the null hypothesis! At the 8% level of significance, we do not have enough evidence to reject the industrial company's claim that the mean pH level in NOVA's pond (filled with geese) is equal to 6.8.
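The t statistic is computed like the z statistic, with s in place of σ, but the p-value comes from a t distribution with n - 1 degrees of freedom. Python's standard library has no t distribution, so the sketch below integrates the t density numerically with Simpson's rule. That integration is a stand-in chosen for this illustration, not the method the slides or a calculator use, and the helper names t_pdf and t_upper_tail are assumptions.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # integral of the density from 0 to |t|
    tail = 0.5 - area             # P(T >= |t|), by symmetry about 0
    return tail if t >= 0 else 1.0 - tail

# Values from the pond example; the names are illustrative.
mu0, xbar, s, n, alpha = 6.8, 6.85, 0.24, 36, 0.08
df = n - 1
t_stat = (xbar - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t_stat, df)        # right-tailed test

print(f"t = {t_stat:.2f}, df = {df}, p = {p_value:.3f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```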

Example: Critical-Value (solution)
H0: μ = 6.8 (claim)
Ha: μ > 6.8 (our suspicions)
Critical-Value Approach: Using either a t-table or the invT command on your calculator (invT(0.08, 35) returns -1.44, so flip the sign for the right tail), the critical value is 1.44 (this is a right-tailed test).
Conclusion: Since the test statistic t = 1.25 does NOT lie in the rejection region, we fail to reject the null hypothesis! At the 8% level of significance, we do not have enough evidence to reject the industrial company's claim that the mean pH level in NOVA's pond (filled with geese) is equal to 6.8.

Example: T-test
A company manufactures ball bearings that are to be 1.25 inches in diameter. A random sample of 81 ball bearings yields a mean diameter of 1.247 inches. With a sample standard deviation of 0.0118 inches, is there evidence to conclude that the manufacturing process needs adjusting? Test the claim at the 0.05 level of significance.

Example: T-test (continued)
H0: μ = 1.25 (manufacturing does not need adjusting)
Ha: μ ≠ 1.25 (manufacturing needs adjusting)
Assumptions: The 3 assumptions are satisfied (what are they?)
Test statistic (this is the standardized sample mean):
\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{1.247 - 1.25}{0.0118/\sqrt{81}} = -2.29 \]
p = 0.025, df = 80
Conclusion: Since p ≤ 0.05, reject the null hypothesis! At the 5% level of significance, we have sufficient evidence to conclude that the manufacturing process needs adjusting.

Example: Critical-Value (solution)
H0: μ = 1.25 (manufacturing does not need adjusting)
Ha: μ ≠ 1.25 (manufacturing needs adjusting)
Critical-Value Approach: Using either a t-table or the invT(0.025, 80) command on your calculator, the critical values are -1.99 and 1.99 (this is a two-tailed test).
Conclusion: Since the test statistic t = -2.29 lies in the rejection region, we reject the null hypothesis! At the 5% level of significance, we have sufficient evidence to conclude that the manufacturing process needs adjusting.
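In a two-tailed test the rejection region has two pieces, so it is convenient to compare the absolute value of the test statistic with the positive critical value. A minimal Python sketch for the ball-bearing example follows; the critical value 1.99 is taken from the slide rather than computed, and the variable names are assumptions.

```python
from math import sqrt

# Values from the ball-bearing example; the names are illustrative.
mu0, xbar, s, n = 1.25, 1.247, 0.0118, 81
t_crit = 1.99   # two-tailed critical value quoted in the slides (df = 80, alpha = 0.05)

t_stat = (xbar - mu0) / (s / sqrt(n))
reject = abs(t_stat) >= t_crit   # rejection region: t <= -1.99 or t >= 1.99

print(f"t = {t_stat:.2f}")
print("reject H0" if reject else "fail to reject H0")
```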

Example: T-test
An employment information service says that the mean annual pay for full-time female workers over age 25 and without high school diplomas is $22,000. The annual pay for a random sample of 12 full-time female workers without high school diplomas is listed below. At the 0.05 level of significance, test the claim that the mean salary is $22,000, with suspicions that the actual mean salary is less than this amount.
19,009 22,790 18,328 18,161 16,631 23,028
19,114 17,176 17,503 19,764 19,316 20,801

Example: T-test (continued)
[1] We first enter the data in a list: Stat > Edit > Enter data in L1
[2] Now we conduct a T-test: Stat > Tests > T-Test > Data

Example: T-test (solution)
H0: μ = $22,000 (claim)
Ha: μ < $22,000 (our suspicions)
Assumptions: The 3 assumptions are satisfied (what are they?)
Test statistic: t = -4.58
p < 0.001, df = 11
Conclusion: Since p ≤ 0.05, reject the null hypothesis! At the 5% level of significance, we have sufficient evidence to conclude that the mean annual pay among all female workers without high school diplomas is less than $22,000.

Example: Critical-Value (continued)
H0: μ = $22,000 (claim)
Ha: μ < $22,000 (our suspicions)
Critical-Value Approach: Using either a t-table or the invT(0.05, 11) command on your calculator, the critical value is -1.80 (this is a left-tailed test).
Conclusion: Since the test statistic t = -4.58 lies in the rejection region, we reject the null hypothesis! At the 5% level of significance, we have sufficient evidence to conclude that the mean annual pay among all female workers without high school diplomas is less than $22,000.
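When the raw data are given, the sample mean and sample standard deviation have to be computed first. The sketch below reproduces the test statistic for the salary data in Python; the critical value -1.80 is the one quoted in the slides, and the variable names are assumptions.

```python
from math import sqrt
from statistics import mean, stdev

# Annual pay for the 12 sampled workers, as listed in the example.
pay = [19009, 22790, 18328, 18161, 16631, 23028,
       19114, 17176, 17503, 19764, 19316, 20801]
mu0 = 22000
t_crit = -1.80   # left-tailed critical value quoted in the slides (df = 11, alpha = 0.05)

n = len(pay)
df = n - 1
xbar = mean(pay)     # sample mean
s = stdev(pay)       # sample standard deviation (divides by n - 1)
t_stat = (xbar - mu0) / (s / sqrt(n))
reject = t_stat <= t_crit

print(f"mean = {xbar:.2f}, s = {s:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject else "fail to reject H0")
```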

More on p-values

Make a decision
Determine whether the researcher should reject or fail to reject the null hypothesis in each of the following cases:
1. A right-tailed hypothesis test at level α = 0.05, and the p-value is 0.02.
2. A two-tailed hypothesis test at level α = 0.05, and the p-value is 0.05.
3. A left-tailed hypothesis test at level α = 0.01, and the p-value is 0.08.
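Once the p-value is known, the decision depends only on how it compares with α; the direction of the test has already been used in computing p. A small Python sketch of the rule "reject when p ≤ α", applied to the three cases above (the function name decide is an assumption):

```python
def decide(p_value, alpha):
    """p-value approach: reject H0 exactly when p <= alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# (alpha, p-value) for cases 1 to 3
cases = [(0.05, 0.02), (0.05, 0.05), (0.01, 0.08)]
decisions = [decide(p, alpha) for alpha, p in cases]

for number, decision in enumerate(decisions, start=1):
    print(f"case {number}: {decision}")
```

Case 2 sits exactly on the boundary; under the rule p ≤ α it counts as a rejection.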

Make a decision
Determine whether the researcher should reject or fail to reject the null hypothesis in each of the following cases:
1. A right-tailed hypothesis test at level α = 0.05. The critical value is 1.645, and the test statistic is 2.49.
2. A two-tailed hypothesis test at level α = 0.05. The critical values are -1.96 and 1.96, and the test statistic is -2.01.
3. A two-tailed hypothesis test at level α = 0.10, and the test statistic is -1.73.
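The critical-value decisions can be sketched the same way. Case 3 gives no critical values, so the sketch below assumes all three cases are z-tests and computes the critical values from the standard normal distribution; that assumption, and the function name rejects, are not from the slides.

```python
from statistics import NormalDist

def rejects(stat, alpha, tail):
    """Critical-value decision for a z statistic; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()
    if tail == "left":
        return stat <= std_normal.inv_cdf(alpha)
    if tail == "right":
        return stat >= std_normal.inv_cdf(1 - alpha)
    if tail == "two":
        return abs(stat) >= std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")

decisions = [
    rejects(2.49, 0.05, "right"),   # case 1: critical value about 1.645
    rejects(-2.01, 0.05, "two"),    # case 2: critical values about -1.96 and 1.96
    rejects(-1.73, 0.10, "two"),    # case 3: critical values about -1.645 and 1.645
]
print(decisions)
```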

Never 100% Sure
In hypothesis testing, since our decision is based upon incomplete information (a sample rather than an entire population), there is always the possibility that the researcher will make the wrong decision. There are two types of errors: Type I Error and Type II Error.

Type I and II Errors
From the previous example, recall the following hypotheses:
H0: μ = $22,000
Ha: μ < $22,000
A Type I error occurs if the null hypothesis is in fact true, but the results of the sampling lead us to reject the null hypothesis.
A Type II error occurs if the null hypothesis is in fact false, but the results of the sampling lead us to fail to reject the null hypothesis.

Type I Error (continued)
A Type I error occurs when a true null hypothesis is rejected. In this case, a Type I error would occur if in fact μ = $22,000, yet the results of the sampling lead to the conclusion that μ < $22,000.
Interpretation: A Type I error occurs if we conclude that the mean salary of all female workers without high school diplomas is less than $22,000, when in fact the mean salary is equal to $22,000.

Type II Error (continued)
A Type II error occurs when a false null hypothesis is not rejected. In this case, a Type II error would occur if in fact μ < $22,000, yet the results of the sampling lead us to fail to reject μ = $22,000.
Interpretation: A Type II error occurs if we fail to reject the claim that the mean salary of all female workers without high school diplomas is equal to $22,000, when in fact the mean salary is less than $22,000.
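A simulation can make the Type I error concrete. The Python sketch below repeatedly draws samples from a population in which H0 is true and counts how often a left-tailed test rejects anyway. To keep it short it uses a z-test with a known σ rather than the t-test of the salary example, and σ = 2000, the number of trials, and the seed are hypothetical choices, not values from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(0)                                   # fixed seed so the run is repeatable
mu0, sigma, n, alpha = 22000, 2000, 12, 0.05     # sigma is a hypothetical value
z_crit = NormalDist().inv_cdf(alpha)             # left-tailed critical value

trials = 20000
rejections = 0
for _ in range(trials):
    # H0 is true here: every observation is drawn with mean exactly mu0.
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
    if z <= z_crit:
        rejections += 1                          # rejecting a true H0 is a Type I error

type_i_rate = rejections / trials
print(f"estimated Type I error rate: {type_i_rate:.3f} (alpha = {alpha})")
```

Because the test rejects whenever z falls in a region of probability α under H0, the estimated rate should come out close to α.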

*Definition: Power

When to use Z- or T-test
For small samples (say, less than 15): the procedure can be used as long as the population is normal or close to being normal.
For moderate samples (say, from 15 to 29): the procedure can be used as long as the data do not contain outliers, and as long as the population is not far from being normally distributed.
For large samples (30 or greater): the procedure can be used without restriction.

Summary
Basic logic: Assume the null hypothesis H0 is true, and either reject H0 or fail to reject H0
Null hypothesis usually has an equality sign
Researcher typically wants a small p-value
Two approaches: (1) p-value or (2) critical-value
Two types of errors: Type I and Type II