The one-sample t test for a population mean
- Shannon Baldwin
- 6 years ago
Transcription
1 Objectives: Constructing and assessing hypotheses. The t-statistic and the P-value. Statistical significance. The one-sample t test for a population mean. One-sided versus two-sided tests. Further reading: OS3, Sections 4.3 (using known population standard deviation) (using estimated population standard deviation). ISP: Sections 6.2 and 7.1.
2 Topics: Understanding hypothesis tests. Learning objectives: Be able to construct the appropriate null and alternative hypotheses based on what one wants to learn (the null hypothesis will always have the equal sign embedded in it). Understand that a statistical test is based on assessing the plausibility of the data having been generated when the null hypothesis is true. If the probability is large, then the null is plausible and we cannot reject the null hypothesis. This does not prove the null; it simply states that the null is possible. If the probability is small, the null seems implausible, and we reject the null hypothesis and determine the alternative to be true.
3 Examples of hypotheses. We call H0 the null hypothesis and HA the alternative hypothesis. Examples include: Comparing product reviews. H0: The reviews of two products are the same. HA: The reviews of two products are different. Wine consumption and health. H0: Regular consumption of wine has no effect on polyphenol levels in the blood. HA: Regular consumption of wine increases polyphenol levels in the blood. Birds and flight. H0: Bird species A cannot fly. HA: Bird species A can fly.
4 Motivation
5 Example 1: Flying Birds. Based on empirical observations we want to answer the following question: Can bird species A fly? We write this as a hypothesis (conjecture). Null Hypothesis: H0: Bird species A cannot fly. Alternative Hypothesis: HA: Bird species A can fly. Based on what we observe, is the null plausible? Scenario 1: You see one bird fly. You have immediately disproven the null (that this species of birds cannot fly) and this proves the alternative. Scenario 2: None of the birds are flying (maybe there is too much food to even attempt it). All this is consistent with the null being true; however, it does not prove the null (they may fly later on). In this situation we say there is no evidence in the data to prove the alternative.
6 Setting up the hypothesis
7 Example 2: Comparing product reviews. The Coffea (left) has an average review of 4.4 (over 261 customers). The Smart Lintelek (right) has an average review of 4.8 (over 58 customers). Over these customers the Smart Lintelek scored highly. However, just comparing the sampled customers does not take into account sampling variability.
8 H0: The reviews of these two products are the same. HA: The reviews of these two products are different. In an ideal world all tracker customers would rate both devices and we would be able to compare the mean ratings over both. Our hypothesis should be based on the mean ratings of all customers (which in reality can never be observed). Let µ_C denote the mean rating of the Coffea over all tracker customers. Let µ_SM denote the mean rating of the Smart Lintelek over all tracker customers. The hypothesis we want to investigate is H0: µ_C = µ_SM vs HA: µ_C ≠ µ_SM. The null is that globally they would get the same mean ratings. We write this as µ_C - µ_SM = 0, which is the same as H0: µ_C = µ_SM. The alternative is that globally they would get different mean ratings. We write this as µ_C - µ_SM ≠ 0, which is the same as HA: µ_C ≠ µ_SM.
9 Example 3: Buying a product. The Smart Lintelek has an average rating of 4.8 (over 58 customers). This is great, but the sample size is small. I only buy products if I am confident that the population mean is not 4 or below. H0: µ ≤ 4.0 vs HA: µ > 4.0 (I'll buy).
10 Looking at the rating 4.8, it is clearly greater than 4.0. This is great. But there is also a doubt in my mind. Could it be that a sample mean of 4.8 can arise when the population mean is 4.0? I want to calculate the chance of this happening. We will calculate this chance both by hand and using Statcrunch. If the chance turns out to be small, I can be confident that the population mean is not 4.0 and must be greater. I will then go on to buy the product (since I have rejected the null).
11 The null hypothesis is a very specific statement about parameter(s) of the population(s). It is labeled H0. This is the hypothesis we assess. The null should always have an equal sign in it: either =, ≤ or ≥. The alternative hypothesis is a more general statement about the parameter(s) that is exclusive of the null hypothesis. It is labeled HA. In all hypothesis tests the focus is only on the null hypothesis and assessing its plausibility based on the data.
12 Setting up the hypothesis and understanding when it is immediately clear that the null is plausible and we cannot reject it.
13 Example 4: Buying a product. The Teslasz has an average (sample mean) rating of 3.3. I only buy products if I can be sure that the population mean of the reviews is over 4.0. H0: µ ≤ 4.0 vs HA: µ > 4.0 (I'll buy). With a sample mean of 3.3, I am certainly not going to buy this one. There is no evidence in the sample that the mean is greater than 4.0. I cannot reject the null.
14 H0: µ ≤ 4.0 vs HA: µ > 4.0 (I'll buy). The sample mean will usually be close (in some sense) to the population mean. When the sample mean is 3.3, the true mean could easily lie around 3.3, which is less than 4.0. This tells us that the null hypothesis is plausible and explains why I cannot reject the null.
15 Question Time. It is known that freshman biology has mean score 75%. A professor thinks that students who attend early morning classes have a higher mean score. Her early morning class this year can be considered as a sample of all students who take an early morning class. What is the hypothesis of interest? (A) H0: μ ≥ 75% against HA: μ < 75%. (B) H0: μ ≤ 75% against HA: μ > 75%. (C) H0: μ = 75% against HA: μ ≠ 75%. (D) H0: μ < 75% against HA: μ ≥ 75%.
16 Question Time. It is known that freshman biology has mean score 75%. A professor thinks that students who attend early morning classes have a higher mean score. Her early morning class this year can be considered as a sample of all students who take an early morning class. The sample mean (average grade) in her class is 78%. What is the hypothesis of interest? (A) H0: μ ≥ 78% against HA: μ < 78%. (B) H0: μ ≤ 78% against HA: μ > 78%. (C) H0: μ = 75% against HA: μ ≠ 75%. (D) H0: μ ≤ 75% against HA: μ > 75%. The stated hypothesis should never be based on the data.
17 Question Time. The price of gasoline has changed. Previously the mean yearly mileage of a vehicle was 4000 miles. I want to see whether the mean yearly mileage has changed after the price change. What is the hypothesis of interest? (A) H0: μ ≠ 4000 against HA: μ = 4000. (B) H0: μ = 4000 against HA: μ ≠ 4000. (C) H0: μ ≤ 4000 against HA: μ > 4000.
18 Visually checking plausibility of the null
19 The null hypothesis is a very specific statement about parameter(s) of the population(s). It is labeled H0. This is the hypothesis that we assess. Only if the null seems unlikely do we reject it. Rejecting the null and accepting the alternative are the same thing. A hypothesis test always checks the validity of the null. In the following Example 5, we are going to ask whether the numbers (observations) in each sample could arise if the null were true. If it seems unlikely, we reject the null and accept the alternative. If it seems plausible, then we do not reject the null (though we do not say the null is true; recall Example 1 with birds flying).
20 Example 5: Benefits of wine? Wine consumption and health. H0: Regular consumption of wine has no effect on polyphenol levels in the blood. HA: Regular consumption of wine increases polyphenol levels in the blood. Of course, different people react in different ways. So we should not focus on the individual, nor should we focus only on the participants who took part in the study. We should focus on the population of interest (young males, say) and in particular the mean change in polyphenol levels over this entire population. Let µ denote the mean change (over the entire population) in polyphenol levels after consuming a small amount of red wine on a regular basis. H0: µ ≤ 0 (mean levels of polyphenols have stayed the same or reduced) vs HA: µ > 0 (mean levels of polyphenols have risen).
21 Whenever we are given the data and the null hypothesis, we must always ask ourselves: could we have obtained that data set if the null hypothesis were true? If it seems that we could, then we cannot reject the null (we cannot say the alternative is true). In the following examples, look at the data and ask yourself whether the data could have been generated if the null were true. Later we will attach probabilities (called p-values) to these notions.
22 White wine: Situation 1. 9 males are given white wine. The difference in polyphenol levels before and after the study is: We plot the changes in the polyphenol levels for each individual on a line (each blue spot corresponds to one observation). We see that there were some negative readings; these are persons who observed a decrease in polyphenol levels. Some readings are positive. But every person is different. The sample mean change is the green vertical line, x̄ = -0.7.
23 H0: µ ≤ 0 (mean levels of polyphenols have stayed the same or reduced) vs HA: µ > 0 (mean levels of polyphenols have risen). The aim of the study is to see whether consumption of white wine increases polyphenol levels. But you find that for these 9 participants the sample mean has dropped. It is clear that for this group we did not see an increase. We could easily have observed such a situation under the scenario that white wine has no effect on polyphenol levels. Though we cannot say the null is true, we cannot make any positive claims about the alternative. Formally: Since x̄ = -0.7 is consistent with the null hypothesis μ ≤ 0, there is no evidence to disprove the null. There is no evidence in the data that polyphenol levels increase with moderate consumption of white wine. In conclusion, we cannot reject the null based on this data set.
24 Red wine: Situation 2. 9 males are given red wine. The difference in polyphenol levels before and after the study is: We plot the change in polyphenol levels on the line. Every participant observed an increase in polyphenol levels; all are over 8.0. The sample mean is x̄ = 9.86. So for this group of participants there is a clear increase.
25 H0: µ ≤ 0 (mean levels of polyphenols have stayed the same or reduced) vs HA: µ > 0 (mean levels of polyphenols have risen). Of course, the increases could just be by chance. The concentration of polyphenol in a person's blood will always change. But it does seem highly unlikely that 9 people will all observe a substantial increase in polyphenol under the scenario that the red wine did not have an effect. This sample does not appear to be a fluke. Formally we say: It seems very unlikely that we could have obtained this data under the null (population mean μ ≤ 0), and it strongly suggests that the alternative is true. The data strongly suggests that drinking red wine increases polyphenol levels.
26 Red wine: Situation 3. The difference in polyphenol levels before and after the study is: 0.06, -0.36, 0.98, 0.82, -0.25, 2.49, -1.34, 1.16, 1.53. We plot the change in polyphenol levels on the line. Some observed an increase, others observed a decrease in polyphenol. The sample mean is x̄ = 0.56.
27 H0: µ ≤ 0 (mean levels of polyphenols have stayed the same or reduced) vs HA: µ > 0 (mean levels of polyphenols have risen). For these participants there is a small overall increase. Could this data have been observed if wine had no influence on polyphenol level (in other words, if the null were true)? A formal statistical test (which we do later) will help us answer this question. Visual conclusion: Unsure.
28 Red wine: Situation 4. The difference in polyphenol levels before and after the study is: -0.43, , 26.11, 4.32, 25.02, 9.40, 11.54, We plot the change in polyphenol levels on the line. Some observed an increase, others observed a decrease in polyphenol. There is a lot of variability in the data. But the majority are positive. The sample mean is x̄ = 6.66.
29 H0: µ ≤ 0 (mean levels of polyphenols have stayed the same or reduced) vs HA: µ > 0 (mean levels of polyphenols have risen). In other words, to prove the alternative we need to show that the data is unlikely to have been observed if the null were true. Looking at changes in polyphenol levels for the participants, do you think they could have been observed when red wine has no influence on polyphenol? A formal statistical test (which we do later) will help us answer this question. Visual conclusion: Unsure. Statistical tests and tools allow us to systematically navigate these different scenarios.
30 Question Time. The hypothesis is H0: µ ≤ 1 vs HA: µ > 1. The green line is the sample mean = 4.8. What is the conclusion of the test? (A) Reject Null (HA is true). (B) Cannot Reject Null (Cannot say HA is true). (C) Do not know.
31 Question Time. The hypothesis is H0: µ = 1 vs HA: µ ≠ 1. The green line is the sample mean = 4.8. What is the conclusion of the test? (A) Reject Null (HA is true). (B) Cannot Reject Null (Cannot say HA is true). (C) Do not know.
32 Question Time. The hypothesis is H0: µ ≤ 4 vs HA: µ > 4. The green line is the sample mean = 2.4. These are the ratings of a product. What is the conclusion of the test? (A) Reject Null (HA is true, I buy). (B) Cannot Reject Null (Cannot say if HA is true, but I won't buy).
33 Question Time. The hypothesis is H0: µ ≥ 1 vs HA: µ < 1. The green line is the sample mean = 4.8. What is the conclusion of the test? (A) Reject Null (HA is true). (B) Cannot Reject Null (Cannot say HA is true, but I won't buy).
34 Question Time. The hypothesis is H0: µ ≤ 4 vs HA: µ > 4. The green line is the sample mean = 5. These are the ratings of a product. What is the conclusion of the test? (A) Reject Null (HA is true, I buy). (B) Cannot Reject Null (Cannot say if HA is true, I won't buy). (C) Do not know.
35 Question Time. The hypothesis is H0: µ ≤ 4 vs HA: µ > 4. The green line is the sample mean = 4.12. These are the ratings of a product. What is the conclusion of the test? (A) Reject Null (HA is true, I buy). (B) Cannot Reject Null (Cannot say if HA is true, I won't buy). (C) Do not know.
36 Discussion. It was pretty clear what the answer should have been for most of the previous questions. But the solution to the last question was unclear. This is where we require statistical tools. These tools will give us the chance of obtaining a sample mean of 4.12 when the population mean rating (amongst everyone who could have rated the product) was 4.0 or less. Interpreting probabilities is very important.
37 Example 6: Does the lady take milk? Recall the tea story in Chapter 1: In the 1930s a lady in Cambridge insisted that tea tasted different depending on whether the milk was poured into the cup before the tea or the tea was poured first and then the milk. Fisher suggested that this could be statistically tested by giving her cups of tea, some made with tea first and others made with milk first, and asking her to identify each cup. The competing hypotheses are: H0: The lady has no idea and just guesses. HA: The lady is able to select the correct cup. They collect the data and find that she identifies all 8 cups of tea correctly. This is the observed information from which we have to draw a statistical conclusion. The chance of her identifying all cups correctly is 1/72 = 1.39% under the scenario that she is guessing (this is the null hypothesis).
38 Motivation 2 (cont). Assessing the probability: If the probability is over a threshold, then the null is deemed plausible and we cannot reject the null. If the probability is below the threshold, then the null is deemed implausible and we reject the null. Typically, the α = 5% significance level is used as the threshold. Since 1/72 = 1.39% is less than 5%, we believe the null is implausible (at the 5% level) and thus reject it (saying that there is evidence to suggest the alternative, that she knows her tea, is true). However, we will never know the truth! If she did the experiment 100 times and was simply guessing, then about 1.39 times out of 100 she would identify all cups correctly. Recall that 5% is the proportion of times we are willing to reject the null when in fact the null is true.
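The last point, that 5% is the proportion of times we reject a true null, can be checked by simulation. The sketch below is only an illustration under assumptions that are not from the slides: the data are drawn from a normal distribution with known standard deviation 1, so a one-sided z-test is used, and the sample size, number of trials and seed are arbitrary choices.

```python
import random
from statistics import NormalDist

def false_rejection_rate(trials=20000, n=30, alpha=0.05, seed=1):
    """Simulate one-sided z-tests of H0: mu <= 0 when mu is really 0.

    Assumes normal data with known sigma = 1 (an illustrative setup).
    Returns the fraction of simulated samples in which H0 is rejected.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)   # boundary of the decision
    standard_error = 1.0 / n ** 0.5            # sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n - 0.0) / standard_error
        if z > z_crit:                         # evidence "for" HA by chance
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(f"Proportion of true nulls rejected: {rate:.3f}")
```

The printed proportion should land near 0.05: even when the null is true, roughly one sample in twenty looks unusual enough to be rejected.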
39 To summarize. In order to prove the alternative we have to calculate how plausible (this is a probability) it is to identify all the cups of tea correctly under the null that she was simply guessing. That is, how likely is one to collect the observed data under the scenario of the null being true? If this probability is small, then it suggests that the null is an implausible scenario. If the null is an implausible scenario, then this implies the alternative is the plausible scenario (we say: there is evidence to suggest the alternative is true).
40 Topic: How to do a hypothesis test. Learning objectives: Evaluating a probability. Understand how to do a one-sided (both left and right) and two-sided test. Be able to connect the p-values of a one-sided test with those of a two-sided test. Be able to construct the correct test based on the summary statistics table. Be able to do the test in Statcrunch and interpret the output. Most tests use a t-distribution, but you should understand that a normal distribution is used when the population standard deviation is known. You should be able to check for normality of the sample mean based on a QQ plot of the data set and using the sampling distribution applet in Statcrunch. This will tell us whether the p-values are correct or not.
41 The underlying principle in a test. A hypothesis test always checks the validity of the null; in other words, could the numbers in front of you arise if the null were true? In a hypothesis test we calculate the probability of observing the data under the scenario that the null is true. Does the data disprove the null? The underlying idea of a hypothesis test is that events with small probabilities are unlikely to happen. If this probability turns out to be small, it suggests that the null assumption made in the calculation is not true and the alternative is a more logical explanation for the data.
42 The underlying principle in a test. Most statistical tests we encounter will be based on the population mean. This may seem very simple, but it will allow us to test a wide range of useful hypotheses. Most calculations will assume that the sample mean is normal; therefore we always need to check this assumption, or else the probability we calculate will be incorrect. In the next few slides we will explain how to calculate these probabilities using one-sided and two-sided tests.
43 One-sided tests
44 Example 1 (one-sided test). A person will only buy a product if they are sure the population mean rating is over 4. These are the hypotheses: H0: µ ≤ 4 vs HA: µ > 4. This is an example of a one-sided test. Here is the data that was collected. The sample mean and sample standard deviation are 5 and 0. The sample size is 31.
45 A person will only buy a product if they are sure the population mean rating is over 4. These are the hypotheses: H0: µ ≤ 4 vs HA: µ > 4. The sample mean and sample standard deviation are 5 and 0. The sample size is 31. Since the sample standard deviation is 0, the sample standard error is 0/√31 = 0. To understand if the null is viable, we evaluate the number of standard errors the sample mean is from the mean under the null. This is the t-transform t = (5 - 4)/0 = ∞. If the null is true, we compare the above with a t-distribution with 30 df.
46 The t-value. The t-value is t = (5 - 4)/0 = ∞.
47 The white region gives possible t-values if the null (µ ≤ 4) were true. Suppose µ = 4; then the t-value is likely to be less than ... Suppose µ < 4; then the t-value is likely to be a lot less than ... For example, if the population mean is 1, then the sample mean will be close to 1, so the t-transform as defined on the previous page will be negative.
48 The t-value from the data is t = (5 - 4)/0 = ∞. The t-value is infinity; it is at the very right of the blue tail. The area to the right of this is called the p-value and is zero. This tells us it is impossible to obtain the sample mean 5 when the population mean is 4. Conclusion: The null is implausible and we reject the null. We do, however, have to be careful. This data is not normal; it is integer valued. This means the sample mean is not normal, so we have to be careful when interpreting a p-value obtained from a t-distribution.
49 Returning to red wine: Situation 3. The difference in polyphenol levels before and after the study is: 0.06, -0.36, 0.98, 0.82, -0.25, 2.49, -1.34, 1.16, 1.53. We plot the change in polyphenol levels on the line. Some observed an increase, others observed a decrease in polyphenol. H0: µ ≤ 0 (mean levels of polyphenols have stayed the same or reduced) vs HA: µ > 0 (mean levels of polyphenols have risen). The sample mean is x̄ = 0.56 and the sample std. dev. is s = 1.14.
50 Reminder: This is a one-sided test. H0: µ ≤ 0 (mean levels of polyphenols have stayed the same or reduced) vs HA: µ > 0 (mean levels of polyphenols have risen). A one-sided test is one where the alternative hypothesis has a greater than or less than sign. Later we consider examples of two-sided tests. The ways we approach these two tests are slightly different.
51 In order to prove the alternative, we have to calculate the likelihood of observing the sample mean under the scenario that the null is true. The null is µ = 0, and the sample mean is estimating zero (red wine exerts no influence). Here we use the CLT: if the data comes from the normal distribution or the sample size is sufficiently large, the average will be close to normal. Thus, if the null is true, the t-ratio t = (sample mean - 0)/(standard error) follows a t-distribution with 8 df. For this data set (x̄ = 0.56, standard error = 1.14/3 = 0.38), t = (0.56 - 0)/(1.14/√9) = 1.47. This is a measure of distance between the sample mean and the population mean under the scenario that the null is true.
52 For this data set the t-ratio is t = (0.56 - 0)/(1.14/√9) = 1.47. The plot shows what the distribution of the t-value will look like if the sample mean is normal and the population mean is 0 (null is true). The chance of a t-value at least this large when the population mean is zero is the p-value = 8.9%.
53 This tells us that there is an 8.9% chance of observing the differences 0.06, -0.36, 0.98, 0.82, -0.25, 2.49, -1.34, 1.16, 1.53 in 9 individuals when, over the entire population of males who consume red wine, there is no mean change. Since 8.9% is relatively large, the null is plausible (but this does not prove the null). We cannot reject the null. There is no evidence in the data to back the claim that red wine consumption increases the mean polyphenol levels. Usually 5% is used as the decision rule. If the p-value is less than 5%, we deem the chance small and reject the null at the 5% level. Since 8.9% > 5%, for this data set we cannot reject the null at the 5% level. Warning: 5% is the proportion of times we will reject the null when it is true.
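The by-hand calculation for Situation 3 can be reproduced in a few lines. This is a minimal sketch using only the Python standard library, not the Statcrunch procedure from the slides; the helper names `t_pdf` and `t_upper_tail` are made up for this illustration, and the tail area is obtained by simple numerical integration of the t density.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=20000):
    """P(T > t), integrating the density on [0, |t|] with Simpson's rule."""
    h = abs(t) / steps                      # steps must be even
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3             # symmetry: P(T > 0) = 0.5
    return upper if t >= 0 else 1 - upper

# Red wine, Situation 3: changes in polyphenol levels for 9 participants.
changes = [0.06, -0.36, 0.98, 0.82, -0.25, 2.49, -1.34, 1.16, 1.53]
n = len(changes)
x_bar = statistics.mean(changes)
s = statistics.stdev(changes)               # sample std. dev. (n - 1 divisor)
standard_error = s / math.sqrt(n)

mu_0 = 0.0                                  # mean under the null
t_value = (x_bar - mu_0) / standard_error
p_value = t_upper_tail(t_value, df=n - 1)   # one-sided: area to the right

print(f"mean = {x_bar:.2f}, t = {t_value:.2f}, p-value = {p_value:.3f}")
```

The t-value agrees with the 1.47 computed above, and the p-value comes out close to the 8.9% quoted on the slide, which is above the 5% decision rule.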
54 Recap: The P-value (for one-sided test). Definition: We want to quantify the proportion of random samples that are at least as unusual as our actual result, if the null hypothesis were true. This quantity is called the p-value. The p-value (for a one-sided test, which this is) is the area greater than the t-value (since the alternative contains a greater than sign). Red wine example: p-value = area greater than t-value = 8.9%. Since 8.9% > 5% we deem this probability large. For this data set, there is no evidence in the data to suggest that regular consumption of red wine increases polyphenol levels.
55 Statcrunch. Load data into Statcrunch. Go to Stat -> T Stat -> One Sample (a drop-down menu). Select column (choose the data set). Perform (choose the hypothesis). Press compute.
56 Understanding p-values from the perspective of rejection regions and boundary of decisions. This part is only necessary to understand what statistical power means.
57 Red wine: One-sided boundary of decision. If the p-value > 5% we cannot reject the null. However, if the p-value < 5% then we reject the null. α = 5% is the boundary of the decision. This means the area on the right (since this is a one-sided test) should be less than 5%. This corresponds to rejecting the null for any t-transform that is larger than 1.86. Remember that the t-transform = number of standard errors the sample mean is from the population mean under the null. If the null is true the t-transform is small, and 1.86 is considered too large (based on the 5% decision rule).
58 1.86 standard errors from the null corresponds to the sample mean X̄ = 1.86 × 0.38 = 0.71. Therefore t-values greater than 1.86 correspond to sample means greater than 0.71. If the null is true, sample means less than the blue bar are plausible (at the 5% level). And we reject the null if the sample mean is greater than 0.71 (over the blue bar).
59 Summary. We reject the null if the sample mean is greater than 1.86 × 0.38 = 0.71. We do not reject the null if the sample mean is less than 0.71. In this example, since the sample mean 0.56 is less than 0.71, we cannot reject the null (it is on the left of the blue bar).
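The boundary of the decision can be written out as a short calculation. This sketch only restates the arithmetic above; the critical value 1.86 (t-distribution with 8 df, 5% in the right tail) is taken from the slides rather than computed, and the variable names are illustrative.

```python
import math

t_crit = 1.86            # 5% one-sided critical value for 8 df (from the slides)
s, n = 1.14, 9           # sample std. dev. and sample size, Situation 3
mu_0 = 0.0               # mean under the null
x_bar = 0.56             # observed sample mean

standard_error = s / math.sqrt(n)            # 1.14 / 3 = 0.38
cutoff = mu_0 + t_crit * standard_error      # boundary of the rejection region

# Reject only if the sample mean lies to the right of the boundary.
reject_null = x_bar > cutoff
print(f"cutoff = {cutoff:.2f}, reject null: {reject_null}")
```

Comparing the sample mean with the cutoff gives the same decision as comparing the t-value 1.47 with 1.86, or the p-value 8.9% with 5%.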
60 Red wine: All 4 situations. Here the data is plotted for each of the situations. The green line is the sample mean. Focus on the spread of each data set.
61 Matching p-values to the red wine examples (Example 5)
62 For each situation superimpose a bell-shaped curve centered at zero (see the next slide). Focus only on the right-hand side of zero, because we are only looking for evidence of the red wine causing an increase in polyphenol levels.
63
64 Recall the hypothesis is H0: µ ≤ 0 (mean levels of polyphenols have stayed the same or reduced) vs HA: µ > 0 (mean levels of polyphenols have risen). Situation 1: Cannot reject null. Situation 2: Reject null. Situation 3: Cannot reject null. Situation 4: Cannot reject null.
65 We test H0: μ ≤ 0 vs HA: μ > 0. For each case we assess the likelihood of observing the data under the null hypothesis μ ≤ 0. Situation 1: p-value = 98%. It is highly likely to see data like this if the wine did not increase polyphenol levels. Conclusion: cannot reject the null. Situation 2: p-value < 0.01%. It is highly unlikely to see data like this if red wine did not increase polyphenol levels. Conclusion: reject null, strong evidence of alternative. Situation 3: p-value = 8.9%. Data like this can be seen when μ ≤ 0. Conclusion: cannot reject null. Situation 4: p-value = 7.67%. Data like this can be seen when μ ≤ 0. Conclusion: cannot reject null.
66 Example: Buying a product. I only buy products if I can be certain that their population mean reviews are over 4.0. H0: µ ≤ 4.0 vs HA: µ > 4.0 (I'll buy).
67 From the summary statistics for the tracker, the sample mean 4.25 is 2.4 standard errors to the right of the null: t = (4.25 - 4.0)/s.e. = 2.4. The p-value is the area to the right of 2.4, which is 0.8%. Since 0.8% < 5% we reject the null. If the reviews are representative of all the people who bought the product, then there is evidence from the sample that the population mean review is greater than 4.
68 Question Time. Below are the summary statistics for the Coffea watch. Suppose we will only buy the watch if we can be sure the mean review is over 4.0. What is the hypothesis of interest, the t-value, the p-value and the conclusion of the test (use the 5% level)? Use the t-values for a t-distribution with 260 df. (A) H0: µ ≤ 4.0, HA: µ > 4.0. The t-value is 2.2; the p-value is between 1% and 2.5%. We reject the null and I buy the product. (B) H0: µ ≤ 4.0, HA: µ > 4.0. The t-value is 0.13; the p-value is greater than 30%. I will not buy the product.
69 Question Time. Entomologists want to understand the number of chirps a minute a cricket makes. They conjecture that it is less than 17 chirps per minute. They collect the data on 15 crickets. The data is summarized above. What is the hypothesis of interest, the t-value, the p-value and the result of the test (use a t-distribution with 14 df)? (A) H0: µ ≥ 16.6, HA: µ < 16.6. The t-value is ... The p-value is more than 15%; we cannot reject the null. We cannot say the population mean is less than 17. (B) H0: µ ≥ 17, HA: µ < 17. The t-value is ... The p-value is more than 15%; we cannot reject the null. We cannot say the population mean is less than 17. (C) H0: µ ≤ 17, HA: µ > 17. The t-value is ... The p-value is more than 50%; we cannot reject the null. We cannot say the population mean is less than 17.
70 Two-sided tests
71 Example: Tomatoes 1. You are in charge of quality control in your food company. You randomly sample fourteen packs of cherry tomatoes, each labeled 224 grams. The average weight x̄ from your fourteen boxes is 226.1g. Obviously, we cannot expect boxes filled with whole tomatoes to all weigh exactly 224 grams. Is the somewhat larger sample mean simply due to chance variation? Or is it evidence that the machine that sorts the cherry tomatoes into packages needs to be recalibrated? The hypothesis: H0: µ = 224g (µ is equal to the value claimed by the produce company). HA: µ ≠ 224g (µ is either larger or smaller than the value claimed).
72 This is a two-sided test. H0: µ = 224 vs HA: µ ≠ 224. It is two-sided because there is a not-equal sign in the alternative hypothesis.
73 H0: µ = 224 vs HA: µ ≠ 224. This is the data superimposed with a normal distribution centered about the mean 224g. We see the data seems to be slightly shifted to the right. We will test if this is statistically significant.
74 H0: µ = 224 vs HA: µ ≠ 224. After collecting the data, the basic prescription is to make a z/t-transform: t = (X̄ - µ0)/s.e., where µ0 is the mean under the null. Here t = (226.1 - 224)/s.e. = 2.07.
75 How unusual is this data, assuming the machine is properly calibrated (null is true)? We calculated that the sample mean is t = 2.07 standard errors from the mean under the null. The area to the right of 2.07 is 2.9%. Samples from a properly calibrated machine that are at least as unusual as this have a t-value that is either greater than 2.07 or less than -2.07. The chance of this is the area to the right of 2.07 plus the area to the left of -2.07, which is 2.9% + 2.9% = 5.8%.
76 Definition: The P-value (for two-sided test). We want to quantify the proportion of random samples that are at least as unusual as our actual result, if the null hypothesis were true. This quantity is called the p-value. The p-value (for a two-sided test, which this is) is 2 × the smallest tail area. Tomato example: p-value = 2 × smallest area = 2 × 2.9% = 5.8%. Since 5.8% > 5% we deem this probability large. For this data set, we cannot reject the null. We will not investigate the tomato packing machine. There is always the possibility that the conclusion is incorrect. Further reading: http://onlinestatbook.com/2/tests_of_means/single_mean.html
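The doubling rule for a two-sided test can be checked numerically for the tomato example. As before, this is a sketch with the Python standard library only; `t_pdf` and `t_upper_tail` are illustrative helper names, not Statcrunch functions, and the tail area is found by numerically integrating the t density with 13 df.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=20000):
    """P(T > t), integrating the density on [0, |t|] with Simpson's rule."""
    h = abs(t) / steps                      # steps must be even
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3             # symmetry: P(T > 0) = 0.5
    return upper if t >= 0 else 1 - upper

t_value = 2.07     # Tomatoes 1: standard errors from the null mean of 224g
df = 14 - 1        # fourteen boxes

right_area = t_upper_tail(t_value, df)      # area to the right of 2.07
left_area = 1 - right_area                  # area to the left of 2.07
p_two_sided = 2 * min(left_area, right_area)

alpha = 0.05
print(f"one tail = {right_area:.3f}, two-sided p-value = {p_two_sided:.3f}")
print("reject null" if p_two_sided < alpha else "cannot reject null")
```

Taking twice the smaller tail works whichever side of the null the sample mean falls on, so the same lines apply to a negative t-value.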
77 Always try to match the calculation you have made with the Statcrunch output for the same problem. We calculated t = (226.1 - 224)/s.e. = 2.07. Using a t-distribution with 13 df, the p-value is between 5% and 10% using tables, or exactly 5.8% with a computer. We cannot reject the null. Statcrunch gives the same result. It is important to map the calculation to the Statcrunch output.
78 Example: Tomatoes 2. You are (again) in charge of quality control in your food company. You randomly sample fourteen packs of cherry tomatoes, each labeled 224 grams. The average weight from your fourteen boxes is 221.7g. Obviously, we cannot expect boxes filled with whole tomatoes to all weigh exactly 224 grams. Is the somewhat smaller weight simply due to chance variation? Or is it evidence that the machine that sorts the cherry tomatoes into packages needs to be recalibrated? The hypothesis: H0: µ = 224g (µ is equal to the value claimed by the produce company). HA: µ ≠ 224g (µ is either larger or smaller than the value claimed).
79 The tomato machine data (2). This is the next data set we observe. From the summary statistics, the sample mean is 221.7g (average of 14 boxes).
80 H0: µ = 224 vs HA: µ ≠ 224. This is the data superimposed with a normal distribution centered about the mean 224g. We see the data seems to be shifted to the left. We will test if this is statistically significant.
81 The basic prescription. H0: µ = 224 vs HA: µ ≠ 224. After collecting the data, the basic prescription is to make a z/t-transform: t = (X̄ - µ0)/s.e., where µ0 is the mean under the null. From the summary statistics: t = (221.7 - 224)/s.e. = -9.3.
82 How unusual is this data assuming the machine is properly calibrated (null is true)? We calculated that the sample mean is t = -9.3 standard errors from the mean under the null. The area to the left of -9.3 is almost 0%. Samples from a properly calibrated machine that are at least as unusual as this have a t-value that is either greater than 9.3 or less than -9.3. The chance of this is the area to the right of 9.3 plus the area to the left of -9.3, which is 2 × 0% = 0% (to the accuracy shown).
83 Since 0% < 5% we deem this probability very small. It is very, very hard to get this type of data under the scenario that the machine is properly calibrated and working. Thus there is strong evidence that the tomato packing machine is not packing correctly and the machine will have to be recalibrated.
84 Connecting two-sided tests and confidence intervals The results of a test at a certain significance level and confidence intervals are closely related. We use the two tomato examples to illustrate the connection. We recall the hypothesis is H0: µ = 224 vs HA: µ ≠ 224 Tomato 1: Summary statistics The 95% confidence interval for the mean is [226.1 ± 2.2] = [223.9, 228.3]
85 H0: µ = 224 vs HA: µ ≠ 224 Tomato 1: The 95% confidence interval for the mean is [226.1 ± 2.2] = [223.9, 228.3] The confidence interval gives plausible values for the mean. Since 224 lies inside the interval, 224 is a plausible mean. We cannot discount the null hypothesis. If the mean under the null is inside the 95% confidence interval, then for a two-sided test the p-value is greater than 5% and we cannot reject the null. Similarly, if the mean under the null is inside a 99% confidence interval for the mean, then the p-value for a two-sided test is greater than 1%.
86 H0: µ = 224 vs HA: µ ≠ 224 Tomato 2: Based on the summary statistics, the 95% confidence interval for the mean is [221.8 ± margin of error] = [221.13, 222.5] The interval tells us where the population mean is likely to lie. 224g is not in this interval. This suggests that 224g is not a plausible mean. Since 224g is not in the 95% confidence interval for the mean, the p-value for the two-sided test is less than 5%. If 224g is not in the 99% confidence interval for the mean, the p-value for the two-sided test will be less than 1%.
87 The above arguments do not hold for one-sided tests. The relationship between one-sided tests and confidence intervals is more complicated and will not be covered in this class.
88 Question Time The Windchill factor in a certain area is measured over a period of 216 days. The summary statistics and the critical values for the t-distribution with 215 degrees of freedom are given below.
89 Linking the different sided tests We recall that for a given data set and population mean we can do three different tests. However, the results of all the tests are closely related. Situation 1: The results for H0: μ ≤ 0 against HA: μ > 0 are given in the output table. Suppose we want to test the hypothesis that red wine decreases polyphenol levels. Then our hypothesis of interest is H0: μ ≥ 0 against HA: μ < 0. The p-value for this test can easily be deduced from the above table. The t-value is the same. The p-value is different. The p-value is the area to the LEFT of -2.45, which is 2%.
90 Testing H0: μ ≥ 0 against HA: μ < 0. Since the p-value 2% < 5% there is some evidence based on this data set that red wine decreases polyphenol levels. If we test H0: μ = 0 against HA: μ ≠ 0, the p-value is 4% and there is evidence to suggest the mean is not zero.
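The link between the three tests can be written down as a small rule. The sketch below is only an illustration (the function name and the "less"/"greater" labels are made up here) and assumes a symmetric reference distribution such as the t or normal: if the t-value points in the direction of the alternative, the one-sided p-value is half the two-sided one, otherwise it is one minus that half.

```python
def one_sided_p(t, two_sided_p, alternative):
    """Deduce a one-sided p-value from the two-sided p-value.

    alternative is "less" for H_A: mu < mu0 or "greater" for H_A: mu > mu0.
    Valid for symmetric reference distributions (t or normal).
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = two_sided_p / 2
    # Does the observed t-value point the same way as the alternative?
    points_same_way = (t < 0) if alternative == "less" else (t > 0)
    return half if points_same_way else 1 - half

# Red wine example from the slides: t = -2.45, two-sided p-value 4%.
p_left = one_sided_p(-2.45, 0.04, "less")      # alternative points LEFT
p_right = one_sided_p(-2.45, 0.04, "greater")  # alternative points RIGHT
print(f"left: {p_left:.2%}, right: {p_right:.2%}")
```

With the slide's numbers this reproduces the 2% p-value for the test pointing left; the test pointing right gets the complementary area.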
91 Question Time Experts conjecture that the waiting time between eruptions of Old Faithful is more than 68 minutes. What is the hypothesis of interest and the p-value (using the above output)? (A) H0: µ ≤ 68, HA: µ > 68; the p-value is 0.05%. Reject the null. (B) H0: µ ≤ 70.9, HA: µ > 70.9; the p-value is 0.1%. Reject the null. (C) H0: µ ≤ 68, HA: µ > 68; the p-value is 0.025%. Reject the null.
92 Question Time Experts conjecture that the waiting time between eruptions of Old Faithful is less than 68 minutes. What is the hypothesis of interest and the p-value (using the above output)? (A) H0: µ ≥ 68, HA: µ < 68; the p-value is 99.95%, reject the null. (B) H0: µ ≥ 70.9, HA: µ < 70.9; the p-value is 0.05%, cannot reject the null. (C) H0: µ ≥ 68, HA: µ < 68; the p-value is 99.75%, cannot reject the null.
93 Question Time (one-sided) Let µ denote the (population) mean level of glucose in an expectant mother. If µ > 140 gestational diabetes is diagnosed. The hypothesis we want to test is H0: µ ≤ 140, HA: µ > 140. 6 blood samples are taken. The results are summarized above. What is the result of the test at the 5% level (use the t-distribution with 5 df)? (A) The t-value is 1.64 and the p-value is between 5-10%. We can reject the null and diagnose diabetes. (B) The t-value is 1.64 and the p-value is between 5-10%. We cannot reject the null. The data does not suggest she has gestational diabetes; she could have got a sample mean of 142 even if she were well.
94 When to use the normal distribution instead of a t-distribution in a statistical test
95 Example: Using the normal distribution: Low Potassium Hypokalemia is diagnosed when the blood potassium level is below 3.5mEq/dl. The potassium in a blood sample varies from sample to sample and follows a normal distribution with unknown mean. However, several years of data means that the standard deviation (the variation between samples) is known to be 0.2. Since the standard deviation is known and not estimated from a sample we use a normal distribution instead of a t-distribution (look back at chapter 6). As we are looking for evidence of low potassium, the hypothesis of interest is H0: μ ≥ 3.5 against HA: μ < 3.5. This is a one-sided test.
96 Example: Using the normal distribution: Low Potassium We test H0: μ ≥ 3.5 against HA: μ < 3.5. A patient has 9 blood samples taken and their sample mean is 3.4. Is there evidence to suggest low potassium (use the 5% significance level)? The standard error is 0.2/√9 = 0.067. Below we plot the distribution of the sample mean if the null were true. [Figure, left: distribution of the sample mean under the null, with the p-value shaded in red.] The p-value is about 6.7%.
97 To calculate the p-value using the z-tables, we make the z-transform, which is identical to a t-transform. We simply use different tables to get the p-value: z = (3.4 − 3.5)/s.e. (s.e. = 0.2/√9) = −0.1/0.067 = −1.5 Looking up the z-tables (remember the standard deviation is known) gives the p-value 6.68%. As this is greater than 5% we cannot reject the null. Despite the person having a sample mean below 3.5, such a sample can be collected when their true mean is 3.5. Thus there is not enough evidence that the person has low potassium. Consequence: We do not subject the person to more medical checks.
98 Example: Gestational diabetes A patient has gestational diabetes if the mean glucose level of the patient is over 140. We are looking for evidence of gestational diabetes. The test is H0: μ ≤ 140 against HA: μ > 140. μ is never known. All we have are the results from a few blood samples. However, it is known that the amount of glucose in blood is normally distributed with known standard deviation σ=4. A patient goes to the doctor. We do not know if she has gestational diabetes (μ is unknown). The glucose level in her blood samples is assumed to be normally distributed with σ=4. After taking 4 blood samples her sample mean is 145. Is there evidence that she has gestational diabetes?
99 Example: Gestational diabetes We want to test H0: μ ≤ 140 against the alternative HA: μ > 140. Based on the data, can we reject the hypothesis that she is healthy? To do this we need to know the variability in the sample mean; this is quantified by the standard error = 4/√4 = 2. Next we have to calculate how far her sample mean is from the mean if she were healthy: z-transform = (145 − 140)/2 = 2.5 (we call it a z-transform rather than a t-transform because we know the standard deviation). Since the alternative is pointing to the right, we need to calculate the probability to the right of 2.5. From the z-tables this is 0.6%. 0.6% is quite small. It says the chance of getting a sample mean of 145 or higher, when the patient does not have gestational diabetes, is 6 in a thousand. Since 0.6% < 5% (it is very small), we reject the null. There is strong evidence from her blood samples that she has gestational diabetes.
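The two known-standard-deviation examples (low potassium and gestational diabetes) follow the same recipe, so they can be checked with one short function. This is a minimal sketch using Python's standard library in place of the z-tables; the function name and the "less"/"greater"/"two-sided" labels are choices made for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alternative):
    """z-transform and p-value when the population sd (sigma) is known.

    alternative: "less", "greater" or "two-sided".
    """
    se = sigma / sqrt(n)              # standard error of the sample mean
    z = (sample_mean - mu0) / se      # distance from the null mean in s.e. units
    area_left = NormalDist().cdf(z)   # standard normal area to the left of z
    if alternative == "less":
        p = area_left
    elif alternative == "greater":
        p = 1 - area_left
    elif alternative == "two-sided":
        p = 2 * min(area_left, 1 - area_left)
    else:
        raise ValueError("unknown alternative")
    return z, p

# Low potassium: H0: mu >= 3.5 vs HA: mu < 3.5, sigma = 0.2, n = 9, mean 3.4
z_k, p_k = z_test(3.4, 3.5, 0.2, 9, "less")
# Gestational diabetes: H0: mu <= 140 vs HA: mu > 140, sigma = 4, n = 4, mean 145
z_g, p_g = z_test(145, 140, 4, 4, "greater")
print(f"potassium: z = {z_k:.2f}, p = {p_k:.4f}")
print(f"glucose:   z = {z_g:.2f}, p = {p_g:.4f}")
```

The direction of the alternative decides which tail area is reported, which is why the same z-transform can give very different p-values.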
100 Question Time Low potassium is diagnosed if the mean level in a person is less than 3.5. The standard deviation of a given blood sample is known to be 0.3 (this means use a normal distribution). The hypothesis of interest is H0: µ ≥ 3.5, HA: µ < 3.5. A person has 4 blood samples taken. The sample mean is 3.0. Is there any evidence they have low potassium (use the 5% level)? (A) The z-value is z = … The p-value is 0.04%; this is so small, there is strong evidence to suggest they have low potassium (reject null). (B) The z-value is z = … The p-value is 0.04%. This is so small, we cannot reject the null. (C) The z-value is z = … The p-value is 4.7%. There is some evidence to reject the null and determine they have low potassium.
101 Choice of level
102 Deciding the conclusion with α A very small P-value indicates that our results probably did not occur when the null hypothesis is true, and therefore H0 is implausible. It should be rejected. In this case we say the evidence is significant. The smaller the P-value the stronger the evidence against H0. The significance level α is the largest P-value for which we are willing to reject the null hypothesis. The value of α is decided before conducting the test. If the P-value is equal to or less than α then we reject H0. This is when we accept Ha as the truth. If the P-value is greater than α then we fail to reject H0. Whatever evidence there is, it is not sufficient to accept Ha. Typically we set α=5%.
103 Comments on the decision rule The objective of a test is to make a decision between the plausibility of two competing hypotheses. The p-value is the probability of observing data at least as unusual as ours under the assumption the null hypothesis is true. If the p-value is less than the significance level (often set at 5%), the decision is to reject the null and go for the alternative instead. If the p-value is greater than 5% then the data is consistent with the null being true and we cannot reject the null. The point is there is a chance we made the wrong decision. We could have wrongly rejected the null when actually the null is true. When the null is true, the chance of this happening is the significance level. In other words, if we set the significance level at 5% and the null is true, there is a 5% chance that the test wrongly rejects it.
104 The value at which we set the significance level determines how willing we are to wrongly reject the null hypothesis. Examples: Suppose we are in a tomato packing plant. Our aim is to ensure that the mean weight of a tomato box is 227g. Every few hours we randomly sample 14 boxes of tomatoes and do a hypothesis test. Each test is done at the 5% level. If we do the test 100 times and the null hypothesis is true, then on average we would falsely reject the null 5 times. Each time we falsely reject the null, it is called a type I error or, in medical terms, a false positive. Suppose we reduce the significance level to 1%; in this case, if the null were true, we would falsely reject the null 1 time out of a hundred.
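The "5 false rejections in 100 tests" statement can be checked by simulation. The sketch below is an illustration under stated assumptions: box weights are taken to be normal with mean 227g (so the null is true) and a hypothetical standard deviation of 5g, which is not given on the slides. Each simulated test is a two-sided t test on 14 boxes, using 2.160 as the 5% critical value for 13 degrees of freedom.

```python
import random
from math import sqrt
from statistics import mean, stdev

T_CRIT_13DF = 2.160  # two-sided 5% critical value for the t-distribution, 13 df

def false_rejection_rate(mu0=227.0, sigma=5.0, n=14, trials=10000, seed=1):
    """Fraction of simulated tests that reject H0: mu = mu0 when H0 is true.

    sigma = 5.0 is an assumed value used only to generate data.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
        if abs(t) > T_CRIT_13DF:   # p-value below 5%: a type I error
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(f"share of false rejections at the 5% level: {rate:.3f}")
```

The printed share should sit near 0.05; repeating the experiment with a stricter critical value would lower it, at the cost of making true departures from 227g harder to detect.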
105 We will show in Chapter 8 that by increasing the significance level (from, say, 5% to 10%) we increase the number of false positives, but we are more likely to detect the alternative (if it is true). Decreasing the significance level will have the opposite effect. The p-value measures the level of evidence against the null. The smaller the p-value, the more evidence against it.
106 The Significance level How to choose the significance level? There is a trade-off between not wanting to falsely reject the null but wanting to detect the alternative. The lower the significance level, the less likely we are to falsely reject the null, but this makes detecting the alternative much harder! Example: Consider the court case H0: Innocent, HA: Guilty. The p-value is the probability of observing the evidence given the null is actually true. Suppose we set the significance level at 5%. Then a person is determined guilty if the p-value is less than 5%. This means 5% of all innocent people who were put on trial will be determined guilty. This is too much! To avoid convicting such a large proportion of innocent people we need to reduce the significance level.
107 If the significance level is put to zero, this means that no one who is innocent is put into jail. However, it also means that all guilty people are free. In other words, no amount of evidence is enough to convict a person. What significance level seems reasonable in this case? 0.01%? The choice of significance level depends on the application. 5% is reasonable for a tomato packing plant (we can afford to check a machine several times), but too large for a conviction.
108 Checking reliability of the p-value
109 How reliable are these p-values? Remember, to calculate the p-values we have used the normal or t-distribution (depending on whether the population standard deviation is known or not). Underlying these calculations is the assumption that the sample mean is normally distributed (remember we always make a plot of the normal distribution and center it about the mean under the null). If the sample size is not large enough, the central limit theorem will not have "kicked in". Then the sample mean won't be normally distributed. This means the probabilities we have calculated won't be reliable, just like the 95% CI for the mean won't really be a 95% confidence interval. In this case we must be cautious in interpreting the results of the test.
110 Nevertheless: If the p-value is extremely small, it would be small even if the correct distribution of the sample mean were used. On the other hand, if the p-value is close to the 5% significance level we need to be careful about its statistical significance (since the correct distribution may mean the true p-value is greater than 5%).
111 Example: Siblings The university is interested in the (population) mean number of younger siblings a student at the university has (in the hope that they will attend the university). They believe that the mean is greater than 0.25. To test this hypothesis, H0: μ ≤ 0.25 against HA: μ > 0.25, they randomly sample 3 students and ask them how many siblings they have; they answer 0, 1, 3. The sample mean is 1.33 and the sample standard deviation is 1.53. Question: What are the conclusions of the test at the 10% level? Comment on the reliability of the result. Answer: The t-transform is t = (1.33 − 0.25)/(1.53/√3) ≈ 1.22. Using the t-tables (with 2 df) we see the area to the right of this value lies somewhere between 15-20%. Since the alternative hypothesis is pointing RIGHT this means the p-value is between 15-20%, which is greater than 10%, so we cannot reject the null. Now we comment on the reliability of this p-value. In HW9, Q1 we made a plot of the sample mean (based on size 3) for younger sibling numbers.
112 The distribution of the sample mean is the lowest plot on the left; this is clearly not normal (see also the corresponding QQ-plot). This means that the p-value is not correct: it is based on normality when the sample mean is not normal. This means we have to be very careful when we interpret this p-value. We recall that if the sample size is larger (in Q2, Quiz 9 we looked at sample size n = 150), then the sample mean is close to normal and the corresponding p-value will be closer to the truth (as if it came from the true distribution of the sample mean).
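The arithmetic in the siblings example can be reproduced directly from the three answers. The sketch below is an illustration only; it uses the fact that the t-distribution with 2 degrees of freedom has the closed-form CDF F(t) = 1/2 + t / (2·√(2 + t²)), so no tables are needed. Because it works from the unrounded mean and standard deviation, the t-value differs slightly from one computed with 1.33 and 1.53.

```python
from math import sqrt
from statistics import mean, stdev

siblings = [0, 1, 3]   # answers from the three sampled students
mu0 = 0.25             # mean under the null

n = len(siblings)
xbar = mean(siblings)                   # sample mean
s = stdev(siblings)                     # sample standard deviation (n - 1 divisor)
t = (xbar - mu0) / (s / sqrt(n))        # t-transform

def t2_cdf(x):
    """CDF of the t-distribution with 2 degrees of freedom (closed form)."""
    return 0.5 + x / (2 * sqrt(2 + x * x))

p_right = 1 - t2_cdf(t)   # alternative points RIGHT: area to the right of t
print(f"mean = {xbar:.2f}, sd = {s:.2f}, t = {t:.2f}, p-value = {p_right:.3f}")
```

The p-value falls in the 15-20% range read from the tables. It is still computed under the normality assumption, so the reliability caveat from the slides applies to it unchanged.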
113 Lab practice Our aim is to make inference about the mean weight of a newborn calf based on the sample mean of 44 calves. We first make a histogram of the data, to see if there are any major deviations from normality.
114 The distribution of weights at birth does not have an obvious skew or thick tail. This means that the distribution of the sample mean based on a sample of 44 will be very close to normal. So we can rest assured that using the t-distribution (since the standard deviation is unknown) will be reliable. Question: Based on the data, is there evidence to suggest the mean weight of calves is greater than 90 pounds? We test H0: μ ≤ 90 against HA: μ > 90.
115 We deduce the p-value in Statcrunch. The p-value is 0.44%. Since 0.44% < 1%, we reject the null at the 1% level (we decide in favour of the alternative). This means there is strong evidence in the data to suggest the mean weight of calves (of that breed) is greater than 90 pounds.
116 Connecting confidence intervals and statistical tests. This is old material and will not be tested or covered.
117 Two-sided tests and confidence intervals There is a close connection between confidence intervals and two-sided tests. Let us return to the one-bedroom apartment in Dallas example. 10 apartments are randomly sampled. The sample mean and the sample standard deviation based on this sample are 980 dollars and 250 dollars (both are estimates based on a sample of size ten). The 95% confidence interval for the mean is [980 ± 178.7] = [801, 1159]. Suppose we want to know whether the price of apartments has changed since last year, when the mean price was 850 dollars. We see that 850 dollars is contained in this interval. This means the mean could be 850 dollars. Therefore, given the sample, it is unclear whether the mean price of apartments is the same as last year or not. We can rewrite the above as a statistical test H0: μ = 850 against HA: μ ≠ 850. The t-transform is t = (980 − 850)/79 = 1.64. Looking at the t-distribution, we see that 1.64 < 2.262 (this is the t-value corresponding to 9 df at 2.5%). Therefore, the p-value is greater than 5%. Thus we cannot reject the null at the 5% level. Further reading: http://onlinestatbook.com/2/logic_of_hypothesis_testing/sign_conf.html
118 Summarizing these two observations we see that: 850 lies inside the 95% confidence interval [801, 1159]. We are unable to reject the null at the 5% level. If the mean under the null lies in the 95% confidence interval, then this implies the corresponding p-value will be greater than 5%. On the other hand, if the mean under the null does not lie in the 95% confidence interval its p-value will be less than 5%. This is easily seen with an illustration (see later slides). If 850 is in an interval centered about 980 (where each side has length 178.7), then 980 must be in the interval centered about 850 with sides of length 178.7. A few slides earlier we showed that this interval [850 ± 178.7] = [671, 1028] corresponds to the points where we make a decision to reject the null or not at the 5% level. In general, if the mean under the null lies in a (1−α)×100% confidence interval, then the p-value for a two-sided test will be greater than α.
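The equivalence between "the null mean lies inside the 95% confidence interval" and "the two-sided test does not reject at the 5% level" can be checked numerically for the Dallas example. This is a minimal sketch; the function names and the extra trial values of the null mean (700, 1100, 1200) are made up for the illustration, and 2.262 is the 2.5% critical value of the t-distribution with 9 degrees of freedom.

```python
from math import sqrt

T_CRIT_9DF = 2.262  # upper 2.5% critical value, t-distribution with 9 df

def ci_95(xbar, s, n, t_crit):
    """95% confidence interval for the mean: xbar +/- t_crit * s.e."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

def rejects_at_5pct(xbar, s, n, mu0, t_crit):
    """Two-sided test of H0: mu = mu0; reject when |t| exceeds t_crit."""
    t = (xbar - mu0) / (s / sqrt(n))
    return abs(t) > t_crit

xbar, s, n = 980.0, 250.0, 10          # Dallas apartment summary statistics
low, high = ci_95(xbar, s, n, T_CRIT_9DF)
print(f"95% CI: [{low:.0f}, {high:.0f}]")

for mu0 in (700, 850, 980, 1100, 1200):
    inside = low <= mu0 <= high
    rejected = rejects_at_5pct(xbar, s, n, mu0, T_CRIT_9DF)
    # Inside the interval exactly when the test does not reject.
    print(f"mu0 = {mu0}: inside CI = {inside}, rejected = {rejected}")
```

Both checks compare the same distance, |980 − µ0|, with the same margin of 2.262 standard errors, which is why they always agree.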
119 Confidence intervals and one-sided tests Consider the polyphenol and red wine example considered in an earlier chapter. 15 randomly sampled men were asked to drink red wine every day for two weeks. Their change in polyphenol levels was measured: 0.7, 3.5, 4.0, 4.9, 5.5, 7.0, 7.4, 8.1, 8.4, 3.2, 0.8, 4.3, -0.2, -0.6, 7.5. The average change is 4.3 and the sample standard deviation is 3.07. Review: Two-sided tests and confidence intervals The 95% confidence interval for the change in polyphenol levels is [2.6, 5.99]. This means if I am testing the hypothesis H0: μ = 0 against the alternative HA: μ ≠ 0, since 0 is not in the interval the p-value is less than (100 − 95)% = 5%. The 99% confidence interval for the change in polyphenol levels is [1.94, 6.66]. This means if I am testing the hypothesis H0: μ = 0 against the alternative HA: μ ≠ 0, since 0 is not in the interval the p-value is less than (100 − 99)% = 1%.
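The two intervals quoted for the polyphenol data can be recomputed from the fifteen measurements. The sketch below is illustrative; the critical values 2.145 (95%) and 2.977 (99%) are the standard t-table entries for 14 degrees of freedom and are the only inputs not taken from the slide.

```python
from math import sqrt
from statistics import mean, stdev

changes = [0.7, 3.5, 4.0, 4.9, 5.5, 7.0, 7.4, 8.1, 8.4,
           3.2, 0.8, 4.3, -0.2, -0.6, 7.5]

n = len(changes)
xbar = mean(changes)
se = stdev(changes) / sqrt(n)   # standard error of the sample mean

def interval(t_crit):
    """Confidence interval xbar +/- t_crit * s.e."""
    return xbar - t_crit * se, xbar + t_crit * se

ci95 = interval(2.145)   # t critical value, 14 df, 95% confidence
ci99 = interval(2.977)   # t critical value, 14 df, 99% confidence
print(f"mean = {xbar:.2f}, s.e. = {se:.3f}")
print(f"95% CI: [{ci95[0]:.2f}, {ci95[1]:.2f}]")
print(f"99% CI: [{ci99[0]:.2f}, {ci99[1]:.2f}]")

# 0 lies below both intervals, so the two-sided p-value is under 1%.
zero_outside_both = ci95[0] > 0 and ci99[0] > 0
```

Up to rounding in the last digit, the output agrees with the intervals on the slide.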
120 One-sided test (pointing RIGHT) Suppose we are testing that polyphenol levels increase. This means testing the hypothesis H0: μ ≤ 0 against the alternative HA: μ > 0. The p-value is the area to the right of the sample mean 4.3 (see that the alternative is pointing to the right). From above we have deduced that in the two-sided test the p-value is less than 5%, so for the one-sided test the p-value is less than 2.5%. Why? Recall the p-value for two-sided tests is the smallest area to the left/right of the t-transform times 2. In this case it is the area to the right of 4.3 times 2. For the two-sided test we have deduced that the p-value is less than 5%; this implies that the area to the RIGHT of 4.3 is less than 5/2 = 2.5%. The p-value for the one-sided test pointing to the RIGHT is the area to the right of 4.3. We have just shown that the area to the right of 4.3 is less than 2.5%. Thus the p-value for the one-sided test pointing to the RIGHT is less than 2.5%.
121 One-sided test (pointing LEFT) Suppose we are testing that polyphenol levels decrease. This means testing the hypothesis H0: μ ≥ 0 against the alternative HA: μ < 0. Since 0 lies below the 95% confidence interval, the p-value is greater than 97.5% (there is no evidence to reject the null, which is clear since the sample mean 4.3 is consistent with the null hypothesis). Why? On the previous slide we showed that the p-value for the hypothesis pointing to the RIGHT is less than 2.5%: the area to the RIGHT of 4.3 is less than 2.5%. The p-value for the test pointing to the LEFT is the area to the LEFT of 4.3, which has to be greater than 97.5% (since the area to the left plus the area to the right is 100%). But this is obvious. The point of a test is to see how plausible the data is under the null. If the sample mean is 4.3 and the null is that the true mean is greater than or equal to 0, this is highly plausible! If this is highly plausible we cannot reject the null.
CERIAS Tech Reort 2010-01 The eriod of the Bell numbers modulo a rime by Peter Montgomery, Sangil Nahm, Samuel Wagstaff Jr Center for Education and Research Information Assurance and Security Purdue University,
More informationTowards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK
Towards understanding the Lorenz curve using the Uniform distribution Chris J. Stehens Newcastle City Council, Newcastle uon Tyne, UK (For the Gini-Lorenz Conference, University of Siena, Italy, May 2005)
More informationEconomics 101. Lecture 7 - Monopoly and Oligopoly
Economics 0 Lecture 7 - Monooly and Oligooly Production Equilibrium After having exlored Walrasian equilibria with roduction in the Robinson Crusoe economy, we will now ste in to a more general setting.
More informationStatistics II Logistic Regression. So far... Two-way repeated measures ANOVA: an example. RM-ANOVA example: the data after log transform
Statistics II Logistic Regression Çağrı Çöltekin Exam date & time: June 21, 10:00 13:00 (The same day/time lanned at the beginning of the semester) University of Groningen, Det of Information Science May
More informationChapter 7 Rational and Irrational Numbers
Chater 7 Rational and Irrational Numbers In this chater we first review the real line model for numbers, as discussed in Chater 2 of seventh grade, by recalling how the integers and then the rational numbers
More informationSection 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples
Objective Section 9.4 Inferences About Two Means (Matched Pairs) Compare of two matched-paired means using two samples from each population. Hypothesis Tests and Confidence Intervals of two dependent means
More informationSupplementary Materials for Robust Estimation of the False Discovery Rate
Sulementary Materials for Robust Estimation of the False Discovery Rate Stan Pounds and Cheng Cheng This sulemental contains roofs regarding theoretical roerties of the roosed method (Section S1), rovides