Study Guide for Sample Size and Power

Learning Model Principles

Learners and teachers at BYU-Idaho...
1. Exercise faith in the Lord Jesus Christ as a principle of action and power
2. Understand that true teaching is done by and with the Holy Ghost
3. Lay hold upon the word of God as found in the holy scriptures and in the words of the Prophets in all disciplines
4. Act for themselves and accept responsibility for learning and teaching
5. Love, serve, and teach one another

Prepare

When doing a significance test, there are four possible general outcomes of the test, summarized in the table below.

                                   Truth about the Population
Decision Based on Sample      H0 True                     Ha True
Reject H0                     Type I Error, α             Correct Decision, 1 - β
Fail to Reject H0             Correct Decision, 1 - α     Type II Error, β

The probabilities of Type I and Type II errors (α and β) were covered earlier in the course. The probability of a Type I error is the probability of rejecting H0 when H0 is true. The probability of a Type II error is the probability of not rejecting H0 when Ha is true. With everything else held equal, as the probability of one type of error decreases, the probability of the other increases.

Referring to the table above, the probability of rejecting H0 when Ha is true is 1 - β, which is the complement of the probability of a Type II error. The probability 1 - β is referred to as the power of the test. The higher the power of the test, the more likely the significance test will reject H0 when Ha is true. Industry standards typically require that the power of the test be at least 0.8.

To calculate the power of a test, four components are required:
1) The level of significance, α (typically between 0.01 and 0.1; this course has mostly used 0.05)
2) The sample size, n
3) The effect size: how large a difference between the null and alternative hypotheses is important in practice
4) The standard deviation, σ (this is more difficult to change in practice)
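For reference, the four components above combine as follows in the simplest case: a one-sided z test of H0: µ = µ0 against Ha: µ > µ0 with known σ. This is a standard textbook formula offered as a sketch under those assumptions, not necessarily the exact calculation the course applet performs. The symbols δ (the effect size), z_α (the upper-α critical value of the standard Normal distribution), and Φ (its cumulative distribution function) are introduced here for illustration.

```latex
\text{power} = 1 - \beta
  = P\!\left(\bar{x} > \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}} \;\middle|\; \mu = \mu_0 + \delta\right)
  = \Phi\!\left(\frac{\delta \sqrt{n}}{\sigma} - z_\alpha\right)
```

Read this way, a larger α (which gives a smaller z_α), a larger n, a larger δ, or a smaller σ each makes the argument of Φ larger and so increases the power.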

For this module, you will not be required to calculate the power of a test mathematically from the components above. Instead, an applet will be used to calculate the power of a one-sample test and to show how the power changes as each of the four components changes. The web address for the applet is the following:

http://bcs.whfreeman.com/ips5e/

Under Student Tools, click on Statistical Applets. On the next web page, scroll down and click where it says Power of a Test. The first three paragraphs introduce how the applet works. Please read these before proceeding with the applet.

Let's use the example below to illustrate how to calculate the power of a test.

Example

Tasters rated the sweetness of new colas on a 10-point scale before and after storage, so that we have each taster's judgment of loss of sweetness. From experience, we know that sweetness loss scores vary from taster to taster according to a Normal distribution with a standard deviation σ = 1.5. To see if the taste test gives reason to think that the cola does lose sweetness, we will carry out a one-sided test of hypotheses:

H0: µ = 0
Ha: µ > 0

We will first use a level of significance of α = 0.05 and a sample size of 10 tasters. Let's assume that a mean sweetness loss of 0.8 on the 10-point scale will be noticed by consumers, so that will be the effect size [1]. Enter this information into the applet.
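The applet's reading for these settings can be cross-checked with a short calculation. The following is a minimal sketch, assuming a one-sided, one-sample z test with known σ (an assumption about how the applet works, not something this guide confirms). The function name z_test_power is hypothetical and introduced only for illustration; the calculation uses NormalDist from Python's standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, sigma, n, alpha):
    """Approximate power of a one-sided, one-sample z test with known sigma.

    effect_size is the distance between the null and alternative means.
    """
    std_normal = NormalDist()                  # standard Normal: mean 0, sd 1
    z_alpha = std_normal.inv_cdf(1 - alpha)    # critical value for level alpha
    shift = effect_size * sqrt(n) / sigma      # effect size in standard-error units
    return std_normal.cdf(shift - z_alpha)     # P(reject H0 when Ha is true)


# Cola example: sigma = 1.5, n = 10 tasters, alpha = 0.05, effect size = 0.8
power = z_test_power(effect_size=0.8, sigma=1.5, n=10, alpha=0.05)
print(f"power = {power:.3f}")
```

The printed value may differ slightly from the applet's display in the third decimal place, since the applet's internal rounding is not documented here.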

The red-orange area under the second curve represents the power of the test, which is 0.512. This is less than the typical industry standard of 0.8. There are a few things one can do to increase the power of the test. Let's first increase the level of significance from α = 0.05 to 0.10.

When the level of significance is increased to 0.10, the power of the test increases to 0.655. Next, let's return the level of significance to 0.05 and increase the effect size from 0.8 to 1.5 by changing alt µ to 1.5. The power is now 0.933, which is above 0.8. This method of increasing power only works if a difference of 1.5 is important in practice. If feasible, the most effective way to increase the power of the test is to increase the sample size. If we keep all other factors at their original values and increase the sample size to 25, the power of the test is 0.844, which would be sufficient for the statistical analysis (see figure below).
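The three adjustments described above (raising α, raising the effect size, raising n) can be compared side by side, and the smallest adequate sample size can be found by a simple search. This is a minimal sketch under the same assumption of a one-sided z test with known σ; the names z_test_power and min_sample_size are hypothetical helpers, not part of the applet or the course materials. The computed powers should land within about 0.01 of the applet readings quoted above (0.655, 0.933, 0.844).

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, sigma, n, alpha):
    """Approximate power of a one-sided, one-sample z test with known sigma."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)    # critical value for level alpha
    shift = effect_size * sqrt(n) / sigma      # effect size in standard-error units
    return std_normal.cdf(shift - z_alpha)


def min_sample_size(effect_size, sigma, alpha, target_power=0.8, n_max=10_000):
    """Smallest n whose power reaches target_power, found by trying n = 1, 2, ..."""
    for n in range(1, n_max + 1):
        if z_test_power(effect_size, sigma, n, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within n_max observations")


# Cola example settings, changing one component at a time from the baseline.
baseline = dict(effect_size=0.8, sigma=1.5, n=10, alpha=0.05)
scenarios = {
    "baseline": baseline,
    "alpha raised to 0.10": {**baseline, "alpha": 0.10},
    "effect size 1.5": {**baseline, "effect_size": 1.5},
    "n raised to 25": {**baseline, "n": 25},
}

for label, settings in scenarios.items():
    print(f"{label:<22} power = {z_test_power(**settings):.3f}")

# Smallest number of tasters giving power of at least 0.8 at the baseline settings.
n_needed = min_sample_size(effect_size=0.8, sigma=1.5, alpha=0.05)
print("smallest n with power >= 0.8:", n_needed)
```

The search is deliberately brute force: power increases with n, so the first n that reaches the target is the minimum, and the loop avoids any rounding questions that come with solving for n algebraically.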

Investigation

Do the following (use the applet to help you through the problems).
1) Explain what happens to the power of a test when the level of significance is decreased.
2) Explain what happens to the probability of a Type II error when the level of significance is decreased.
3) Explain what happens to the power of a test when the effect size is decreased.
4) Explain what happens to the power of the test when the sample size is increased.
5) What is the most effective component to change when increasing the power of the test? Please explain.

Ponder/Prove

Please complete the following two exercises.

1) Emissions of sulfur dioxide by industry set off chemical changes in the atmosphere that result in acid rain. The acidity (or basicity) of liquids is measured by pH on a scale of 0 to 14. Distilled water has pH 7.0, and lower pH indicates acidity. Typical rain is somewhat acidic, so acid rain is defined as rainfall with a pH below 5.0. Suppose that pH measurements of rainfall on different days in a Canadian forest follow a Normal distribution with standard deviation σ = 0.5. You plan to test the hypotheses

H0: µ = 5
Ha: µ < 5

at the level of significance of 0.05. You will want to use a test that will almost always reject H0 when the true mean pH is 4.7 (meaning the effect size is 0.3). Use the applet to find the power against the alternative µ = 4.7 for sample sizes n = 5, n = 15, and n = 40. What happens to the power as the sample size increases? Which of these sample sizes are adequate for use in this setting? What would be the minimum required sample size to reach a power of the test equal to 0.8 [1]?

2) A study was done to determine the efficacy of asking people to add fiber, exercise, and stress management to their lifestyle. There were several measurements to determine the effect of changing their lifestyle. One of those is VO2 max. The mean VO2 max for those who exercise fewer than three days a week is 43.0 ml/kg/min. VO2 max is also known to be Normally distributed with σ = 9. You would like to see if changing people's lifestyle would increase their VO2 max. For your analysis, you would like to have an effect size of 5.0 ml/kg/min and a level of significance of α = 0.05. What would be the minimum required sample size to reach a power of the test equal to 0.8?

References

[1] Baldi, B., & Moore, D. (2009). The Practice of Statistics in the Life Sciences (2nd ed.). W. H. Freeman and Company.
[2] Masley, S. C., Weaver, W., Peri, G., & Phillips, S. E. (2008). Efficacy of Lifestyle Changes in Modifying Practical Markers of Wellness and Aging. Alternative Therapies, 14(2), 24-29.