Inference for the mean of a population. Testing hypotheses about a single mean (the one sample t-test). The sign test for matched pairs

Stat 528 (Autumn 2008)
Inference for the mean of a population (One sample t procedures)
Reading: Section 7.1.
- Inference for the mean of a population.
- The t distribution for a normal population.
- Small sample CI for µ in a normal population.
- Robustness of the t procedures.
- Testing hypotheses about a single mean (the one sample t-test).
- Methods for matched pairs
  - The paired t-test
  - The sign test for matched pairs
- The power of the one sample t-test.

Inference for the mean of a population
So far we have based inference for the population mean on the Z statistic
$Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$.
For large n, Z is approximately N(0,1).
Problem: in practice we do not know the population standard deviation, σ. Instead we use the sample standard deviation, s, as an estimate for σ.

The distribution of t for a normal population
Let $X_1, X_2, \ldots, X_n$ be a SRS from a normal population with population mean µ. Then the standardized variable
$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$
has a t distribution with $n - 1$ degrees of freedom (df).
The impact of estimating σ is to add uncertainty about our standardization. Smaller n leads to fewer degrees of freedom and less certainty.
We say that t has a $t_{n-1}$ distribution.
The quantity $s/\sqrt{n}$ is the (estimated) standard error for the sample mean. It is denoted SE Mean in MINITAB.

Properties of the t distribution
[Figure: probability density curves of the standard normal and of t distributions with 5, 2, and 1 df, plotted for values from -4 to 4.]
- The density curve is symmetric with mean zero and is bell-shaped like the normal distribution.
- The t distribution has heavier tails than the normal distribution (more spread out about zero).
- As the degrees of freedom increase, the tails become thinner and more of the density is concentrated in the center of the distribution.
- $t_\infty$ = standard normal distribution.
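The tail behavior listed above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `t_pdf` and `normal_pdf` are illustrative and not part of the course material. It evaluates the standard t density formula at the center (0) and in the tail (3) for several df.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

# Compare the center and the tail for 1, 2, 5, and a very large number of df.
for x in (0.0, 3.0):
    values = {f"t({df} df)": round(t_pdf(x, df), 4) for df in (1, 2, 5, 1000)}
    values["normal"] = round(normal_pdf(x), 4)
    print(f"x = {x}: {values}")
```

At x = 3 the density decreases as df grows and approaches the normal value, while at x = 0 it increases toward the normal value, which is the "heavier tails, less mass in the center" pattern described on the slide.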

A small sample CI for µ (The normal population case)
For one random sample of normal data, a C = 100(1 − α)% level confidence interval for µ is given by
$\bar{x} \pm t_{n-1,\alpha/2}\, \dfrac{s}{\sqrt{n}}$,
where $t_{n-1,\alpha/2}$ is the critical value of the t distribution with $n - 1$ degrees of freedom.
The $t_{n-1,\alpha/2}$ value is tabulated in Table D.
1. Look at the bottom of the table for the confidence level C of the two-sided interval, OR
2. Look up α/2 as the upper tail probability p.
Recall that the CI for µ comes from a family of hypothesis tests about µ.

Robustness of the t-procedures
What if the population is not normal? Can we still use the t distribution?
Practical guidelines from the textbook:
1. n < 15: Use t procedures if data are close to normal. If data are clearly non-normal or if outliers are present, do not use the t procedure.
2. n ≥ 15: Use t procedures except in presence of strong skewness or outliers.
3. Roughly n ≥ 40: The t procedures are valid even for clearly skewed distributions.
Use plots of the data to help you decide!

Polymerization example
The article "Measuring and understanding the aging of kraft insulating paper in power transformers" contained the following observations on the degree of polymerization for paper specimens for which viscosity times concentration fell in a certain middle range.
418 421 421 422 425 427 431 434 437 439 446 447 448 453 454 463 465
Plots of the data show that a normality assumption for the data is reasonable. (Note that $\bar{x} = 438.29$, $s = 15.14$, $n = 17$.)
Form a 95% confidence interval for the true average degree of polymerization (as did the authors of the article).
Does the interval suggest that 440 is a plausible value for the true average degree of polymerization? What about 450?
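One way to work this example is sketched below with the Python standard library. The critical value 2.120 is an assumption taken from a standard t table (upper 0.025 point for 16 df); the variable names are illustrative only.

```python
import math
import statistics

# Degree of polymerization observations from the slide.
dp = [418, 421, 421, 422, 425, 427, 431, 434, 437, 439,
      446, 447, 448, 453, 454, 463, 465]

n = len(dp)
xbar = statistics.mean(dp)
s = statistics.stdev(dp)        # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)           # estimated standard error of the mean

T_CRIT = 2.120  # assumed table value: t with 16 df, upper tail 0.025

lower = xbar - T_CRIT * se
upper = xbar + T_CRIT * se
print(f"n = {n}, mean = {xbar:.2f}, s = {s:.2f}, SE = {se:.3f}")
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")

# A value is "plausible" at this level if it lies inside the interval.
for mu in (440, 450):
    print(mu, "inside interval:", lower <= mu <= upper)
```

Under these assumptions the interval contains 440 but not 450, so 440 is a plausible value for the true average and 450 is not.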

Testing hypotheses about a single mean: the one sample t test
Data: We assume $x_1, x_2, \ldots, x_n$ is a random sample from a normal population with mean µ.
We state our hypotheses:
$H_0: \mu = \mu_0$, for some constant value $\mu_0$
$H_a: \mu < \mu_0$, $\mu \ne \mu_0$, OR $\mu > \mu_0$
(remember to define what µ is, in words, for your problem).
We calculate the test statistic,
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$.
Under $H_0$, the test statistic follows a $t_{n-1}$ distribution.
Decision: Compare the observed t-statistic to the critical value found in Table D.

Drawing conclusions in the one-sample t-test
For a test of significance at the level α:
- If the observed t-statistic is in the tail, we reject $H_0$ (in favor of $H_a$).
- If the observed t-statistic is not in the tail, we do not reject $H_0$.
Alternatives and tails
- For a two-tailed alternative, reject if $|t| \ge t_{\alpha/2}$.
- For an upper-tailed alternative, reject if $t \ge t_{\alpha}$.
- For a lower-tailed alternative, reject if $t \le -t_{\alpha}$.
As always, write your conclusion(s) in words.
It is important to think about the assumptions that you made to carry out the t-test. Remember that some assumptions can be validated using plots of the data.

Example
The one-sample t statistic from a sample of n = 50 observations for the two-sided test of
$H_0: \mu = 50$ versus $H_a: \mu \ne 50$
has the value t = 1.65.
- What are the degrees of freedom for the test statistic, t?
- Is the value t = 1.65 statistically significant at the 10% level? At the 5% level?
- Locate the two critical values, $t^*$, from Table D that bracket t. What are the right-tail probabilities for these two values?
- How would you report the P-value for this test?
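Table D only brackets the P-value. As a cross-check, the tail area can be approximated numerically. The sketch below is an illustration rather than the course's method: it integrates the t density with Simpson's rule using only the standard library, and the function names `t_pdf` and `t_upper_tail` are hypothetical helpers.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry, half of the probability lies above zero.
    return 0.5 - total * h / 3

n, t_obs = 50, 1.65
df = n - 1                             # degrees of freedom for the test
p_two_sided = 2 * t_upper_tail(abs(t_obs), df)
print(f"df = {df}, two-sided P-value is about {p_two_sided:.3f}")
```

The computed two-sided P-value comes out a little above 0.10, so under this calculation t = 1.65 is not significant at the 10% level and therefore not at the 5% level either.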

Matched pairs (revision and analysis)
Suppose we have two treatments.
In the matched pairs design we try to gain precision in the response by matching pairs of similar individuals.
- Within each pair, we assign the two treatments at random, one to each subject (each subject receives only one treatment).
- Or an individual serves as his/her own partner: the individual receives both treatments.
Each pair of subjects (or each individual) forms its own block.
To analyze the results of this type of experiment, we compare the responses across the pairs (individuals).
We usually take differences, and carry out the statistical inference using the paired t-test.

Football example
Two identical footballs, one air-filled and one helium-filled, were used outdoors on a windless day at The Ohio State University's athletic complex.
The kicker was a novice punter and was not informed which football contained the helium.
Each football was kicked 39 times. The kicker changed footballs after each kick so that his leg would play no favorites if he tired or improved with practice.
(Source: Lafferty, M. B. (1993), "OSU scientists get a kick out of sports controversy," The Columbus Dispatch (21 Nov 1993), B7.)

The data (all distances are in yards)

Trial  Air  Helium    Trial  Air  Helium    Trial  Air  Helium
    1   25      25       14   25      31       27   22      30
    2   23      16       15   34      22       28   31      27
    3   18      25       16   26      29       29   25      33
    4   16      14       17   20      23       30   20      11
    5   35      23       18   22      26       31   27      26
    6   15      29       19   33      35       32   26      32
    7   26      25       20   29      24       33   28      30
    8   24      26       21   31      31       34   32      29
    9   24      22       22   27      34       35   28      30
   10   28      26       23   22      39       36   25      29
   11   25      12       24   29      32       37   31      29
   12   19      28       25   28      14       38   28      30
   13   27      28       26   29      28       39   28      26

A scatterplot
[Figure: scatterplot of the football data.]

The paired t procedure: the setup
Suppose we have pairs of data values $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.
E.g., in our example the pairs of values are the (helium-filled, air-filled) distances for each kick.
Clearly the x and y values are not independent.
Instead, we calculate the differences $d_i = y_i - x_i$, for each $i = 1, \ldots, n$.
We assume $d_1, d_2, \ldots, d_n$ is a random sample from a normal population with mean $\mu_d$ and stdev $\sigma_d$.
- $\mu_d$ is the population mean of the differences between the x and y values.
- $\sigma_d$ is the population stdev of the differences.

The paired t procedure
We want to test:
$H_0: \mu_d = \mu_0$, for some constant value $\mu_0$
$H_a: \mu_d < \mu_0$, $\mu_d \ne \mu_0$, OR $\mu_d > \mu_0$
We compute the test statistic,
$t = \dfrac{\bar{d} - \mu_0}{s_d/\sqrt{n}}$,
where $\bar{d}$ is the sample average of the differences, and $s_d$ is the sample stdev of the differences.
Under $H_0$, the test statistic follows a $t_{n-1}$ distribution.
We make our decision in the same way that we did for the one-sample t-test:
- if the observed t-statistic is in the tail, we reject $H_0$,
- if the observed t-statistic is not in the tail, we do not reject $H_0$.

Identifying the hypotheses
There is a belief that on average a helium-filled ball travels farther than the air-filled ball.
State the appropriate $H_0$ and $H_a$. Be sure to identify the parameters appearing in the hypotheses.

Summary figures
[Figures: summary plots for the football data.]

Performing the test
Carry out a test. Can you reject $H_0$ at the 5% significance level? At the 1% significance level? Write down your conclusion in words.

Variable     N  N*    Mean  SE Mean  StDev
Air-Helium  39   0  -0.462     1.10   6.87

Variable    Minimum     Q1  Median    Q3  Maximum
Air-Helium   -17.00  -4.00   -1.00  2.00    14.00

Provide a 90% confidence interval for the mean difference in the distances (air-filled minus helium-filled).
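The MINITAB summary can be reproduced from the raw distances. The sketch below uses only the Python standard library. It assumes the differences are taken as air minus helium (as in the output above), that the alternative is lower-tailed ($\mu_d < 0$, helium travels farther), and that 1.686 is the upper 0.05 point of t with 38 df from a t table (Table D users would read the nearest smaller df row instead).

```python
import math
import statistics

# Distances in yards, trials 1 to 39, copied from the data slide.
air = [25, 23, 18, 16, 35, 15, 26, 24, 24, 28, 25, 19, 27,
       25, 34, 26, 20, 22, 33, 29, 31, 27, 22, 29, 28, 29,
       22, 31, 25, 20, 27, 26, 28, 32, 28, 25, 31, 28, 28]
helium = [25, 16, 25, 14, 23, 29, 25, 26, 22, 26, 12, 28, 28,
          31, 22, 29, 23, 26, 35, 24, 31, 34, 39, 32, 14, 28,
          30, 27, 33, 11, 26, 32, 30, 29, 30, 29, 29, 30, 26]

d = [a - h for a, h in zip(air, helium)]   # air minus helium
n = len(d)
dbar = statistics.mean(d)
s_d = statistics.stdev(d)                  # sample stdev of the differences
se = s_d / math.sqrt(n)
t_stat = (dbar - 0) / se                   # test of H0: mu_d = 0

T_CRIT_90 = 1.686  # assumed table value: t with 38 df, upper tail 0.05
lower, upper = dbar - T_CRIT_90 * se, dbar + T_CRIT_90 * se

print(f"n = {n}, mean = {dbar:.3f}, StDev = {s_d:.2f}, SE = {se:.2f}")
print(f"t = {t_stat:.2f}")
print(f"90% CI for mu_d: ({lower:.2f}, {upper:.2f})")
```

The t statistic is small in magnitude, far from the lower-tail critical value, and the 90% interval contains zero, so these data give no convincing evidence that the helium-filled ball travels farther.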

Inference for non-normal populations
If the data do not seem to be drawn from a normal population, then the t procedures may not be valid. Three possible strategies:
1. Learn about other probability distributions. For example, there are plenty of skewed distributions (e.g., exponential, gamma, Weibull). Use methods for these distributions instead of the methods for the normal distribution.
2. Transform your data to make it look as normal as possible (recall the ladder of power transformations). It can be hard to interpret the results when using a transformation.
3. Use distribution-free tests. These tests do not assume a particular distribution for the population. Often these tests are based on other parameters of the distribution, such as the median (rather than the mean). These tests can be less powerful in practice.

The sign test for matched pairs
Example of a distribution-free test.
As before, consider pairs of data values: $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.
We will test
$H_0$: population median of differences = 0, versus
$H_a$: population median of differences ≠ 0.
Let $d_i = y_i - x_i$ ($i = 1, \ldots, n$) be the differences.
Exclude the differences that are zero. Let X denote the count out of the remaining m differences that are positive.
Then under $H_0$, X is Binomial(m, 0.5). (If the median is zero, then each nonzero difference is equally likely to fall above or below zero.)
If x is the observed X value, then the P-value is $2P(X \le x)$ or $2P(X \ge x)$, using whichever tail the observed count falls in.

The sign test for matched pairs (cont.)
For the football example:
Out of n = 39 differences, m = 37 differences are nonzero. Thus under $H_0$, X is Binomial(37, 0.5).
Out of the 37, we observe 17 that are above zero.
P-value = $2P(X \le 17) = 2 \times 0.3714 = 0.7428$.
No evidence to reject $H_0$.
See the textbook for the one-sided test.
Note: If the population of differences is normally (or approximately normally) distributed, then this test will be less powerful at detecting differences than the paired t-test.
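The binomial tail probability in this calculation can be computed exactly. The following is a minimal sketch using the standard library; the function name `sign_test_two_sided` is illustrative and not from the course materials.

```python
from math import comb

def sign_test_two_sided(num_positive, m):
    """Two-sided sign-test P-value, with X ~ Binomial(m, 0.5) under H0.

    num_positive: number of positive differences
    m: number of nonzero differences
    """
    # Use the tail the observed count falls in (the smaller side).
    k = min(num_positive, m - num_positive)
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return min(1.0, 2 * tail)

# Football example: 17 positive differences out of 37 nonzero ones.
p_value = sign_test_two_sided(17, 37)
print(round(p_value, 4))   # agrees with the 0.7428 reported on the slide
```

Because the Binomial(m, 0.5) distribution is symmetric, doubling the smaller tail gives the same answer whichever sign is counted as "positive".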

The power of the one sample t-test
The power calculation for the one sample t-test is similar to the power calculation for the z-test. But the math is much harder! Instead we use MINITAB.
Stat > Power and Sample Size > 1-Sample t.
Under Options, select the Alternative Hypothesis and Significance Level.
Then enter any two of the following three items:
1. Sample sizes:
2. Differences:
3. Power values:
Enter the Standard deviation (the sample stdev in this case) and click OK.

A value for σ
There are four main ways to obtain a value for σ.
- Literature search. Use historical data from similar studies.
- Pilot study. Use the results of a pilot study. The estimate of σ will often need to be adjusted.
- Elicit σ. Two useful methods are the Range/4 method and the Range/6 method.
- Construct a value for σ. Some probability models yield a value for σ (e.g., for a Bernoulli RV, $\sigma = \sqrt{p(1 - p)}$).
Be conservative. Use several methods and consider a slightly larger value of σ than these methods suggest.

An agricultural field trial example
An agricultural field trial compares the yield of two varieties of tomatoes for commercial use. The researchers divide in half each of 10 small plots of land and plant each tomato variety on one half of each plot. After harvest, they compare the yields in pounds per plant at each location. The ten differences (Variety A − Variety B) give the following statistics: $\bar{x} = 0.46$ and $s = 0.92$.
Is there convincing evidence that Variety A has the higher mean yield?
Let $\mu_d$ denote the population mean of the difference in the yields.
We test: $H_0: \mu_d = 0$ versus $H_a: \mu_d > 0$.
The MINITAB output for the paired t test is:

One-Sample T
Test of mu = 0 vs > 0

                                   95% Lower
 N      Mean     StDev   SE Mean       Bound     T      P
10  0.460000  0.920000  0.290930   -0.073307  1.58  0.074
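The entries in this output follow directly from the summary statistics. The sketch below recomputes them with the Python standard library; the critical value 1.833 is an assumed table value (upper 0.05 point of t with 9 df), and the variable names are illustrative.

```python
import math

n, dbar, s_d = 10, 0.46, 0.92        # summary statistics from the slide

se = s_d / math.sqrt(n)              # SE Mean
t_stat = (dbar - 0) / se             # test of H0: mu_d = 0

T_CRIT = 1.833  # assumed table value: t with 9 df, upper tail 0.05

lower_bound = dbar - T_CRIT * se     # one-sided 95% lower confidence bound
reject_at_5_percent = t_stat >= T_CRIT

print(f"SE Mean = {se:.6f}")
print(f"T = {t_stat:.2f}")
print(f"95% lower bound = {lower_bound:.4f}")
print("Reject H0 at the 5% level:", reject_at_5_percent)
```

The observed statistic falls short of the critical value, which matches the reported P-value of 0.074 being above 0.05, and the lower bound being below zero tells the same story.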

Agricultural trial (cont.)
The tomato experts who carried out the field trial suspect that the relative lack of significance is due to low power. They would like to detect a mean difference in yields of 0.6 pounds per plant at the 0.05 significance level. Based on the previous study, use 0.92 as an estimate of the population σ.
What is the power of the test with n = 12 against the alternative of µ = 0.6?
If the sample size is increased to n = 30 plots of land, what will be the power against the same alternative?
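The exact answer comes from MINITAB's 1-Sample t power routine. As a rough check only, the sketch below uses the z-test power formula as an approximation. This is an assumption: it treats σ = 0.92 as known, so it will somewhat overstate the power of the t-test, especially for small n. The function name `approx_power_upper` is hypothetical.

```python
import math
from statistics import NormalDist

def approx_power_upper(delta, sigma, n, alpha=0.05):
    """Approximate power of an upper-tailed test of H0: mu = 0
    against mu = delta, using the normal (z-test) formula."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)              # upper alpha critical value
    shift = delta * math.sqrt(n) / sigma       # mean of Z under the alternative
    return 1 - z.cdf(z_crit - shift)

for n in (12, 30):
    power = approx_power_upper(delta=0.6, sigma=0.92, n=n)
    print(f"n = {n}: approximate power = {power:.2f}")
```

The approximation shows the expected pattern: power rises substantially when the sample size grows from 12 to 30. The MINITAB values should be used for the actual answers.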