
http://www.math.uah.edu/stat/hypothesis/.xhtml (7/29/2009) Virtual Laboratories > 9. Hypothesis Testing

1. The Basic Statistical Model

As usual, our starting point is a random experiment with an underlying sample space and a probability measure P. In the basic statistical model, we have an observable random variable X taking values in a set S. In general, X can have quite a complicated structure. For example, if the experiment is to sample n objects from a population and record various measurements of interest, then X = (X1, X2, ..., Xn), where Xi is the vector of measurements for the i-th object. The most important special case occurs when X1, X2, ..., Xn are independent and identically distributed; in this case, we have a random sample of size n from the common distribution.

The purpose of this section is to define and discuss the basic concepts of statistical hypothesis testing. Collectively, these concepts are sometimes referred to as the Neyman-Pearson framework, in honor of Jerzy Neyman and Egon Pearson, who first formalized them.

General Hypothesis Tests

A statistical hypothesis is a statement about the distribution of the data variable X. Equivalently, a statistical hypothesis specifies a set of possible distributions of X (namely, the set of distributions for which the statement is true). In hypothesis testing, the goal is to see if there is sufficient statistical evidence to reject a presumed null hypothesis in favor of a conjectured alternative hypothesis. The null hypothesis is usually denoted H0, while the alternative hypothesis is usually denoted H1. A hypothesis that specifies a single distribution for X is called simple; a hypothesis that specifies more than one distribution for X is called composite. A hypothesis test is a statistical decision; the conclusion will either be to reject the null hypothesis in favor of the alternative, or to fail to reject the null hypothesis.
The decision that we make must, of course, be based on the data vector X. Thus, we will find a subset R of the sample space S and reject H0 if and only if X ∈ R. The set R is known as the rejection region or the critical region. Note the asymmetry between the null and alternative hypotheses. This asymmetry is due to the fact that we assume the null hypothesis, in a sense, and then see if there is sufficient evidence in X to overturn this assumption in favor of the alternative. Often, the critical region is defined in terms of a statistic W(X), known as a test statistic. As usual, the use of a statistic allows data reduction when the dimension of the statistic is much smaller than the dimension of the data vector.
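As a concrete illustration of a test statistic and its rejection region (my own minimal sketch, not from the text), consider testing H0: μ = 0 against H1: μ ≠ 0 for a normal sample with known σ = 1. The statistic W(X) reduces the n-dimensional data vector to a single number, and the critical region is defined through it:

```python
from statistics import NormalDist

def z_statistic(sample, mu0=0.0, sigma=1.0):
    """Test statistic W(X) = sqrt(n) * (mean(X) - mu0) / sigma."""
    n = len(sample)
    mean = sum(sample) / n
    return (n ** 0.5) * (mean - mu0) / sigma

def reject(sample, alpha=0.05, mu0=0.0, sigma=1.0):
    """Rejection region R = {x : |W(x)| > z_{1 - alpha/2}}."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z_statistic(sample, mu0, sigma)) > z_crit

print(reject([0.1, -0.2, 0.3, 0.0]))   # data close to mu0 = 0: False
print(reject([2.9, 3.1, 3.0, 2.8]))    # data far from mu0 = 0: True
```

Note that the decision depends on the data only through the one-dimensional statistic, exactly the data reduction described above.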

Errors

The ultimate decision may be correct or may be in error. There are two types of errors, depending on which of the hypotheses is actually true:

1. A type 1 error is rejecting the null hypothesis when it is true.
2. A type 2 error is failing to reject the null hypothesis when it is false.

Similarly, there are two ways to make a correct decision: we could reject the null hypothesis when it is false, or we could fail to reject the null hypothesis when it is true. The possibilities are summarized in the following table:

State \ Decision | Fail to reject H0 | Reject H0
H0 True          | Correct           | Type 1 error
H0 False         | Type 2 error      | Correct

If H0 is true (that is, the distribution of X is specified by H0), then P(X ∈ R) is the probability of a type 1 error for this distribution. If H0 is composite, then H0 specifies a variety of different distributions for X, and thus there is a set of type 1 error probabilities. The maximum probability of a type 1 error is known as the significance level of the test, or the size of the critical region, which we will denote by α. Usually, the rejection region is constructed so that the significance level is a prescribed, small value (typically 0.1, 0.05, or 0.01).

If H1 is true (that is, the distribution of X is specified by H1), then P(X ∉ R) is the probability of a type 2 error for this distribution. Again, if H1 is composite, then H1 specifies a variety of different distributions for X, and thus there will be a set of type 2 error probabilities.

Generally, there is a tradeoff between the type 1 and type 2 error probabilities. If we reduce the probability of a type 1 error by making the rejection region R smaller, we necessarily increase the probability of a type 2 error, because the complementary region S \ R is larger.
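The tradeoff between the two error probabilities can be seen directly by simulation. The sketch below (my own illustration; the scenario, a two-sided z-test with n = 10, σ = 1, and alternative mean μ = 1, is an assumption) estimates both error rates at several significance levels:

```python
import random
from statistics import NormalDist

def error_rates(alpha, n=10, mu_alt=1.0, trials=10000, seed=1):
    """Monte Carlo estimates of the type 1 and type 2 error probabilities
    for the two-sided z-test of H0: mu = 0 vs H1: mu != 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    type1 = type2 = 0
    for _ in range(trials):
        # Sample under H0 (mu = 0): a type 1 error occurs if we reject.
        x0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(sum(x0) / n) * n ** 0.5 > z_crit:
            type1 += 1
        # Sample under H1 (mu = mu_alt): a type 2 error occurs if we fail to reject.
        x1 = [rng.gauss(mu_alt, 1.0) for _ in range(n)]
        if abs(sum(x1) / n) * n ** 0.5 <= z_crit:
            type2 += 1
    return type1 / trials, type2 / trials

# Shrinking the rejection region (smaller alpha) lowers the type 1 rate
# but raises the type 2 rate:
for alpha in (0.10, 0.05, 0.01):
    t1, t2 = error_rates(alpha)
    print(f"alpha={alpha:.2f}  type1~{t1:.3f}  type2~{t2:.3f}")
```

The estimated type 1 rate tracks α, while the type 2 rate grows as α shrinks, exactly the tradeoff described above.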
Power

If H1 is true (that is, the distribution of X is specified by H1), then P(X ∈ R), the probability of rejecting H0 (and thus making a correct decision), is known as the power of the test for that distribution. Suppose that we have two tests, corresponding to rejection regions R1 and R2, respectively, each having significance level α. The test with region R1 is uniformly more powerful than the test with region R2 if P(X ∈ R1) ≥ P(X ∈ R2) for any distribution of X specified by H1.
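For the two-sided z-test, the power at each alternative can be computed exactly rather than simulated. The sketch below (my own illustration under the assumed setting H0: μ = 0, known σ, n = 10) uses the fact that under mean μ, the statistic Z = √n·mean(X)/σ is Normal(√n·μ/σ, 1):

```python
from statistics import NormalDist

def power(mu, n=10, sigma=1.0, alpha=0.05):
    """Exact power of the two-sided z-test of H0: mu = 0 at the alternative mu."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    shift = (n ** 0.5) * mu / sigma   # mean of Z under the alternative
    std = NormalDist()
    # P(|Z| > z_crit) where Z ~ Normal(shift, 1)
    return std.cdf(-z_crit - shift) + 1 - std.cdf(z_crit - shift)

print(f"power at mu=0.0: {power(0.0):.4f}")   # equals alpha at the null value
print(f"power at mu=0.5: {power(0.5):.4f}")
print(f"power at mu=1.0: {power(1.0):.4f}")   # grows as mu moves away from 0
```

Evaluating this function over a grid of alternatives traces out the power curve of the test.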

Naturally, in this case, we would prefer the first test. Often, however, two tests will not be uniformly ordered; one test will be more powerful for some distributions specified by H1, while the other test will be more powerful for other distributions specified by H1. Finally, if a test has significance level α and is uniformly more powerful than any other test with significance level α, then the test is said to be a uniformly most powerful test at level α. Clearly, such a test is the best we can do.

P-value

In most cases, we have a general procedure that allows us to construct a test (that is, a rejection region R_α) for any given significance level α ∈ (0, 1). Typically, R_α decreases (in the subset sense) as α decreases. In this context, the P-value of the data variable X, denoted P(X), is defined to be the smallest α for which X ∈ R_α; that is, the smallest significance level for which H0 is rejected, given X. Knowing P(X) allows us to test H0 at any significance level, for the given data: if P(X) ≤ α, then we reject H0 at significance level α; if P(X) > α, then we fail to reject H0 at significance level α. Note that P(X) is a statistic.

Tests of an Unknown Parameter

Hypothesis testing is a very general concept, but an important special class occurs when the distribution of the data variable X depends on a parameter θ taking values in a parameter space Θ. The parameter may be vector-valued, so that θ = (θ1, θ2, ..., θk) and Θ ⊆ ℝ^k for some k. The hypotheses generally take the form

H0: θ ∈ Θ0 versus H1: θ ∉ Θ0

where Θ0 is a prescribed subset of the parameter space Θ. In this setting, the probabilities of making an error or a correct decision depend on the true value of θ. If R is the rejection region, then the power function is given by

Q(θ) = P_θ(X ∈ R), θ ∈ Θ

1. Show that Q(θ) is the probability of a type 1 error when θ ∈ Θ0, and that max{Q(θ) : θ ∈ Θ0} is the significance level of the test.
2. Show that 1 − Q(θ) is the probability of a type 2 error when θ ∉ Θ0, and that Q(θ) is the power of the test when θ ∉ Θ0.

Suppose that we have two tests, corresponding to rejection regions R1 and R2, respectively, each having significance level α. The test with rejection region R1 is uniformly more powerful than the test with rejection region R2 if Q1(θ) ≥ Q2(θ) for all θ ∉ Θ0.

Most hypothesis tests of an unknown real parameter θ fall into three special cases:

1. H0: θ = θ0 versus H1: θ ≠ θ0
2. H0: θ ≥ θ0 versus H1: θ < θ0
3. H0: θ ≤ θ0 versus H1: θ > θ0

where θ0 is a specified value. Case 1 is known as the two-sided test; case 2 is known as the left-tailed test; and case 3 is known as the right-tailed test (named after the conjectured alternative). There may be other unknown parameters besides θ (known as nuisance parameters).

Equivalence Between Hypothesis Tests and Confidence Sets

There is an equivalence between hypothesis tests and confidence sets for a parameter θ.

3. Suppose that C(X) is a 1 − α level confidence set for θ. Show that the test below has significance level α for the hypothesis H0: θ = θ0 versus H1: θ ≠ θ0: reject H0 if and only if θ0 ∉ C(X). Equivalently, we fail to reject H0 at significance level α if and only if θ0 is in the corresponding 1 − α level confidence set.

4. In particular, show that this equivalence applies to interval estimates of a real parameter θ and the common tests for θ. In each case below, the confidence interval has confidence level 1 − α and the test has significance level α.

a. Suppose that (L(X), U(X)) is a two-sided confidence interval for θ. Reject H0: θ = θ0 versus H1: θ ≠ θ0 if and only if θ0 ≤ L(X) or θ0 ≥ U(X).
b. Suppose that L(X) is a confidence lower bound for θ. Reject H0: θ ≤ θ0 versus H1: θ > θ0 if and only if θ0 < L(X).
c. Suppose that U(X) is a confidence upper bound for θ. Reject H0: θ ≥ θ0 versus H1: θ < θ0 if and only if θ0 > U(X).

Pivot Variables and Test Statistics

Recall that confidence sets of an unknown parameter θ are often constructed through a pivot variable, that is, a random variable W(X, θ) that depends on the data vector X and the parameter θ, but whose distribution does not depend on θ. In this case, a natural test statistic is W(X, θ0).
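The duality in exercise 3 can be checked numerically. The sketch below (my own illustration; the setting, a normal mean with known σ = 1, is an assumption) builds the 1 − α two-sided confidence interval from the pivot √n·(mean(X) − θ)/σ and verifies that the level-α two-sided z-test rejects θ0 exactly when θ0 falls outside the interval:

```python
from statistics import NormalDist

def confidence_interval(sample, sigma=1.0, alpha=0.05):
    """Two-sided 1 - alpha confidence interval (L(X), U(X)) for the mean."""
    n = len(sample)
    mean = sum(sample) / n
    half = NormalDist().inv_cdf(1 - alpha / 2) * sigma / n ** 0.5
    return mean - half, mean + half

def reject_two_sided(sample, theta0, sigma=1.0, alpha=0.05):
    """Level-alpha z-test of H0: theta = theta0 vs H1: theta != theta0."""
    n = len(sample)
    z = (n ** 0.5) * (sum(sample) / n - theta0) / sigma
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

sample = [0.8, 1.2, 0.5, 1.6, 0.9]
low, high = confidence_interval(sample)
for theta0 in (0.0, 1.0, 2.0):
    in_ci = low < theta0 < high
    # Reject H0: theta = theta0 exactly when theta0 lies outside the CI.
    assert reject_two_sided(sample, theta0) == (not in_ci)
    print(f"theta0={theta0}: in CI={in_ci}, reject={not in_ci}")
```

The agreement is not a coincidence of this sample: both decisions are the same inequality on the pivot, rearranged.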
