Chapter 10. Hypothesis Testing


Some examples of hypothesis testing
1. Toss a coin 100 times and get 62 heads. Is this coin a fair coin?
2. Is the new treatment more effective than the old one?
3. Quality control.

4. Is there any sex discrimination? A female pharmacist at Lagranze Phar. filed a lawsuit against the company, complaining of sex discrimination. Her lawyer provided the following table as evidence of discrimination. The data contain 2 females and 24 males.

Months to promotion
F: 453, 229
M: 47, 192, 14, 12, 14, 5, 37, 7, 68, 483, 34, 19, 25, 125, 34, 22, 25, 64, 14, 23, 21, 67, 47, 24

The data are re-organized in terms of ranks, the shortest time to promotion getting rank 1, the second shortest getting rank 2, and so on:

Rank   1  2  3  4  5  6  7  8  9 10 11 12 13
Sex    M  M  M  M  M  M  M  M  M  M  M  M  M
Rank  14 15 16 17 18 19 20 21 22 23 24 25 26
Sex    M  M  M  M  M  M  M  M  M  M  F  F  M

Basic setup of hypothesis testing

Population parameter of interest: θ (unknown). Samples collected from experiment or observation: {X_1, X_2, ..., X_n}.

Hypothesis testing:

1. Null hypothesis and alternative hypothesis: H_0: θ ∈ Θ_0, H_a: θ ∈ Θ_a. For example,
   H_0: θ = 0.5, H_a: θ ≠ 0.5
   H_0: θ = 0.5, H_a: θ > 0.5
   H_0: θ < 1, H_a: θ > 2
2. Test statistic: a function of the samples, say T = T(X_1, X_2, ..., X_n).
3. Rejection region (RR):
   (a) When T ∈ RR, reject H_0 and accept H_a.
   (b) When T ∉ RR, accept H_0.

Type I error, Type II error, and power of a test

Consider the simple hypotheses H_0: θ = θ_0, H_a: θ = θ_1, where θ_0, θ_1 are given constants.

1. Type I error: α = P(Reject H_0 | H_0 is true) = P(T ∈ RR | θ = θ_0).
2. Type II error: β = P(Accept H_0 | H_a is true) = P(T ∉ RR | θ = θ_1).
3. Power: P(Reject H_0 | H_a is true) = 1 − β.

Example. Consider the following hypothesis test. X_1, X_2, ..., X_n are iid from N(µ, 1).

H_0: µ = 0, H_a: µ = 1.

Suppose T = X̄ and the rejection region is RR = {x̄ : x̄ > 0.5}.

1. Type I error: α = P(X̄ > 0.5 | µ = 0) = P(√n X̄ > 0.5√n) = 1 − Φ(0.5√n) = Φ(−0.5√n).
2. Type II error: β = P(X̄ ≤ 0.5 | µ = 1) = P(√n(X̄ − 1) ≤ −0.5√n) = Φ(−0.5√n).
3. Power: 1 − β = 1 − Φ(−0.5√n) = Φ(0.5√n).
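The three quantities above can be evaluated numerically. A minimal sketch using only the standard library's `NormalDist`, with n = 25 assumed purely for illustration:

```python
# Error probabilities and power for the example above
# (H_0: mu = 0 vs H_a: mu = 1, reject when the sample mean exceeds 0.5).
from math import sqrt
from statistics import NormalDist

N = NormalDist()          # standard normal distribution
n = 25                    # sample size, assumed for illustration

alpha = 1 - N.cdf(0.5 * sqrt(n))   # P(Xbar > 0.5 | mu = 0) = Phi(-0.5*sqrt(n))
beta = N.cdf(-0.5 * sqrt(n))       # P(Xbar <= 0.5 | mu = 1) = Phi(-0.5*sqrt(n))
power = 1 - beta                   # = Phi(0.5*sqrt(n))

print(alpha, beta, power)
```

With n = 25 both error probabilities come out equal (about 0.0062), illustrating the symmetry of this particular example.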

Remark:
1. The ideal scenario is that both α and β are small, but for a fixed sample size α and β are in conflict: shrinking one inflates the other.
2. Increasing the sample size will reduce both types of errors and increase the power of the test.
3. The Type I error is more often called the significance level of the test.
4. When θ_0 and θ_1 are closer, the power of the test will decrease.
5. Usually we are looking for sufficient evidence to reject H_0, so the Type I error is more important than the Type II error. Consequently, one usually controls the Type I error below some pre-assigned small threshold and then, subject to this control, looks for a test which maximizes the power (or minimizes the Type II error).

Remark: All the previous definitions and discussions extend to composite hypotheses H_0: θ ∈ Θ_0, H_a: θ ∈ Θ_a, where

1. Type I error: for θ_0 ∈ Θ_0, α(θ_0) = P(T ∈ RR | θ = θ_0).
2. Type II error: for θ_a ∈ Θ_a, β(θ_a) = P(T ∉ RR | θ = θ_a).
3. Power: 1 − β(θ_a).

Testing the mean of a normal distribution

Suppose X_1, X_2, ..., X_n are iid from N(µ, σ²) with σ² known but µ unknown. Consider the following one-sided and two-sided tests:

[1]. H_0: µ = µ_0, H_a: µ > µ_0
[2]. H_0: µ = µ_0, H_a: µ < µ_0
[3]. H_0: µ = µ_0, H_a: µ ≠ µ_0

In all three cases the test statistic is T = X̄ = (1/n)(X_1 + X_2 + ... + X_n). We also assume that the Type I error (significance level) is fixed at a preassigned small number α (usually α = 0.05).

[1]. H_0: µ = µ_0, H_a: µ > µ_0

The rejection region is of the form RR = {X̄ > k} for some k.

Determine k: α = Type I error = P(X̄ > k | µ = µ_0). But X̄ is N(µ_0, σ²/n) under H_0. Therefore

k = µ_0 + (σ/√n) z_α = µ_0 + σ_X̄ z_α.

[2]. H_0: µ = µ_0, H_a: µ < µ_0

The rejection region is of the form RR = {X̄ < k} for some k.

Determine k: α = Type I error = P(X̄ < k | µ = µ_0). Therefore

k = µ_0 − (σ/√n) z_α = µ_0 − σ_X̄ z_α.

[3]. H_0: µ = µ_0, H_a: µ ≠ µ_0

The rejection region is of the form RR = {X̄ < k_1} ∪ {X̄ > k_2} for some k_1 < k_2.

Determine k_1, k_2: α = Type I error = P(X̄ < k_1 | µ = µ_0) + P(X̄ > k_2 | µ = µ_0). By symmetry,

k_1 = µ_0 − (σ/√n) z_{α/2} = µ_0 − σ_X̄ z_{α/2}
k_2 = µ_0 + (σ/√n) z_{α/2} = µ_0 + σ_X̄ z_{α/2}.
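The three cutoffs can be computed directly from the formulas above. A sketch with µ_0 = 0, σ = 1, n = 25, and α = 0.05 chosen only for illustration:

```python
# Cutoffs for the three rejection regions [1]-[3] above.
from math import sqrt
from statistics import NormalDist

N = NormalDist()
mu0, sigma, n, alpha = 0.0, 1.0, 25, 0.05   # illustrative values
se = sigma / sqrt(n)                        # standard deviation of Xbar

z_a = N.inv_cdf(1 - alpha)                  # z_alpha, about 1.645
z_a2 = N.inv_cdf(1 - alpha / 2)             # z_{alpha/2}, about 1.96

k_upper = mu0 + se * z_a                    # [1] reject if Xbar > k_upper
k_lower = mu0 - se * z_a                    # [2] reject if Xbar < k_lower
k1, k2 = mu0 - se * z_a2, mu0 + se * z_a2   # [3] reject outside (k1, k2)
print(k_upper, k_lower, k1, k2)
```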

Example 1. National student exam scores are distributed as N(500, 100²). In a classroom of 25 freshmen, the mean score was 472. Are the freshmen performing below the national average? (Consider the significance levels α = 0.1, 0.05, 0.01.)

Test H_0: µ = 500 against H_a: µ < 500. The observed z = (472 − 500)/(100/√25) = −1.4. Since z_{0.1} ≈ 1.28 < 1.4 < 1.645 ≈ z_{0.05}: reject H_0 at level α = 0.1; accept H_0 at levels α = 0.05 and 0.01.
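The decisions at the three levels can be checked numerically with a lower-tailed z test:

```python
# Checking Example 1: lower-tailed z test of H_0: mu = 500 vs H_a: mu < 500
# with sigma = 100, n = 25, observed sample mean 472.
from math import sqrt
from statistics import NormalDist

N = NormalDist()
z = (472 - 500) / (100 / sqrt(25))       # observed z = -1.4

decisions = {}
for alpha in (0.10, 0.05, 0.01):
    # reject when z falls below the lower critical value -z_alpha
    decisions[alpha] = z < N.inv_cdf(alpha)
print(decisions)
```

This reproduces the conclusion above: reject only at α = 0.1.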

2. In a two-sided test of H_0: µ = 80 in a normal population with σ = 15, an investigator reported that since X̄ = 71.9, the null hypothesis is rejected at the 1% level. What can we say about the sample size used?

Rejection at the 1% level means |X̄ − 80|/(15/√n) ≥ z_{0.005} ≈ 2.576, i.e. √n ≥ 2.576 · 15/8.1 ≈ 4.77, so n ≥ 23.
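The minimal sample size consistent with the reported rejection can be computed as a one-liner:

```python
# Smallest n for which |71.9 - 80| / (15 / sqrt(n)) >= z_{0.005}.
from math import ceil
from statistics import NormalDist

z_half = NormalDist().inv_cdf(1 - 0.01 / 2)        # z_{0.005}, about 2.576
n_min = ceil((z_half * 15 / abs(71.9 - 80)) ** 2)
print(n_min)   # -> 23
```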

Extension to Large Sample Tests

The tests for normal distributions extend easily to large-sample tests in which the test statistic θ̂ is approximately N(θ, σ²_θ̂), and the hypotheses are

H_0: θ = θ_0, H_a: θ > θ_0
H_0: θ = θ_0, H_a: θ < θ_0
H_0: θ = θ_0, H_a: θ ≠ θ_0

Examples. In all these examples, the significance level is assumed to be α = 0.05.

1. Toss a coin 100 times, and get 33 heads. Is this a fair coin?
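A sketch of the large-sample test for the coin: H_0: p = 0.5 vs H_a: p ≠ 0.5, with the standard error computed under H_0:

```python
# Large-sample z test for the coin example.
from math import sqrt
from statistics import NormalDist

n, heads, p0 = 100, 33, 0.5
phat = heads / n
z = (phat - p0) / sqrt(p0 * (1 - p0) / n)            # observed z, about -3.4
reject = abs(z) > NormalDist().inv_cdf(1 - 0.05 / 2)  # two-sided at alpha = 0.05
print(z, reject)
```

Since |z| ≈ 3.4 far exceeds 1.96, the coin is declared unfair at level 0.05.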

2. Do indoor cats live longer than wild cats?

Cats     Sample size   Mean age   Sample std
Indoor   64            14         4
Wild     36            10         5
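One common way to work this example is a large-sample two-sample z test of H_0: µ_indoor = µ_wild against H_a: µ_indoor > µ_wild, using the standard error √(s₁²/n₁ + s₂²/n₂); a sketch under that approach:

```python
# Large-sample two-sample z test for the cat example.
from math import sqrt
from statistics import NormalDist

n1, x1, s1 = 64, 14, 4     # indoor cats: size, mean age, sample std
n2, x2, s2 = 36, 10, 5     # wild cats

z = (x1 - x2) / sqrt(s1**2 / n1 + s2**2 / n2)
reject = z > NormalDist().inv_cdf(1 - 0.05)   # one-sided test at alpha = 0.05
print(z, reject)
```

Here z ≈ 4.1, so the data support the claim that indoor cats live longer.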

3. In order to test if there is any significant difference between the opinions of males and females on abortion, independent random samples of 100 males and 150 females were taken.

Sex      Sample size   Favor   Oppose
Male     100           52      48
Female   150           95      55
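This example fits a two-sample proportion z test, H_0: p_male = p_female vs H_a: p_male ≠ p_female, with the standard error computed from the pooled proportion under H_0; a sketch:

```python
# Two-sample proportion z test for the abortion-opinion example.
from math import sqrt
from statistics import NormalDist

n1, fav1 = 100, 52      # males: sample size, number in favor
n2, fav2 = 150, 95      # females

p1, p2 = fav1 / n1, fav2 / n2
pooled = (fav1 + fav2) / (n1 + n2)                 # pooled proportion under H_0
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
reject = abs(z) > NormalDist().inv_cdf(1 - 0.05 / 2)  # two-sided at alpha = 0.05
print(z, reject)
```

Here |z| ≈ 1.78 < 1.96, so at level 0.05 the difference is not significant.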

P-value

P-value: the probability of getting an outcome as extreme as, or more extreme than, the actually observed data (under the assumption that the null hypothesis is true).

Remark: Given a significance level α,
1. If P-value ≤ α, reject the null hypothesis.
2. If P-value > α, accept the null hypothesis.

Redo all the previous examples to find P-value.
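As a sketch of how these P-value computations go, here are two of the earlier examples: the exam-score test (one-sided, P = Φ(−1.4)) and the coin test (two-sided, P = 2Φ(−3.4)):

```python
# P-values for the exam-score example and the coin example.
from math import sqrt
from statistics import NormalDist

N = NormalDist()

# Exam scores: H_a: mu < 500, so the P-value is the lower tail at z = -1.4.
p_exam = N.cdf((472 - 500) / (100 / sqrt(25)))

# Coin: H_a: p != 0.5, so the P-value is twice the tail beyond |z| = 3.4.
p_coin = 2 * N.cdf(-abs((0.33 - 0.5) / sqrt(0.25 / 100)))

print(p_exam, p_coin)
```

This matches the earlier decisions: p_exam ≈ 0.081 exceeds 0.05 (accept), while p_coin ≈ 0.0007 is far below 0.05 (reject).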

Sample Size

Suppose the population distribution is N(µ, σ²) with σ² known. Consider the test H_0: µ = µ_0, H_a: µ > µ_0. Pick a sample size so that the Type I error is bounded by α and the Type II error is bounded by β when µ = µ_a:

n ≥ [(z_α + z_β)σ / (µ_a − µ_0)]²

Remark: The same argument extends to large-sample testing, in particular the binomial setting. Suppose X_1, X_2, ..., X_n are iid Bernoulli with P(X_i = 1) = p = 1 − P(X_i = 0). Consider the test H_0: p = p_0, H_a: p > p_0. Pick a sample size so that the Type I error is bounded by α and the Type II error is bounded by β when p = p_a:

n ≥ [ (z_α √(p_0(1 − p_0)) + z_β √(p_a(1 − p_a))) / (p_a − p_0) ]²

Examples

1. Suppose it is required to test the population mean H_0: µ = 5, H_a: µ > 5 at level α = 0.05 such that the Type II error is at most 0.05 when the true µ = 6. How large should the sample be when σ = 4?
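Applying the normal sample-size formula above with α = β = 0.05, σ = 4, µ_0 = 5, µ_a = 6:

```python
# Sample size for Example 1: n >= ((z_alpha + z_beta) * sigma / (mu_a - mu_0))^2.
from math import ceil
from statistics import NormalDist

N = NormalDist()
z_a = N.inv_cdf(1 - 0.05)       # z_{0.05}, about 1.645
z_b = N.inv_cdf(1 - 0.05)       # same quantile since beta = 0.05 too
n_min = ceil(((z_a + z_b) * 4 / (6 - 5)) ** 2)
print(n_min)   # -> 174
```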

2. How many tosses of a coin should be made in order to test H_0: p = 0.5, H_a: p > 0.5 at level α = 0.05 so that when the true p = 0.6 the Type II error is at most 0.1?
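Applying the binomial sample-size formula with α = 0.05, β = 0.1, p_0 = 0.5, p_a = 0.6:

```python
# Sample size for the coin-tossing example, binomial version of the formula.
from math import ceil, sqrt
from statistics import NormalDist

N = NormalDist()
p0, pa = 0.5, 0.6
z_a = N.inv_cdf(1 - 0.05)       # z_{0.05}, about 1.645
z_b = N.inv_cdf(1 - 0.10)       # z_{0.10}, about 1.282
num = z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(pa * (1 - pa))
n_min = ceil((num / (pa - p0)) ** 2)
print(n_min)   # -> 211
```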

Neyman-Pearson Lemma

Suppose {X_1, X_2, ..., X_n} are iid samples with common density f(x; θ). Consider the following simple hypotheses:

H_0: θ = θ_0, H_a: θ = θ_a.

Question: Among all the possible rejection regions RR such that the Type I error satisfies P(RR | θ = θ_0) ≤ α with α pre-specified, which RR gives the maximal power (or minimal Type II error)?

Neyman-Pearson Lemma. For each k define

RR_k = { (x_1, x_2, ..., x_n) : [f(x_1; θ_0)f(x_2; θ_0)···f(x_n; θ_0)] / [f(x_1; θ_a)f(x_2; θ_a)···f(x_n; θ_a)] ≤ k }.

Suppose there is a k such that P(RR_k | θ = θ_0) = α. Then RR_k attains the maximal power among all tests whose Type I error is bounded by α.

Examples

1. Consider the following test for the density f:

H_0: f(x) = 1 for 0 < x < 1, 0 elsewhere;
H_a: f(x) = 2x for 0 < x < 1, 0 elsewhere.

Find the most powerful test at significance level α based on a single observation.
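A sketch of the solution: the likelihood ratio f_0(x)/f_a(x) = 1/(2x) is decreasing in x, so the Neyman-Pearson region {ratio ≤ k} takes the form {X > c}; choosing c = 1 − α gives size exactly α, and the power under f_a(x) = 2x is 1 − (1 − α)²:

```python
# Size and power of the most powerful test in Example 1, with alpha = 0.05
# chosen for illustration.
alpha = 0.05
c = 1 - alpha            # cutoff: P(X > c | H_0) = 1 - c = alpha
size = 1 - c             # uniform tail probability under H_0
power = 1 - c ** 2       # P(X > c | H_a) = integral of 2x from c to 1
print(size, power)
```

The power 1 − 0.95² = 0.0975 is small, which is expected: with a single observation the two densities are hard to tell apart.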

2. Suppose X_1, X_2, ..., X_n are iid N(µ, σ²) with σ² known. We wish to test

H_0: µ = µ_0, H_a: µ = µ_a (µ_a < µ_0).

Find the most powerful test at significance level α. What can we say about the test H_0: µ = µ_0, H_a: µ < µ_0? (It is U.M.P.) What can we say about the test H_0: µ = µ_0, H_a: µ ≠ µ_0? (There is no U.M.P. test.)

3. Let X have density

f(x; θ) = 2θx + 2(1 − θ)(1 − x) for 0 < x < 1, 0 elsewhere.

Consider the test H_0: θ = θ_0, H_a: θ = θ_1 with θ_0 < θ_1 and significance level α.
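A numerical check of the key fact needed here (the values θ_0 = 0.3, θ_1 = 0.7 are hypothetical, chosen only to illustrate): the likelihood ratio f(x; θ_0)/f(x; θ_1) is strictly decreasing in x whenever θ_0 < θ_1, so the Neyman-Pearson rejection region again has the form {X > c}:

```python
# Verify that the likelihood ratio in Example 3 is decreasing in x,
# for hypothetical theta0 = 0.3 < theta1 = 0.7.
def f(x, theta):
    # density 2*theta*x + 2*(1 - theta)*(1 - x) on (0, 1)
    return 2 * theta * x + 2 * (1 - theta) * (1 - x)

theta0, theta1 = 0.3, 0.7
xs = [i / 100 for i in range(1, 100)]
ratios = [f(x, theta0) / f(x, theta1) for x in xs]
assert all(a > b for a, b in zip(ratios, ratios[1:]))   # strictly decreasing
print(ratios[0], ratios[-1])
```

The ratio starts above 1 near x = 0 and falls below 1 near x = 1, so large values of X favor H_a.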