Math 152. Rumbos, Fall 2009. Solutions to Review Problems for Exam #2


Math 152. Rumbos Fall 2009

Solutions to Review Problems for Exam #2

1. In the book Experimentation and Measurement, by W. J. Youden and published by the National Science Teachers Association in 1962, the author reported an experiment, performed by a high school student and a younger brother, which consisted of tossing five coins and recording the frequencies for the number of heads in the five coins. The data collected are shown in Table 1.

Number of Heads   0     1     2      3      4     5
Frequency         100   524   1080   1126   655   105

Table 1: Frequency Distribution for a Five Coin Tossing Experiment

a) Are the data in Table 1 consistent with the hypothesis that all the coins were fair? Justify your answer.

Solution: If we let $X$ denote the number of heads observed in the five coin toss, and all the coins are fair, then $X \sim \text{binomial}(5, 0.5)$. Thus, the probability that we will see $k$ coins out of the 5 showing heads is

$$p_X(k) = \binom{5}{k} \left(\frac{1}{2}\right)^{k} \left(\frac{1}{2}\right)^{5-k}, \quad \text{for } k = 0, 1, 2, 3, 4, 5.$$

Thus, out of the $n = 3590$ tosses of the five coins, on average, we expect to see $n\,p_X(k)$ of them showing $k$ heads. These expected values are shown in Table 2. The table also shows the observed counts. We can therefore compute the value of the Pearson Chi-Square statistic to be $\hat{Q} = 21.57$. In this case, the Pearson Chi-Square statistic has an approximate $\chi^2(5)$ distribution since there are 6 categories. The p-value of the goodness of fit test is then, approximately,

$$\text{p-value} = P(Q > \hat{Q}) \approx 0.0006,$$

which is very small. Thus, we may reject the null hypothesis that the data in Table 1 follow a binomial(5, 0.5) distribution at the 1% significance level. Therefore, we can say that the data do not support the assumption that the five coins are fair.

Category (k)   p_k       Predicted Counts   Observed Counts
0              0.03125   112.1875           100
1              0.15625   560.9375           524
2              0.31250   1121.875           1080
3              0.31250   1121.875           1126
4              0.15625   560.9375           655
5              0.03125   112.1875           105

Table 2: Counts Predicted by the Binomial Model

b) Assume now that the coins have the same probability, $p$, of turning up heads. Estimate $p$ and perform a goodness of fit test of the model you used to do your estimation. What do you conclude?

Solution: Suppose now that the coins are not fair but they all have the same probability, $p$, of turning up heads. We can estimate $p$ from the data as follows:

$$5\hat{p} = \frac{0 \cdot 100 + 1 \cdot 524 + 2 \cdot 1080 + 3 \cdot 1126 + 4 \cdot 655 + 5 \cdot 105}{3590},$$

from which we get that

$$\hat{p} \approx 0.5129.$$

We now test the null hypothesis

$$H_o : X \sim \text{binomial}(5, \hat{p}).$$

In this case we get the expected counts shown in Table 3. The Pearson Chi-Square statistic, $Q$, has the value $\hat{Q} \approx 8.75$, and the approximate p-value is

$$\text{p-value} = P(Q > \hat{Q}) \approx 0.068,$$

since $Q$ has an approximate $\chi^2(4)$ distribution in this case because we estimated $p$ from the data. Thus, we cannot reject the null

hypothesis at the 5% significance level, but we could reject at the 10% level of significance. Hence, the data give moderate support to the hypothesis that the coins are slightly loaded towards yielding more heads on average.

Category (k)   p_k       Predicted Counts   Observed Counts
0              0.02742   98.443             100
1              0.14437   518.286            524
2              0.30403   1091.476           1080
3              0.32014   1149.288           1126
4              0.16855   605.081            655
5              0.03549   127.426            105

Table 3: Counts Predicted by the binomial(5, $\hat{p}$) Model

2. In 1,000 tosses of a coin, 560 yield heads and 440 turn up tails. Is it reasonable to assume that the coin is fair? Justify your answer.

Solution: Test the hypothesis

$$H_o : p = \frac{1}{2}$$

versus the alternative

$$H_1 : p > \frac{1}{2}.$$

We model the tosses by a sequence of $n = 1000$ independent Bernoulli($p$) trials, $X_1, X_2, \ldots, X_n$, and form the test statistic

$$Y = \sum_{j=1}^{n} X_j.$$

We reject the null hypothesis if $Y > 500 + c$, for a certain critical value $c$ determined by the level of significance, $\alpha$, of the test. In this case,

$$\alpha = P(Y > 500 + c) \quad \text{for } Y \sim \text{binomial}(1000, 0.5).$$

Using the Central Limit Theorem, we have that

$$\alpha \approx P\left(Z > \frac{c}{\sqrt{1000\,(0.5)(1 - 0.5)}}\right),$$

where $Z \sim \text{normal}(0, 1)$. Thus, if we let $z_\alpha$ denote a value such that $P(Z > z_\alpha) = \alpha$, we can reject $H_o$ at the $\alpha$ significance level if

$$Y > 500 + z_\alpha \sqrt{1000/4}.$$

If $\alpha = 0.05$, $z_\alpha$ is the value of $z$ which yields $F_Z(z) = 1 - \alpha$. Thus, $z_\alpha = F_Z^{-1}(0.95) \approx 1.65$. We will then reject the null hypothesis if $Y > 526$. In this case, the observed value of $Y$ is $\hat{Y} = 560$. Hence, we may reject the null hypothesis at the 5% level of significance and conclude that the data lend evidence to the hypothesis that the coin is biased towards more heads.

3. In a random sample, $X_1, X_2, \ldots, X_n$, of Bernoulli($p$) random variables, it is desired to test the hypotheses

$$H_o : p = 0.49 \quad \text{versus} \quad H_1 : p = 0.51.$$

Use the Central Limit Theorem to determine, approximately, the sample size, $n$, needed to have the probabilities of Type I error and Type II error both be about 0.01. Explain your reasoning.

Solution: We use

$$Y = \sum_{i=1}^{n} X_i$$

as a test statistic. Put $p_o = 0.49$ and $p_1 = 0.51$, and define the rejection region

$$R : \; Y > c,$$

where $c$ is some critical value. We then have that the probability of a Type I error is

$$\alpha = P(Y > c), \quad \text{given that } Y \sim \text{binomial}(n, p_o).$$

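Similarly, the probability of a Type II error is

$$\beta = P(Y \le c), \quad \text{given that } Y \sim \text{binomial}(n, p_1).$$

We approximate these errors using the Central Limit Theorem as follows:

$$\alpha = P\left(\frac{Y - n p_o}{\sqrt{n p_o (1 - p_o)}} > \frac{c - n p_o}{\sqrt{n p_o (1 - p_o)}}\right) \approx P\left(Z > \frac{c - n p_o}{\sqrt{n p_o (1 - p_o)}}\right),$$

where $Z \sim \text{normal}(0, 1)$. Thus, we set

$$\frac{c - n p_o}{\sqrt{n p_o (1 - p_o)}} = z_\alpha, \qquad (1)$$

where $z_\alpha$ is the real value with the property that $P(Z > z_\alpha) = \alpha$. For the probability of a Type II error we get

$$\beta = P\left(\frac{Y - n p_1}{\sqrt{n p_1 (1 - p_1)}} \le \frac{c - n p_1}{\sqrt{n p_1 (1 - p_1)}}\right) \approx P\left(Z \le \frac{c - n p_1}{\sqrt{n p_1 (1 - p_1)}}\right).$$

Thus, we may set

$$\frac{c - n p_1}{\sqrt{n p_1 (1 - p_1)}} = z_\beta, \qquad (2)$$

where $z_\beta$ is the real value with the property that $F_Z(z_\beta) = \beta$. For the case in which $\alpha = \beta = 0.01$, we have $z_\alpha \approx 2.33$ and $z_\beta \approx -2.33$. Equations (1) and (2) then become

$$c = 2.33 \sqrt{n p_o (1 - p_o)} + n p_o \qquad (3)$$

and

$$c - n p_1 = -2.33 \sqrt{n p_1 (1 - p_1)}. \qquad (4)$$

Subtracting (4) from (3) leads to

$$n (p_1 - p_o) = 2.33 \sqrt{n} \left(\sqrt{p_o (1 - p_o)} + \sqrt{p_1 (1 - p_1)}\right),$$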

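which leads to

$$\sqrt{n} = 2.33\, \frac{\sqrt{p_o (1 - p_o)} + \sqrt{p_1 (1 - p_1)}}{p_1 - p_o}. \qquad (5)$$

Substituting the values for $p_o$ and $p_1$ in (5) we obtain $\sqrt{n} \approx 116.5$, so that we want $n$ to be at least 13,567.

4. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal($\theta$, 1) distribution. Suppose you want to test

$$H_o : \theta = \theta_o \quad \text{versus} \quad H_1 : \theta \ne \theta_o,$$

with the rejection region defined by $\sqrt{n}\,|\overline{X}_n - \theta_o| > c$, for some critical value $c$.

a) Find an expression in terms of standard normal probabilities for the power function of this test.

Solution: The power function of this test, $\gamma(\theta)$, is the probability that the test will reject the null hypothesis when $\theta \ne \theta_o$; that is,

$$\gamma(\theta) = P\left(\sqrt{n}\,|\overline{X}_n - \theta_o| > c\right)$$

given that $\overline{X}_n \sim \text{normal}(\theta, 1/n)$, for $\theta \ne \theta_o$. Thus, we can write $\gamma(\theta)$ as

$$\begin{aligned}
\gamma(\theta) &= 1 - P\left(\sqrt{n}\,|\overline{X}_n - \theta_o| \le c\right) \\
&= 1 - P\left(\theta_o - \frac{c}{\sqrt{n}} < \overline{X}_n \le \theta_o + \frac{c}{\sqrt{n}}\right) \\
&= 1 - P\left(\theta_o - \theta - \frac{c}{\sqrt{n}} < \overline{X}_n - \theta \le \theta_o - \theta + \frac{c}{\sqrt{n}}\right) \\
&= 1 - P\left(\sqrt{n}(\theta_o - \theta) - c < \frac{\overline{X}_n - \theta}{1/\sqrt{n}} \le \sqrt{n}(\theta_o - \theta) + c\right) \\
&= 1 - P\left(\sqrt{n}(\theta_o - \theta) - c < Z \le \sqrt{n}(\theta_o - \theta) + c\right),
\end{aligned}$$

where $Z \sim \text{normal}(0, 1)$. We therefore have that

$$\gamma(\theta) = 1 - \left(F_Z\left(\sqrt{n}(\theta_o - \theta) + c\right) - F_Z\left(\sqrt{n}(\theta_o - \theta) - c\right)\right), \qquad (6)$$

where $F_Z$ denotes the cdf of the standard normal distribution.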
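The power function in Equation (6) is easy to evaluate numerically. The following is a minimal sketch using only the Python standard library; the function name `power` and the values of `theta_o`, `n` and `c` are illustrative assumptions, not part of the problem.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution; Z.cdf plays the role of F_Z


def power(theta, theta_o, n, c):
    """Evaluate Equation (6):
    1 - [F_Z(sqrt(n)(theta_o - theta) + c) - F_Z(sqrt(n)(theta_o - theta) - c)].
    """
    shift = sqrt(n) * (theta_o - theta)
    return 1.0 - (Z.cdf(shift + c) - Z.cdf(shift - c))


# Illustrative values only (assumed for this sketch).
theta_o, n, c = 0.0, 25, 1.96
for theta in (0.0, 0.2, 0.5, 1.0):
    print(f"theta = {theta:.1f}: power = {power(theta, theta_o, n, c):.4f}")
```

Evaluating `power` at `theta = theta_o` gives the Type I error probability of the test, and the function is symmetric about `theta_o`, as expected for a two-sided rejection region.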

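b) An experimenter desires a Type I error probability of 0.04 and a maximum Type II error probability of 0.25 at $\theta = \theta_o + 1$. Find the values of $n$ and $c$ for which these conditions can be achieved.

Solution: The probability of a Type I error is $\gamma(\theta_o)$, where $\gamma(\theta)$ is given in Equation (6). Thus,

$$\alpha = \gamma(\theta_o) = 1 - \left(F_Z(c) - F_Z(-c)\right) = 2 - 2F_Z(c).$$

Thus, if $\alpha = 0.04$, we need to set $c$ so that

$$F_Z(c) = 0.98,$$

which yields $c \approx 2.05$.

The probability of a Type II error for $\theta = \theta_o + 1$ is

$$\begin{aligned}
\beta &= 1 - \gamma(\theta_o + 1) \\
&= 1 - \left(1 - \left(F_Z(-\sqrt{n} + c) - F_Z(-\sqrt{n} - c)\right)\right) \\
&= F_Z(-\sqrt{n} + c) - F_Z(-\sqrt{n} - c) \\
&= P\left(-\sqrt{n} - c < Z \le -\sqrt{n} + c\right) \\
&\le P\left(-\infty < Z \le -\sqrt{n} + c\right) \\
&= F_Z(-\sqrt{n} + c).
\end{aligned}$$

Thus, in order to make $\beta \le 0.25$, we require that

$$F_Z(-\sqrt{n} + c) = 0.25.$$

This yields $-\sqrt{n} + c \approx -0.675$. Thus,

$$n \approx (c + 0.675)^2 \approx 7.43.$$

Thus, we may take $n$ to be at least 8.

5. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal($\theta$, $\sigma^2$) distribution. Suppose you want to test

$$H_o : \theta \le \theta_o$$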

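versus

$$H_1 : \theta > \theta_1$$

with the rejection region defined by

$$T(\theta) > \frac{\sqrt{n}}{S_n}(\theta_o - \theta) + c,$$

for some critical value $c$. Here, $T(\theta)$ is the statistic

$$T(\theta) = \frac{\overline{X}_n - \theta}{S_n/\sqrt{n}},$$

where $\overline{X}_n$ and $S_n^2$ are the sample mean and variance, respectively.

a) If the significance level for the test is to be set at $\alpha$, what should $c$ be?

Solution: The power function of this test is

$$\gamma(\theta) = P_\theta\left(T(\theta) > \frac{\sqrt{n}}{S_n}(\theta_o - \theta) + c\right),$$

where $T(\theta) \sim t(n-1)$; that is, $T(\theta)$ has a $t$ distribution with $n - 1$ degrees of freedom. Observe that, if the null hypothesis is true, then $\theta \le \theta_o$ and therefore

$$\frac{\sqrt{n}}{S_n}(\theta_o - \theta) + c \ge c \quad \text{for all } \theta \le \theta_o.$$

It then follows that

$$P_\theta\left(T(\theta) > \frac{\sqrt{n}}{S_n}(\theta_o - \theta) + c\right) \le P(T(\theta) > c).$$

Thus,

$$\alpha = \sup_{\theta \le \theta_o} \gamma(\theta) = P(T(\theta) > c),$$

where $T(\theta) \sim t(n-1)$. Thus, to choose $c$, we find a real value, $t$, such that $P(T > t) = \alpha$, where $T \sim t(n-1)$. Denoting that value by $t_{\alpha, n-1}$, we get that

$$c = t_{\alpha, n-1}.$$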
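The critical value $t_{\alpha, n-1}$ is usually read from a table or obtained from a statistics library. As a rough illustration of what that lookup computes, here is a standard-library-only sketch that integrates the $t$ density with Simpson's rule and solves $P(T > t) = \alpha$ by bisection. The function names, the step count and the search bound `hi` are assumptions made for this sketch, and it is intended only for moderate degrees of freedom and $0 < \alpha < 0.5$.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using composite Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df, hi=100.0, tol=1e-8):
    """Bisection for the value t with P(T > t) = alpha.

    Assumes 0 < alpha < 0.5 and that the answer lies in [0, hi].
    """
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid  # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2


# Illustrative choice: alpha = 0.05 and n = 16, so df = n - 1 = 15.
print(f"c = t_(0.05, 15) is approximately {t_critical(0.05, 15):.3f}")
```

In practice a library routine for the $t$ quantile function would replace all three helpers; the sketch is only meant to make the definition of $t_{\alpha, n-1}$ concrete.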

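b) Express the rejection region in terms of the value $c$ found in part a), and the statistics $\overline{X}_n$ and $S_n^2$.

Solution: The rejection region is

$$\frac{\overline{X}_n - \theta}{S_n/\sqrt{n}} > \frac{\sqrt{n}}{S_n}(\theta_o - \theta) + t_{\alpha, n-1},$$

which can be re-written as

$$\overline{X}_n > \theta_o + t_{\alpha, n-1} \frac{S_n}{\sqrt{n}}.$$

c) Compute the power function, $\gamma(\theta)$, for the test.

Solution: From part a) of this problem we have that

$$\gamma(\theta) = P_\theta\left(T(\theta) > \frac{\sqrt{n}}{S_n}(\theta_o - \theta) + t_{\alpha, n-1}\right) = 1 - P_\theta\left(T \le \frac{\sqrt{n}}{S_n}(\theta_o - \theta) + t_{\alpha, n-1}\right),$$

where $T \sim t(n-1)$. Hence, the power function of the test is

$$\gamma(\theta) = 1 - F_T\left(\frac{\sqrt{n}}{S_n}(\theta_o - \theta) + t_{\alpha, n-1}\right) \quad \text{for } \theta > \theta_1.$$

6. A sample of 16 ten-ounce cereal boxes has a mean weight of 10.4 oz and a standard deviation of 0.85 oz. Perform an appropriate test to determine whether, on average, the ten-ounce cereal boxes weigh something other than 10 ounces at the $\alpha = 0.05$ significance level. Explain your reasoning.

Solution: We assume that the weight in each ten-ounce cereal box follows a normal($\mu$, $\sigma^2$) distribution with mean $\mu$ and variance $\sigma^2$. We would like to test the hypothesis

$$H_o : \mu = 10 \text{ oz}$$

against the alternative hypothesis

$$H_1 : \mu \ne 10 \text{ oz}.$$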

We consider the rejection region

$$R : \; \frac{|\overline{X}_n - \mu_o|}{S_n/\sqrt{n}} > t_{\alpha/2, n-1},$$

where $\mu_o = 10$ oz, and $t_{\alpha/2, n-1}$ is chosen so that

$$P\left(|T| > t_{\alpha/2, n-1}\right) = \alpha, \quad \text{for } T \sim t(n-1).$$

Then, if $H_o$ is true, the statistic

$$T = \frac{\overline{X}_n - \mu_o}{S_n/\sqrt{n}},$$

where $n = 16$, has a $t(n-1)$ distribution, since we are assuming that the sample, $X_1, X_2, \ldots, X_n$, comes from a normal($\mu_o$, $\sigma^2$) distribution. Consequently, the test has significance level $\alpha$. In the special case in which $\alpha = 0.05$, we get that $t_{\alpha/2, n-1} \approx 2.13$. Thus, the null hypothesis can be rejected at the 0.05 significance level if

$$\frac{|\overline{X}_n - \mu_o|}{S_n/\sqrt{16}} > 2.13.$$

In this problem, $\overline{X}_n = 10.4$, $S_n = 0.85$, and $n = 16$. We then have that

$$\frac{|\overline{X}_n - \mu_o|}{S_n/\sqrt{n}} \approx 1.88,$$

which is not bigger than 2.13. Thus, we cannot reject the null hypothesis at the 0.05 significance level.

7. Find the p-value of observed data consisting of 7 successes in 10 Bernoulli($\theta$) trials in a test of

$$H_o : \theta = \frac{1}{2} \quad \text{versus} \quad H_1 : \theta > \frac{1}{2}.$$

Solution: Let $Y$ denote the number of successes in the $n = 10$ trials. Then $Y \sim \text{binomial}(10, \theta)$. This is the test statistic. The p-value is the probability that, if the null hypothesis is true, we will see the observed value of the statistic or more extreme ones. In this case, if the null hypothesis is true, $Y \sim \text{binomial}(10, 0.5)$ and the p-value is

$$\text{p-value} = P(Y \ge 7) = \sum_{k=7}^{10} \binom{10}{k} \frac{1}{2^{10}} \approx 0.1719.$$
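The tail sum in Problem 7 can be checked directly. The following is a small sketch using only the Python standard library; the helper name `binomial_upper_tail` is an assumption made for illustration.

```python
from math import comb


def binomial_upper_tail(n, p, y_obs):
    """P(Y >= y_obs) for Y ~ binomial(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(y_obs, n + 1))


# Problem 7: 7 successes in 10 trials under H_o: theta = 1/2.
p_value = binomial_upper_tail(10, 0.5, 7)
print(f"p-value = {p_value:.6f}")  # 176/1024 = 0.171875
```

The same helper gives the exact p-value for any one-sided test of this form, which is useful when $n$ is too small for the Central Limit Theorem approximation used in Problems 2 and 3.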

8. Three independent observations from a Poisson($\lambda$) distribution yield the values $x_1 = 3$, $x_2 = 5$ and $x_3 = 1$. Explain how you would use these data to test the hypothesis $H_o : \lambda = 1$ versus the alternative $H_1 : \lambda > 1$. Come up with an appropriate statistic and rejection criterion and determine the p-value given by the data. What do you conclude?

Solution: Denote the observations by $X_1, X_2, X_3$. Then, $X_1$, $X_2$ and $X_3$ are independent Poisson($\lambda$) random variables. Define the test statistic

$$Y = X_1 + X_2 + X_3.$$

Then, $Y \sim \text{Poisson}(3\lambda)$. The p-value is the probability that the test statistic will take on the observed value, or more extreme ones, under the assumption that $H_o$ is true; that is, $Y \sim \text{Poisson}(3)$. Thus,

$$\text{p-value} = P(Y \ge 9) = 1 - P(Y \le 8) = 1 - \sum_{k=0}^{8} \frac{3^k}{k!} e^{-3} \approx 0.0038.$$

A rejection region is determined by the significance level that we set. For instance, if the significance level is $\alpha$, then we can have the rejection criterion

$$\text{p-value} < \alpha \;\Rightarrow\; \text{Reject } H_o.$$

Thus, in this case, we can reject $H_o$ at the $\alpha = 0.01$ significance level, and conclude that the data support the hypothesis that $\lambda > 1$.
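The p-value in Problem 8 can be reproduced with a few lines of standard-library Python. This is only a sketch; the names `poisson_upper_tail`, `observations` and `lam_o` are assumptions introduced for illustration.

```python
from math import exp, factorial


def poisson_upper_tail(mean, y_obs):
    """P(Y >= y_obs) for Y ~ Poisson(mean), computed as 1 - P(Y <= y_obs - 1)."""
    cdf_below = sum(mean**k / factorial(k) * exp(-mean) for k in range(y_obs))
    return 1.0 - cdf_below


observations = [3, 5, 1]  # x_1, x_2, x_3 from the problem statement
lam_o = 1.0               # value of lambda under H_o
y_obs = sum(observations)  # observed value of the test statistic Y

# Under H_o, Y ~ Poisson(3 * lam_o) = Poisson(3).
p_value = poisson_upper_tail(len(observations) * lam_o, y_obs)
print(f"Y = {y_obs}, p-value = {p_value:.4f}")
```

Comparing `p_value` with a chosen significance level then implements the rejection criterion stated in the solution.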