Tests for Population Proportion(s)


Tests for Population Proportion(s) Esra Akdeniz April 6th, 2016

Motivation
We are interested in estimating the prevalence rate of breast cancer among 50- to 54-year-old women whose mothers have had breast cancer. Suppose that in a random sample of 10,000 such women, 400 are found to have had breast cancer at some point in their lives. The best point estimate of the prevalence rate $p$ is the sample proportion $\hat{p} = 400/10000 = 0.04$.
Based on large studies, assume the prevalence rate of breast cancer for U.S. women in this age group is about 2%. The question is: how compatible is the sample rate of 4% with a population rate of 2%?

Motivation
Another way of asking this question is to restate it in terms of hypothesis testing. Let $p$ be the prevalence rate of breast cancer in 50- to 54-year-old women whose mothers have had breast cancer. We test
$H_0: p = 0.02 = p_0$ vs. $H_1: p \neq 0.02$.
How do we test this hypothesis?

Introduction
So far, we have dealt with hypotheses concerning a population mean. In this lecture, we will learn how to make inferences about population proportions: the proportion of times an event occurs rather than the number of times.
Example: In the mid-nineteenth century, approximately 3500 babies were delivered every year in the Vienna General Hospital. Approximately 500 women developed puerperal fever, an infection developing during childbirth.
Assume $X \sim \mathrm{Bin}(n, p)$. Normal approximation to the binomial distribution: if $np \geq 5$ and $n(1-p) \geq 5$, then under $H_0$, $\hat{p}$ is approximately normally distributed.

Test about a population proportion
Let $p$ denote the proportion of individuals or objects in a population who possess a specified property (labeled S). Let $X$ be the number of S's in a sample of size $n$. Then $\hat{p} = X/n$ is the sample proportion.
$X$ is a binomial random variable with parameters $n$ and $p$, i.e. $X \sim \mathrm{Bin}(n, p)$. Furthermore, when the sample size $n$ is large, both $X$ and $\hat{p}$ are approximately normally distributed, i.e. approximately $X \sim N(np,\ np(1-p))$ and $\hat{p} \sim N(p,\ p(1-p)/n)$.
The test about the population proportion $p$ will depend on the sample size.

Test about a population proportion: Large-Sample Tests
When the sample size is large ($n \geq 30$; $np \geq 5$ and $n(1-p) \geq 5$), $\hat{p}$ is approximately normally distributed with mean $p$ and variance $p(1-p)/n$. In particular, under the null hypothesis $H_0: p = p_0$, $\hat{p}$ is approximately normally distributed with mean $p_0$ and variance $p_0(1-p_0)/n$, i.e. $\hat{p} \sim N(p_0,\ p_0(1-p_0)/n)$. Therefore the test statistic
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$
has approximately a standard normal distribution.

Example
For the breast cancer example we compute the test statistic
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} = \frac{0.04 - 0.02}{\sqrt{0.02(0.98)/10000}} = \frac{0.02}{0.0014} = 14.3$$
The critical value is $z_{1-\alpha/2} = z_{0.975} = 1.96$. Since $14.3 > 1.96$, $H_0$ is rejected using a two-sided test with $\alpha = 0.05$.
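The arithmetic of this example can be checked with a short script. This is only a sketch using the Python standard library; the function name `one_sample_prop_z` is an illustrative choice, not something from the lecture.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_prop_z(x, n, p0):
    """z statistic for H0: p = p0 under the normal approximation."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# Breast cancer example: 400 cases among 10,000 women, p0 = 0.02
z = one_sample_prop_z(x=400, n=10_000, p0=0.02)

# Two-sided critical value z_{1 - alpha/2} for alpha = 0.05
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z) > z_crit
print(f"z = {z:.1f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```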

Figure: Acceptance and rejection regions for the one-sample binomial test, normal-theory method (two-sided alternative)

Example
Suppose that we select a random sample of 30 individuals from the population of adults in Turkey. Assume that the probability that a member of this population currently smokes cigarettes, cigars or pipes is equal to 0.29. Therefore, the total number of smokers in the sample is binomial with $n = 30$ and $p = 0.29$. For this sample, what is the probability that six or fewer of its members smoke? Hint: apply a continuity correction when using the normal approximation.
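One way to explore this question numerically is to compare the exact binomial probability $P(X \leq 6)$ with the normal approximation evaluated at 6.5 (the continuity correction). The sketch below assumes $n = 30$ and $p = 0.29$ as read from the example text; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 30, 0.29  # sample size and smoking probability from the example

# Rule of thumb for the normal approximation: np >= 5 and n(1 - p) >= 5
approx_ok = n * p >= 5 and n * (1 - p) >= 5

# Exact binomial probability P(X <= 6)
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(7))

# Normal approximation with continuity correction: P(X <= 6) ~ Phi((6.5 - np) / sd)
mu = n * p
sigma = sqrt(n * p * (1 - p))
approx = NormalDist(mu, sigma).cdf(6.5)

print(f"rule of thumb satisfied: {approx_ok}")
print(f"exact P(X <= 6)        = {exact:.4f}")
print(f"normal approx (c.c.)   = {approx:.4f}")
```

With the continuity correction, the approximate value lies close to the exact binomial probability for these parameters.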

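Sample proportion and its distribution
The sample proportion is denoted $\hat{p}$, and $\hat{q} = 1 - \hat{p}$. By the CLT, its distribution is approximately normal with mean $p$ and variance $p(1-p)/n$.
Confidence interval for $p$:
$$\left( \hat{p} - z_{1-\alpha/2}\sqrt{\hat{p}\hat{q}/n},\ \ \hat{p} + z_{1-\alpha/2}\sqrt{\hat{p}\hat{q}/n} \right)$$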

Hypothesis Testing
Null hypothesis: $H_0: p = p_0$ or $H_0: p \leq p_0$ or $H_0: p \geq p_0$.
Test statistic:
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$

Example
Consider the distribution of five-year survival for individuals under 40 who have been diagnosed with lung cancer. This distribution has an unknown population mean $p$. In a randomly selected sample of 52 patients, only six survive five years.
Compute the sample proportion.
Find the 95% confidence interval for the population proportion.
Test the hypothesis that the population proportion is equal to 0.082 at significance level 0.05.
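The three tasks in this example can be sketched with the confidence interval and test statistic given above. This is an illustration under the normal approximation, not a worked solution from the lecture. Note that $n p_0 = 52 \times 0.082 \approx 4.26$ falls slightly below the rule-of-thumb value of 5, so the approximation should be read with some caution.

```python
from math import sqrt
from statistics import NormalDist

x, n = 6, 52          # survivors and sample size from the example
p0, alpha = 0.082, 0.05

# 1. Sample proportion
p_hat = x / n

# 2. 95% confidence interval, using the estimated standard error sqrt(p_hat * q_hat / n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
se_hat = sqrt(p_hat * (1 - p_hat) / n)
ci_low, ci_high = p_hat - z_crit * se_hat, p_hat + z_crit * se_hat

# 3. Two-sided test of H0: p = p0, using the null standard error sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
reject_h0 = abs(z) > z_crit

# Rule-of-thumb check for the normal approximation under H0
approx_ok = n * p0 >= 5 and n * (1 - p0) >= 5

print(f"p_hat = {p_hat:.4f}")
print(f"95% CI = ({ci_low:.4f}, {ci_high:.4f})")
print(f"z = {z:.3f}, reject H0: {reject_h0}, rule of thumb satisfied: {approx_ok}")
```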

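Comparison of two Population Proportions
Assume $X \sim \mathrm{Bin}(m, p_1)$ and $Y \sim \mathrm{Bin}(n, p_2)$, and that they are independent.
Normal approximation under the conditions: $m\hat{p}_1 \geq 5$, $m(1-\hat{p}_1) \geq 5$, $n\hat{p}_2 \geq 5$, $n(1-\hat{p}_2) \geq 5$.
Confidence interval:
$$\hat{p}_1 - \hat{p}_2 \pm z_{1-\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{m} + \frac{\hat{p}_2(1-\hat{p}_2)}{n}}$$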

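Hypothesis Testing
Null hypothesis: $H_0: p_1 - p_2 = 0$
Test statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2 - 0}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{m} + \dfrac{\hat{p}(1-\hat{p})}{n}}} \sim N(0, 1)$$
Under $H_0$ the proportions are assumed equal, therefore a common value of $p$ is estimated by
$$\hat{p} = \frac{x_1 + x_2}{m + n}$$
where $x_1$ and $x_2$ are the observed numbers of successes in the two samples.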

Example
In a study investigating morbidity and mortality among pediatric victims of motor vehicle accidents, information regarding the effectiveness of seat belts was collected over an 18-month period. Two random samples were selected, one from the population of children who were wearing a seat belt at the time of the accident, and the other from the population who were not. In the sample of 123 children who were wearing a seat belt at the time of the accident, 3 died. In the sample of 290 children who were not wearing a seat belt, 13 died. We want to test whether the population proportions of these two populations are equal at significance level 0.05.
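A minimal sketch of the pooled two-sample test and the unpooled confidence interval for this example follows. The function name `two_prop_z` is hypothetical. Note that $m\hat{p}_1 = 3 < 5$, so the rule-of-thumb conditions for the normal approximation are not all met and the result is only indicative.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z(x1, m, x2, n):
    """Pooled z statistic for H0: p1 - p2 = 0."""
    p1_hat, p2_hat = x1 / m, x2 / n
    p_pool = (x1 + x2) / (m + n)  # common estimate of p under H0
    se = sqrt(p_pool * (1 - p_pool) / m + p_pool * (1 - p_pool) / n)
    return (p1_hat - p2_hat) / se


x1, m = 3, 123    # deaths / sample size, seat belt worn
x2, n = 13, 290   # deaths / sample size, no seat belt
alpha = 0.05

z = two_prop_z(x1, m, x2, n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

# 95% confidence interval for p1 - p2 (unpooled standard error)
p1_hat, p2_hat = x1 / m, x2 / n
se_diff = sqrt(p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n)
diff = p1_hat - p2_hat
ci_low, ci_high = diff - z_crit * se_diff, diff + z_crit * se_diff

# Rule-of-thumb conditions: every expected count should be at least 5
counts = [x1, m - x1, x2, n - x2]
approx_ok = all(c >= 5 for c in counts)

print(f"z = {z:.3f}, reject H0: {reject_h0}")
print(f"95% CI for p1 - p2 = ({ci_low:.4f}, {ci_high:.4f})")
print(f"rule of thumb satisfied: {approx_ok}")
```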