Estimating Proportions


3/1/2018
SOC364 Statistics w/ Dr. Ellis Godard
SOC364 @ CSUN - Ellis Godard
Estimating Proportions

Outline for Today
Reminders about Missing Values
Interpreting Confidence Intervals
Confidence About Proportions
Proportions as Interval Variables
Confidence Intervals
Confidence Coefficients
Examples
Lab Exercise (2 parts)
Both involve a lil math, A!

Last Reminders about Missing Data
How do I recode to take care of missing values?
Do NOT use TRANSFORM - RECODE or the MISSINGS column in VARIABLE VIEW
VERY unusual: need a good reason, probably NOT what you should do
A gave a list of 7 solutions for missings, twice; recoding was NOT on it!
Cases w/ missing values don't require action
They're already identified as missing; that's how they're counted
Invalid values counted as valid - that's trouble!
In the frequency distribution, are invalid values (such as "Don't Know", "Not apply", and "No answer") in the "Valid" group of values, before the first Total, or is there only one Total?
If there are no problematic values in the "Valid" group and all of the cases identified as missing are in the "Missing" group, then everything's already been done for you.
In Variable View, are there invalid values in the Values column whose numbers aren't listed in the Missings column?

Inline for Today
Breathe: Back-patting. Questions?
Descriptions vary by level of measurement
So do inferences (estimates about populations)
Nothing in the last lecture was new
3 ideas today are new (marked w/ lightning bolts)

Prose for Confidence Intervals
Not "5.74 and 6.9": the interval between matters, not just the bounds
Not just "5.74 to 6.9": what is that interval?
Not just "the confidence interval is 5.74 to 6.9"
Put it in terms that anyone can understand:
We are 95% confident that the average number of music types (of the 1 given) liked by Americans falls somewhere between 5.74 and 6.9.
In general: We are 95% confident that the population parameter falls within this range.

Conf. Int. for Proportions?
Remember: mean and standard deviation can only be computed for an interval variable (e.g., verbal SAT scores or family income).
Conf. Intervals for interval variables use both: $\bar{Y} \pm z \cdot \hat{\sigma}_{\bar{Y}} = \bar{Y} \pm z \cdot \frac{s}{\sqrt{n}}$
What do we do if our variables are either ordinal or nominal (a.k.a. categorical)?
Central Tendency: mode or median, no mean!
Dispersion: range & variation ratio, no std dev!

Nominal Proportion
Consider a hypothetical relative frequency distribution for the nominal variable Political Party for a sample of 500 voters:

Value         f     r.f.
Democrat      265   0.53
Republican    220   0.44
Other         15    0.03
Total         500   1.00

Based on this information, we could say that the proportion of Democrats (versus Republicans and other) is 0.53, or 53%.
Be sure to use valid percent!

Only Intervals have Means
Interval data has means
Arithmetic average
Z scores count # of std deviations
Confidence intervals are ranges around the mean, with width based on z
Nominals and Ordinals don't have means
Can't arithmetically average
Can't count # of std deviations (it measures distance from the mean and is calculated based on the mean)
Can't calculate CIs around a mean using standard deviations
But Nominals & Ordinals do have proportions
The % that is any value: Yes, Male, Black, Very Satisfied

Ordinal Proportion
Consider a hypothetical frequency distribution for strength of party loyalty ("Do you support your party's position...") among a sample of 1,000 registered Democrats:

Value               f       r.f.
Most of the Time    630     0.63
Some of the Time    300     0.30
Rarely              50      0.05
Never               20      0.02
Total               1,000   1.00

Based on this information, we could say that the proportion of Disloyal Democrats (who either rarely or never support their party's position) is 0.07, or 7%.

Equation for a Proportion
A proportion is a special case of a mean.
Define $Y_i = 1$ if the ith observation is in the category of interest (e.g. a Democrat in the first example or a disloyal Democrat in the 2nd) and define $Y_i = 0$ if the ith observation is not in the category. Then, $\hat{p} = \frac{1}{n} \sum_{i=1}^{n} Y_i$.
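
To see why a proportion is just a mean of 0/1 values, here is a minimal Python sketch. It rebuilds the hypothetical Political Party sample from its counts (265 Democrats, 220 Republicans, 15 Other); the names `sample` and `proportion` are illustrative only and are not part of the lecture or of SPSS.

```python
# Hypothetical sample rebuilt from the Political Party frequency table (n = 500).
sample = ["Democrat"] * 265 + ["Republican"] * 220 + ["Other"] * 15


def proportion(values, category):
    """Sample proportion computed as the mean of 0/1 indicators Y_i."""
    # Y_i = 1 if the ith observation is in the category of interest, else 0.
    indicators = [1 if value == category else 0 for value in values]
    return sum(indicators) / len(indicators)


p_democrat = proportion(sample, "Democrat")
print(f"Proportion of Democrats: {p_democrat:.2f}")
```

Because the indicators are ordinary numbers, everything that works for a mean (averaging, standard deviations, confidence intervals) carries over to the proportion.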

Proportions are Interval
They're about a value of a nominal or ordinal variable
But the intervals between them are equal and consistent
They range from 0% (0.0) to 100% (1.0)
Remember: percents represent hidden decimals: 50% = 50 per 100 = 50/100 = 0.50
We can add them and average them
A set of proportions has a mean and standard deviation

Population vs. Sample Proportion
Let $\pi$ (pi) equal the proportion of the defined population classified in some specific category.
The best (least biased and most efficient) estimate of the population proportion is the sample proportion $\hat{p}$ (or $p$, as in the text).

Sampling Variance for pi
For nominal or ordinal variables, we use proportions rather than means, and the formula for variance of proportions: $\sigma^2_{\hat{p}} = \frac{\pi(1-\pi)}{n}$
For example, the variance for the percent that's female, in a class of 10 students that is 50% female, would be $\frac{0.50(1-0.50)}{10} = \frac{0.50(0.50)}{10} = \frac{0.25}{10} = 0.025$
But Professor Godard, you said not to use the variance!
Except as a step to a standard deviation

Means of Sample Proportions
Proportions vary intervally among samples
Each sample has a proportion female, atheist, tall, etc.: the sample proportion
Can arithmetically average those proportions
The collection of all of them is a sampling distribution
There's a hypothetical std. deviation to all of them
It's a standard error of the sampling distribution
With mean & std dev, can calculate Zs & CIs of the population mean proportion
So, almost nothing new today

Standard Error for a Proportion
For nominal or ordinal variables, the formula for variance of proportions: $\sigma^2_{\hat{p}} = \frac{\pi(1-\pi)}{n}$
The variance = the standard deviation squared, so the square root of each side gives the standard deviation of proportions: $\sigma_{\hat{p}} = \sqrt{\frac{\pi(1-\pi)}{n}}$
That standard deviation is of a sampling distribution, so it is a standard error
We estimate that standard error by using the sample proportion as an estimate of the population proportion: $\hat{\sigma}_{\hat{p}} = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Sampling Distribution of a Proportion
If we calculated the sample proportion for all possible samples of size n, the distribution of these sample proportions would be approximately normally distributed around the population proportion.

(Standard Error) Largest Possible
When p & q are both 50%: .5 x .5 = .25
But 90% female -> 0.9(1-0.9) = 0.9(0.1) = 0.09
Use that (.5) when you don't know p & 1-p (aka q)
It's more conservative than any other guess
Larger std error makes a wider conf. interval, which gives looser claims about reality
Review examples on p.137 in the text
More in next lecture
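
The variance and standard error formulas for a proportion can be sketched in a few lines of Python. This is an illustrative sketch, not course code: the function names are made up here, and the inputs are the class of 10 students that is 50% female and the 90% female comparison from the slides.

```python
import math


def proportion_variance(p, n):
    """Sampling variance of a proportion: p(1 - p) / n."""
    return p * (1 - p) / n


def proportion_standard_error(p, n):
    """Standard error of a proportion: the square root of the variance."""
    return math.sqrt(proportion_variance(p, n))


# The class of 10 students that is 50% female.
print("variance:", proportion_variance(0.50, 10))
print("standard error:", proportion_standard_error(0.50, 10))

# p(1 - p) peaks at p = 0.5, which is why 0.5 is the conservative guess
# when the true proportion is unknown.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p:.1f}  p(1-p) = {p * (1 - p):.2f}")
```

Trying other values of p shows that nothing beats 0.5 for producing the largest standard error, so an interval built with p = 0.5 is never too narrow.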

Conf. Interval for a Proportion
C.I. for a Mean: $\bar{Y} \pm z \cdot \hat{\sigma}_{\bar{Y}}$ or $\bar{Y} \pm z \cdot \frac{s}{\sqrt{n}}$
C.I. for a Proportion: $\hat{p} \pm z \cdot \hat{\sigma}_{\hat{p}}$ or $\hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Serious Errors to Avoid
What you're predicting:
Don't confuse mean and proportion
Don't confuse sample mean & population mean
Calculating the interval:
Mean: Use the standard error (measuring dispersion in the sampling distribution), not the sample standard deviation (measuring the dispersion in the sample)
Proportion: Use the standard error for a proportion (which is not the std. dev. divided by the sqrt of n)
Where the ps & Conf. Coefficients are:
Don't confuse the area in the tails with the area between the tails of a sampling distribution

Remember this? Same idea
[Figure: sampling distribution with central area = 0.95 between -1.96 and +1.96 standard errors. Sample A: population mean outside the C.I. Sample B: population mean inside the C.I.]

What proportion of CSUN students have consumed marijuana?
Point estimate: 14 from sample of 26 = 53.8% or .538
A 95% confidence interval would be:
$\hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = .538 \pm 1.96 \sqrt{\frac{.538(1-.538)}{26}} = .538 \pm 1.96 \sqrt{\frac{.2486}{26}} = .538 \pm 1.96 \sqrt{.00956} = .538 \pm 1.96(.0977) = .538 \pm .1916$, i.e. from .3463 to .7296
Assuming our sample was random & unbiased, we can be 95% confident that between 34.63% & 72.96% of CSUN students have smoked pot

Prose for Proportions
If we drew repeated samples of the same size from a population and calculated the 95 percent confidence intervals for each sample's proportion, then we would expect that the population proportion would fall in the confidence interval 95 percent of the time.

AP Style Guide on Polls
Do not exaggerate poll results. In particular, with pre-election polls, these are the rules for deciding when to write that the poll finds one candidate is leading another:
-- If the difference between the candidates is more than twice the sampling error margin*, then the poll says one candidate is leading.
-- If the difference is less than the sampling error margin, the poll says that the race is close, that the candidates are "about even." (Do not use the term "statistical dead heat," which is inaccurate if there is any difference between the candidates; if the poll finds the candidates are tied, say they're tied.)
-- If the difference is at least equal to the sampling error but no more than twice the sampling error, then one candidate can be said to be "apparently leading" or "slightly ahead" in the race.
* They mean standard error (oops)
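
The three AP rules above amount to a small decision procedure. The Python sketch below is one hedged reading of them: `describe_lead` is a hypothetical helper, the inputs are percentage points, and the example polls in the usage lines are invented for illustration rather than taken from any real survey.

```python
def describe_lead(pct_a, pct_b, margin):
    """Apply the AP poll-reporting rules to two candidates' percentages.

    pct_a, pct_b: poll results in percentage points.
    margin: the poll's sampling error margin, in percentage points.
    """
    difference = abs(pct_a - pct_b)
    if difference == 0:
        return "tied"
    if difference > 2 * margin:
        return "leading"
    if difference >= margin:
        # At least the margin, but no more than twice the margin.
        return "apparently leading"
    return "about even"


# Invented polls, each with a 3-point margin of error.
for a, b in [(48, 40), (45, 41), (44, 42), (43, 43)]:
    print(a, b, describe_lead(a, b, 3))
```

The order of the checks matters: the tie is handled first so that a zero difference is reported as "tied" rather than "about even".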

Proportion Example from SPSS
Last lab was C.I. for the interval index MUSIC
Could also do nominal, e.g. proportion who like folk
Point estimate (from frequency table): 47.8%
Interval estimate:
$\hat{p} \pm z \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = .478 \pm 1.96 \sqrt{\frac{.478(.522)}{1429}} = .478 \pm .026$, i.e. from .452 to .504
Interpretation: We can be 95% confident that, in the population as a whole, between 45.2 and 50.4 percent like folk.
Note that this includes the possibility that more than half like folk
But we can't be confident that a majority do. It's possible that less than a majority do: too close to call.

Going Further w/ C.I.s
Compare them, e.g. if we estimate that:
10-20% like hot dogs
15-30% like burgers
40-50% like pizza
[Figure: the three intervals (hotdogs, burgers, pizza) drawn on a number line running from 10 to 50 in steps of 5]
If any overlap, can't conclude there's any difference
Both parameters could be the same value (e.g. 19% like hotdogs & burgers)
Could be the reverse of what it appears (19% hotdogs, 16% burgers)
Rest of the CI doesn't matter (e.g. burger mostly above hotdog? irrelevant)
But if the intervals don't overlap, can conclude parameters differ
We're 95% confident that pizza is more popular than either one

Lab, Part I (not in SPSS)
If the sample proportion is 50% and the standard error is 10%, construct and interpret
1. a 90% confidence interval
2. a 95% confidence interval
3. a 95.44% confidence interval

Lab, Part II (partly in SPSS)
Get %s from freq. distributions in SPSS
Don't get S.E. or C.I. from SPSS! Assumes interval!
4. Calculate a 95% confidence interval for the proportion who like Rap
5. Calculate a 95% confidence interval for the proportion who like Opera
6. Calculate a 95% confidence interval for the proportion who like Bluegrass
7. Compare those intervals: draw a picture, and make conclusions based on how the intervals compare to each other!

Lab Hints
There are two parts, that count as one lab.
Remember, percentages are interval
They include two hidden decimals: 50% = 50 per 100 = 50/100 = 0.50
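
For checking hand calculations like the ones in the lab, the interval formula and the overlap comparison can be sketched in Python. This is an assumption-laden illustration, not the lab's required method: `proportion_ci` and `intervals_overlap` are hypothetical helper names, z = 1.96 is the 95% value used in the slides, and the inputs are the marijuana example (p = .538, n = 26) and the hot dog, burger, and pizza intervals from the lecture.

```python
import math


def proportion_ci(p, n, z=1.96):
    """Confidence interval for a proportion: p +/- z * sqrt(p(1 - p) / n)."""
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = z * standard_error
    return (p - margin, p + margin)


def intervals_overlap(a, b):
    """True if intervals a and b, given as (low, high), share any value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Marijuana example from the lecture: 14 of 26 students, rounded to .538.
low, high = proportion_ci(0.538, 26)
print(f"95% C.I.: {low:.4f} to {high:.4f}")

# Interval estimates from the "Going Further" slide, as proportions.
hotdogs, burgers, pizza = (0.10, 0.20), (0.15, 0.30), (0.40, 0.50)
print("hotdogs vs burgers overlap:", intervals_overlap(hotdogs, burgers))
print("burgers vs pizza overlap:", intervals_overlap(burgers, pizza))
```

Overlapping intervals (hot dogs and burgers) support no conclusion about a difference, while non-overlapping ones (pizza versus either) do.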