Point Estimation: $\mu \approx \bar{x}$ and $\pi \approx p$, i.e. the population mean is approximately equal to the mean of a sample and, more generally, the population proportion is approximately equal to a sample proportion


Point Estimation
Point estimation is the rather simplistic (and obvious) process of using the known value of a sample statistic as an approximation to the unknown value of a population parameter. So we could, for example, use the sample mean commuting time of 31.1 minutes from the sample of 36 students (week 1) as an approximation of the mean commuting time of all students. Similarly, we could use the fact that 5.6% of the students in the sample of 36 commuted to campus by bike as an approximation to the population proportion of all students who commute to uni by bike.
Expressing these two specific approximations (or estimations) symbolically we could write:
$\mu \approx 31.1$ (mins) and $\pi \approx 0.056$
And, more generally,
$\mu \approx \bar{x}$ and $\pi \approx p$
i.e. the population mean is approximately equal to the mean of a sample, and the population proportion is approximately equal to a sample proportion.

Point Estimation (cont.)
In point estimation we are simply suggesting that the population parameter (µ or π, for now) is located at a point somewhere near to the point where we know $\bar{x}$ or $p$ is located.
Schematically: µ lies somewhere around the point $\bar{x}$, and π lies somewhere around the point $p$.
Unfortunately, when using this rather elementary estimation technique we have no way of knowing how close the known sample statistic is to the unknown population parameter, i.e. the approximation might be good or it might be not so good. This concern about the uncertain accuracy or precision of the point estimate gives rise to issues of confidence (or lack of it) in its use and leads ultimately to the employment of a better method.

Interval Estimation
Interval estimation is an estimation technique which specifically addresses the precision and confidence issues inherent in the point estimation process. It involves the construction of an interval, centred around an appropriate point estimate (sample statistic), that we are able to declare, with a prescribed level of confidence, contains the associated population parameter.
Schematically: µ lies somewhere in the interval from $\bar{x} - e$ to $\bar{x} + e$ with a prescribed level of confidence (C%), and π lies somewhere in the interval from $p - e$ to $p + e$ with a prescribed level of confidence (C%).
After constructing the interval estimate we are able to say (in general terms) that with C% confidence the population mean (or proportion) lies between the two values $\bar{x} - e$ and $\bar{x} + e$ (or $p - e$ and $p + e$).

Interval Estimation (cont.)
Symbolically we write that:
With C% confidence, $\bar{x} - e \le \mu \le \bar{x} + e$ (confidence interval estimate of a population mean)
or,
With C% confidence, $p - e \le \pi \le p + e$ (confidence interval estimate of a population proportion)
The precision (or accuracy) of the interval estimate is provided by the width of the interval. A wide interval is not a very precise estimate. A narrow interval is more precise. Note that the width of the interval is $2e$. The value $e$ (not to be confused with Euler's number) is sometimes referred to as the error bound.
The question that now has to be resolved is just how do we construct a (confidence) interval estimate of a population mean or a population proportion? The answer is via a formula (in fact one of three, depending on the situation).

Interval Estimation of the Population Mean (µ)
There are two situations to be considered here which give rise to two similar, but slightly different, formulae. One of these situations, however, is encountered in practice much more frequently than the other and is the formula that we will eventually concentrate on.
Interval Estimation of µ (σ known)
The first scenario concerns itself with the situation where the standard deviation of the population that we are trying to estimate the mean of is known. This is not a very likely situation because, if we do not know the value of µ, it is highly unlikely that we will know the value of σ. We will proceed with this scenario though because, although unrealistic, it will simplify our introduction to the confidence interval estimation of µ formulae.
Mathematically it can be shown that, with C% confidence, µ will lie within the interval
$\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$
where $z$ identifies the position of the upper boundary of the middle C% area under the $\bar{x}$ graph (the sampling distribution of sample means graph).

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ known) (cont.)
Note: The formula $\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$ is actually two formulae, with the + and − giving the upper confidence limit (UCL) and the lower confidence limit (LCL), respectively, of the C% confidence interval estimate of the population mean.
The formula could be stated alternatively as:
With C% confidence, $\bar{x} - z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z \frac{\sigma}{\sqrt{n}}$
which is of the general form $\bar{x} - e \le \mu \le \bar{x} + e$ discussed earlier (slide 6).
Of the four symbols referred to in the formula, three are quite straightforward and familiar to us. Remember that we are using the sample mean ($\bar{x}$) as the natural starting point, $n$ is the size of the sample used to obtain the sample mean and σ is the population standard deviation (assumed known in this simplified scenario).

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ known) (cont.)
Possibly the only questionable entry in the formula $\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$ is that of $z$, which we have defined as marking the upper boundary of the C% middle area under the $\bar{x}$ graph. The x-bar graph is referred to because it, in fact, provides the mathematical origin of the formula (beyond the scope of this unit). Note that the factor $\sigma/\sqrt{n}$ is also the standard deviation of the x-bar graph (see the sampling distribution of sample means theory from last week).
Schematically, then, we have:
[Figure: the sampling distribution of $\bar{X}$ (samples of size $n$), centred on µ, with the middle C% area marked and its upper boundary labelled Z = ?]

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ known) (cont.)
Example 1: If a sample of 36 students attending campus reveals a sample mean commuting time of 31.1 minutes, and if the population standard deviation of commuting time was known to be 15 minutes, determine a point estimate and a 90% confidence interval estimate of the population mean commuting time.
From the problem statement: we require µ, and are provided with $n = 36$, $\bar{x} = 31.1$ (mins), $\sigma = 15$ (mins), C% = 90% = 0.9000
Point estimate: $\mu \approx \bar{x} = 31.1$ (mins)
Recall the simplistic nature of this form of parameter estimation. How close is this value to the actual value of µ? We have no way of knowing! How confident can we be then in its accuracy?

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ known) Example 1 (cont.):
We require µ, and are provided with $n = 36$, $\bar{x} = 31.1$ (mins), $\sigma = 15$ (mins), C% = 90% = 0.9000
Interval estimate: since we require µ, with σ known, we use
$\bar{x} \pm z \frac{\sigma}{\sqrt{n}} = 31.1 \pm 1.645 \times \frac{15}{\sqrt{36}} = 31.1 \pm 4.1$
Here Z = 1.645, found using standard normal tables in reverse: C% = 90% leaves 5% in each tail, so the area to the left of Z is 0.9500 (accept 1.64 or 1.65).
[Figure: the sampling distribution of $\bar{X}$ ($n = 36$), centred on µ, with the middle C% = 90% area, 5% in each tail and the upper boundary at Z = 1.645]
Note: Rounding to 1 decimal place is consistent with the x-bar value.
So, with 90% confidence the population mean commuting time lies between 27.0 minutes and 35.2 minutes, or symbolically:
With 90% confidence, $27.0 \le \mu \le 35.2$ (mins)
Note that the precision (or accuracy) of the estimate is given by the width of the interval (= 8.2 minutes).
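The arithmetic of Example 1 can be checked with a short Python sketch. The function name `z_interval` is an illustrative choice, not part of the unit materials, and the critical value is computed with the standard library's `NormalDist` rather than read from tables, so it uses z = 1.6449 rather than the rounded 1.645.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Return (lower, upper, e) for a population mean when sigma is known."""
    # z marks the upper boundary of the middle `confidence` area, so the
    # area to its left is confidence plus one tail: (1 + confidence) / 2.
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    e = z * sigma / sqrt(n)  # error bound
    return x_bar - e, x_bar + e, e


# Values from Example 1: n = 36, x-bar = 31.1 mins, sigma = 15 mins, C = 90%
lower, upper, e = z_interval(x_bar=31.1, sigma=15, n=36, confidence=0.90)
print(f"e = {e:.1f}, interval = ({lower:.1f}, {upper:.1f})")
# prints: e = 4.1, interval = (27.0, 35.2)
```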

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ known) Example 1 (cont.):
Before concluding with this first example, a few additional points about the interval estimate result that we have just obtained.
With 90% confidence, $27.0 \le \mu \le 35.2$ (mins)
Make sure that you appreciate the advantage of the interval estimate over the rather simplistic point estimate alternative. In the latter all we were able to conclude was that the population mean is about 31.1 minutes. With the interval estimate we are able to declare, with some confidence (90%), that the population mean will lie between 27.0 minutes and 35.2 minutes. Of course we still don't know exactly what the population mean is (we never will without surveying the population) but at least we have some reasonably confident idea of the limits between which it will lie.

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ known) Example 1 (cont.):
With 90% confidence, $27.0 \le \mu \le 35.2$ (mins)
The accuracy/precision of the estimate, 8.2 minutes, is the width of the interval. If the interval is wide then the interval estimate is not very precise. If the interval is narrow the interval estimate is precise. Clearly the ideal interval estimate is one in which the interval is small and in which the level of confidence is high (perhaps, say, 95% rather than 90%). We will return to this idea a little later on when we explore the relationship between the level of confidence, precision and the size of the sample.
Note also that we cannot say that the probability that the population mean lies between 27.0 and 35.2 minutes is 0.90. This is quite a difficult concept to grasp, but consider this: the population mean is a fixed value, it does not vary. We do not know what it is but it is constant. So the unknown population mean will either be in the interval or it will not. There is no probability associated with this fact because the population mean is not a variable.

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ known) Example 1 (cont.):
With 90% confidence, $27.0 \le \mu \le 35.2$ (mins)
So, what does the 90% represent? Remember that the above confidence interval has been obtained by applying the formula to information obtained from just one sample of size 36 (one with a sample mean of 31.1 minutes). Realise also that we could have selected any one of a large number of samples rather than the one that we actually did. And, for each one of those samples, we could have constructed a 90% confidence interval using the same formula.
What the 90% represents is the fact that if we selected every possible sample of size 36 from the campus student population (this would be a huge number of samples) and determined the 90% interval estimate for each, 90% of those interval estimates would actually contain the population mean. The 90% is the probability that the one sample selected produces an interval estimate that actually contains the unknown population mean.
Practice Problems: Week 7, Q7.1 and 7.2
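The "90% of all possible intervals" reading can be illustrated by simulation. The sketch below makes assumptions that are not in the slides: it invents a normally distributed population with a hypothetical true mean of 30 minutes (σ = 15), and it draws 5000 random samples instead of enumerating every possible sample. The function name `coverage` is likewise illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated intervals x-bar +/- e that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    e = z * sigma / sqrt(n)  # same error bound for every sample (sigma known)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from the assumed (hypothetical) population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - e <= mu <= x_bar + e:
            hits += 1
    return hits / trials


# mu = 30.0 is a made-up "true" mean used only for this illustration.
rate = coverage(mu=30.0, sigma=15.0, n=36, confidence=0.90, trials=5000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should come out close to 0.90. Each individual interval either contains µ or it does not; the 90% describes the procedure across repeated samples.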

Interval Estimation of the Population Mean (µ) (cont.)
Interval Estimation of µ (σ unknown, n ≤ 121)
The more realistic scenario with regard to constructing an interval estimate of an unknown population mean is when the population standard deviation is also unknown. The formula is essentially the same as before, i.e. with C% confidence, µ lies within the limits
$\bar{x} \pm z \frac{\sigma}{\sqrt{n}}$
with the problem of not knowing σ immediately resolved by simply approximating it by $s$, the standard deviation of the sample from which the required sample mean is obtained. This should not be a totally surprising move because, since the very introduction of the sample standard deviation (in week 2), we have been strongly suggesting that it can be used as a reliable approximation to an unknown population standard deviation (recall that this is why we divide by $n - 1$ in the formula for sample variance; see Week 2, slide 23).

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ unknown, n ≤ 121) (cont.)
So, if σ is unknown, the formula for constructing a C% confidence interval estimate of a population mean becomes
$\bar{x} \pm z \frac{s}{\sqrt{n}}$, since $s \approx \sigma$
So really there is no great issue so far with not knowing the population standard deviation. Unfortunately there is a minor complication associated with approximating σ with $s$: when the sample size is small (n ≤ 121), the standard normal tables used to identify the upper boundary of the middle C% area under the x-bar graph (the $z$ in the formula) become a little inaccurate. To overcome this inaccuracy a different set of tables is used for this purpose. These are rather curiously known as Student's t tables and, although designed and read in a much different way to our familiar standard normal tables, they provide information about positions under the normal curve (in standard deviations to the right or the left of the mean) as do normal tables read in reverse. Try a Google search if interested in the origin of t tables.

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ unknown, n ≤ 121) (cont.)
Student's t tables
The following extract is of the Student's t tables provided in the class Tables and Formulae booklet.
Degrees of    Critical values of t for upper-tail areas
Freedom       0.25     0.1      0.05     0.025    0.01     0.005
5             0.7267   1.4759   2.0150   2.5706   3.3649   4.0321
6             0.7176   1.4398   1.9432   2.4469   3.1427   3.7074
...
34            0.6818   1.3070   1.6909   2.0322   2.4411   2.7284
35            0.6816   1.3062   1.6896   2.0301   2.4377   2.7238
36            0.6814   1.3055   1.6883   2.0281   2.4345   2.7195
...
49            0.6795   1.2991   1.6766   2.0096   2.4049   2.6800
50            0.6794   1.2987   1.6759   2.0086   2.4033   2.6778
...
120           0.6765   1.2866   1.6577   1.9799   2.3578   2.6174
∞             0.6745   1.2816   1.6449   1.9600   2.3263   2.5758
[Figure: a bell-shaped curve centred on 0 with the upper-tail area to the right of t shaded; the six column headings are the six possible upper-tail areas]
Note that the entries are not probabilities (how can you tell?). The entries identify positions under a normal curve in terms of standard deviations to the left or right of the mean corresponding to a particular upper-tail area.
t tables are used in a variety of different statistical applications. In each application the appropriate row to be used is determined by the degrees of freedom (dof) formula for that particular application. For interval estimation of a population mean the dof formula is $n - 1$.
Note that the very last row of the t tables (labelled ∞) contains Z values, and observe that as the sample size increases, t values get closer and closer to Z values.
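As a rough illustration of how the table is read, the sketch below stores a few rows of the extract in a dictionary (the names `T_TABLE`, `UPPER_TAIL_AREAS` and `t_critical` are illustrative, not from the booklet) and looks up a critical value using dof = n − 1. It also recomputes the last row from the standard normal distribution, which is one way to confirm that the ∞ row holds Z values.

```python
from statistics import NormalDist

UPPER_TAIL_AREAS = (0.25, 0.10, 0.05, 0.025, 0.01, 0.005)

# A few rows copied from the table extract, keyed by degrees of freedom.
T_TABLE = {
    5: (0.7267, 1.4759, 2.0150, 2.5706, 3.3649, 4.0321),
    35: (0.6816, 1.3062, 1.6896, 2.0301, 2.4377, 2.7238),
    49: (0.6795, 1.2991, 1.6766, 2.0096, 2.4049, 2.6800),
    120: (0.6765, 1.2866, 1.6577, 1.9799, 2.3578, 2.6174),
}


def t_critical(n, upper_tail_area):
    """Look up t for a sample of size n; the row used is dof = n - 1."""
    row = T_TABLE[n - 1]
    return row[UPPER_TAIL_AREAS.index(upper_tail_area)]


# Z values for the same upper-tail areas, rounded like the table entries.
z_row = [round(NormalDist().inv_cdf(1 - area), 4) for area in UPPER_TAIL_AREAS]

print("z row:", z_row)
print("t for n = 36, upper tail 0.05:", t_critical(36, 0.05))
```

For a 90% interval each tail holds 5%, so the 0.05 column is the one to use; with n = 36 the lookup returns 1.6896, slightly larger than the corresponding Z value of 1.6449.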

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ unknown, n ≤ 121) (cont.)
So, if σ is unknown and $n$ is small (n ≤ 121), with C% confidence, µ will lie within the interval
$\bar{x} \pm t_{n-1} \frac{s}{\sqrt{n}}$
where $t_{n-1}$ identifies the position of the upper boundary of the middle C% area under the x-bar graph.
Why? Remember: t tables (with $n - 1$ degrees of freedom in this application) are used instead of Z tables to fix the position of the upper boundary of the middle C% area under the x-bar graph when $s$ is used to approximate σ and $n$ is small (because Z tables prove to be inaccurate under such circumstances).
Example 2: Suppose we have the situation from Example 1 (slide 10) but this time the population standard deviation is unknown. The sample of size 36 which provides the x-bar value of 31.1 minutes can readily provide a sample standard deviation value (18.5 minutes, see slide 22 from week 2) as an approximation for σ. Once again let's construct a 90% confidence interval estimate of the population mean commuting time.

Interval Estimation of the Population Mean (µ)
Interval Estimation of µ (σ unknown, n ≤ 121) Example 2 (cont.):
We require µ, and have $n = 36$, $\bar{x} = 31.1$ (mins), $s = 18.5$ (mins), C% = 90% = 0.9000
Interval estimate: since we require µ, with σ unknown and n ≤ 121, we use
$\bar{x} \pm t_{n-1} \frac{s}{\sqrt{n}} = 31.1 \pm t_{35} \times \frac{18.5}{\sqrt{36}} = 31.1 \pm 1.6896 \times \frac{18.5}{\sqrt{36}} = 31.1 \pm 5.2$
Here $t_{35} = 1.6896$ (Note: t is used because Z tables are inaccurate under these circumstances).
[Figure: the sampling distribution of $\bar{X}$ ($n = 36$), centred on µ, with the middle C% = 90% area, 5% in each tail and the upper boundary at $t_{35} = 1.6896$]
Again, rounding to 1 decimal place is consistent with the x-bar value.
So, with 90% confidence the population mean commuting time lies between 25.9 minutes and 36.3 minutes, or symbolically:
With 90% confidence, $25.9 \le \mu \le 36.3$ (mins)
Any concerns about the precision of this estimate? How does it compare to that in Example 1? What has caused the change (two reasons)?
Practice Problems: Week 7, Q7.3 and 7.4
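One way to explore the closing questions is to change one ingredient of the error bound at a time. In the sketch below the helper `error_bound` and the intermediate case (s = 18.5 combined with z = 1.645) are illustrative additions, not part of the slides. The critical values 1.645 and 1.6896 are the table values used in Examples 1 and 2.

```python
from math import sqrt


def error_bound(critical_value, spread, n):
    """Error bound e = critical value * (standard deviation / sqrt(n))."""
    return critical_value * spread / sqrt(n)


n, x_bar = 36, 31.1

e_example1 = error_bound(1.645, 15.0, n)   # Example 1: sigma known, z
e_s_only = error_bound(1.645, 18.5, n)     # hypothetical: s replaces sigma, z kept
e_example2 = error_bound(1.6896, 18.5, n)  # Example 2: s with t_35

lower, upper = x_bar - e_example2, x_bar + e_example2
print(f"Example 1 (sigma, z): e = {e_example1:.2f}")
print(f"s in place of sigma:  e = {e_s_only:.2f}")
print(f"Example 2 (s, t_35):  e = {e_example2:.2f}")
print(f"Example 2 interval = ({lower:.1f}, {upper:.1f})")
```

Under these numbers the error bound grows at each step: first because the sample standard deviation (18.5) is larger than the σ assumed in Example 1 (15), and then because $t_{35}$ is larger than the corresponding z.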

Estimation of the Population Proportion (π) (cont.)
Point Estimation of π
As mentioned earlier, a simplistic estimate of a population proportion (relating to a particular characteristic) is the corresponding sample proportion (relating to the same characteristic), i.e. $\pi \approx p$
Interval Estimation of π
There is only one formula to be concerned with here. It can be shown that, with C% confidence, π will lie within the interval
$p \pm z \sqrt{\frac{p(1-p)}{n}}$
where $z$ identifies the position of the upper boundary of the middle C% area under the $p$ graph (the sampling distribution of sample proportions graph).
Once again, the formula is actually two formulae, with the + and − giving the upper confidence limit (UCL) and the lower confidence limit (LCL), respectively, of the C% confidence interval estimate of the population proportion.

Estimation of the Population Proportion (π)
Interval Estimation of π (cont.)
The formula could be stated alternatively as:
With C% confidence, $p - z \sqrt{\frac{p(1-p)}{n}} \le \pi \le p + z \sqrt{\frac{p(1-p)}{n}}$
which is of the general form $p - e \le \pi \le p + e$ discussed earlier (slide 6).
Of the three symbols referred to in the formula, all should be quite familiar to us by now. Once again the corresponding sample statistic ($p$, the sample proportion) is the natural starting point for the construction of the confidence interval, $n$ is the size of the sample used to obtain $p$, and $z$ we have defined as marking the upper boundary of the C% middle area under the $p$ graph.
This formula is also derived mathematically from the $p$ graph (beyond the scope of this unit) and you might notice the similarity between the factor $\sqrt{\frac{p(1-p)}{n}}$ and the standard deviation of the sampling distribution of sample proportions ($p$) graph, $\sqrt{\frac{\pi(1-\pi)}{n}}$.

Estimation of the Population Proportion (π)
Interval Estimation of π (cont.)
In fact, $\sqrt{\frac{p(1-p)}{n}}$ is the best approximation we have of $\sqrt{\frac{\pi(1-\pi)}{n}}$, with π (the population proportion) not being available; this is what we are actually trying to determine, isn't it?
So, schematically, in relation to the meaning of $z$ in the formula, we have:
[Figure: the sampling distribution of $p$ (samples of size $n$), centred on $\mu_p$, with the middle C% area marked and its upper boundary labelled Z = ?]
Note that the necessary approximation referred to above (for the standard deviation of the $p$ graph) doesn't cause any complications associated with the use of Z tables in this formula, i.e. there is no need to use t tables (in fact you mustn't) when identifying the position of boundaries under the $p$ graph.

Estimation of the Population Proportion (π)
Interval Estimation of π (cont.)
Example 3: If a sample of 50 students attending the campus reveals a sample proportion of students that travel by train of 60%, determine a point estimate and a 95% confidence interval estimate of the proportion of all students at the campus that travel by train.
From the problem statement: we require π, and are provided with $n = 50$, $p = 0.60$ and C% = 95%
Point estimate: $\pi \approx 0.60$
Once again, recall the simplistic nature of this form of parameter estimation. How close is this value to the actual value of π? We have no way of knowing! How confident can we be then in its accuracy?

Estimation of the Population Proportion (π)
Interval Estimation of π Example 3 (cont.):
We require π, and are provided with $n = 50$, $p = 0.60$ and C% = 95%
Interval estimate: since we require π, we use
$p \pm z \sqrt{\frac{p(1-p)}{n}} = 0.60 \pm 1.96 \sqrt{\frac{0.60 \times 0.40}{50}} = 0.60 \pm 0.14$
rounded to two decimal places, consistent with $p$ (it is conventional to round proportions expressed in decimal form to two decimal places).
Here Z = 1.96, found either by using standard normal tables in reverse or, more easily, from the last row of the t tables (this is just why that row is provided): C% = 95% leaves 2.5% = 0.025 in each tail.
[Figure: the sampling distribution of $p$ ($n = 50$), centred on $\mu_p$, with the middle C% = 95% area, 2.5% in each tail and the upper boundary at Z = 1.96]
So, with 95% confidence the population proportion of students at the campus who commute by train lies between 0.46 and 0.74, or symbolically:
With 95% confidence, $0.46 \le \pi \le 0.74$
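The calculation in Example 3 can be reproduced with a few lines of Python. The function name `proportion_interval` is illustrative, and the critical value is computed from the standard normal distribution (about 1.95996) rather than taken as the tabulated 1.96.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(p, n, confidence):
    """Return (lower, upper, e) for a population proportion."""
    # z is the upper boundary of the middle `confidence` area under the p graph.
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    e = z * sqrt(p * (1 - p) / n)  # error bound
    return p - e, p + e, e


# Values from Example 3: n = 50, p = 0.60, C = 95%
lower, upper, e = proportion_interval(p=0.60, n=50, confidence=0.95)
print(f"e = {e:.2f}, interval = ({lower:.2f}, {upper:.2f})")
# prints: e = 0.14, interval = (0.46, 0.74)
```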

Estimation of the Population Proportion (π)
Interval Estimation of π Example 3 (cont.):
With 95% confidence, $0.46 \le \pi \le 0.74$
What do you think about the precision (0.74 − 0.46 = 0.28) of this estimate? Would you like to improve it? What causes it to be good/bad? What do we have control of here?
What about the level of confidence? Would you prefer to have a higher confidence level (maybe 97.5%, 99%, could it be 100%)?
What changes in the formula if the confidence level is high? What impact does this have on precision?
How do you counteract these clearly related effects?
See over for the responses to these questions!

Estimation of the Population Proportion (π)
Interval Estimation of π Example 3 (cont.):
$p \pm z \sqrt{\frac{p(1-p)}{n}} = p \pm e$
What do you think about the precision (0.74 − 0.46 = 0.28) of this estimate? Not very good!
Would you like to improve it? Yes!
What causes it to be good/bad? $e = 0.14$ is too large; if $e$ was smaller, precision would be improved!
What do we have control of here? The value $e$ depends on $z$, $p$ and $n$. $p$ is fixed (it comes from whatever sample we have), $n$ is largely up to us (see later) and $z$ is determined by the level of confidence: the higher the level of confidence, the bigger the value of $z$ and the worse the precision, i.e. higher confidence leads to lack of precision (if $n$ stays fixed).
What about the level of confidence? Would you prefer to have a higher confidence level (maybe 97.5%, 99%, could it be 100%)? Yes, but we could never have 100%; that is possible only by directly surveying the population and actually determining π. Although we would like high confidence, as discussed above, as confidence increases, precision decreases (if $n$ stays fixed).

Estimation of the Population Proportion (π)
Interval Estimation of π Example 3 (cont.):
$p \pm z \sqrt{\frac{p(1-p)}{n}} = p \pm e$
What changes in the formula if the confidence level is high? If the confidence level is high, $z$ is large.
What impact does this have on precision? Precision is low (because the interval increases in size, as determined by $e$).
How do you counteract these clearly related effects? The only way to counteract the fact that as the confidence level increases, precision decreases, is to increase the size of the sample ($n$). The fact that $n$ appears in the denominator of the fraction means that as $n$ gets bigger, precision (determined by $e$) will improve ($e$ will get smaller).
Practice Problems: Week 7, Q7.5 and 7.6

The relationship between precision (e), confidence (C) and sample size (n)
General Form of the Interval Estimation Formula
The general form of the formula for constructing an interval estimate of a population mean and proportion can be represented as:
(sample statistic) ± (critical z or t value) × (standard error of the sample statistic)
where the sample statistic is $\bar{x}$ or $p$, the standard error of the sample statistic is $\frac{\sigma}{\sqrt{n}}$, $\frac{s}{\sqrt{n}}$ or $\sqrt{\frac{p(1-p)}{n}}$, and the product of the critical value and the standard error is $e$, the error bound.
The precision of the estimate is determined by $e$ (half the interval width), which depends on both C (through z or t) and $n$.
As C ↑, e ↑, i.e. as confidence increases, precision decreases ($n$ remaining fixed).
As n ↑, e ↓, i.e. as sample size increases, precision increases (C remaining fixed).
Typically, levels of confidence and precision are decided on and then the size of the sample required to achieve these levels is calculated (beyond the scope of the unit).
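The two relationships can be tabulated numerically. The sketch below reuses the sample proportion from Example 3 (p = 0.60). The confidence levels and sample sizes in the loops are arbitrary illustrative choices, and `error_bound` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def error_bound(p, n, confidence):
    """Error bound e = z * sqrt(p(1 - p) / n) for a proportion."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


sample_sizes = (50, 200, 800)
print("C      " + "".join(f"n={n:<8}" for n in sample_sizes))
for confidence in (0.90, 0.95, 0.99):
    cells = "".join(f"{error_bound(0.60, n, confidence):<10.3f}" for n in sample_sizes)
    print(f"{confidence:<7.0%}{cells}")
```

Reading down a column, e grows as confidence rises with n fixed. Reading across a row, e shrinks as n grows with C fixed. Because n sits under a square root, quadrupling the sample size halves the error bound.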

Point and Interval Estimates from MS Excel
The Descriptive Statistics table obtained in Week 2 via the Data/Data Analysis/Descriptive Statistics/Summary Statistics/Confidence Level for Mean menu/dialogue box options can be used to obtain point and interval estimates of population means (and, with some adjustments, population proportions).
Point and Interval Estimates of Population Means from MS Excel
To replicate the working of Example 2, which dealt with the commuting time data of Week 1, we enter the 36 data values into, say, cells A2 to A37 with, perhaps, the descriptive title Commuting Time (mins) in cell A1. We then select the menu options Data/Data Analysis/Descriptive Statistics to obtain the Descriptive Statistics dialogue box (see next slide).

Point and Confidence Interval Estimates from MS Excel
Point and Interval Estimates of Population Means from MS Excel (cont.)
Providing the required dialogue box information (Input Range, Grouped By Columns, Labels in First Row, New Worksheet Ply, Summary Statistics and Confidence Level for Mean (90%)) will produce the Descriptive Statistics table over. The Confidence Level for Mean option is set to 90% to obtain the 90% interval estimate of the population mean commuting time obtained in Example 2.

Point and Confidence Interval Estimates from MS Excel
Point and Interval Estimates of Population Means from MS Excel (cont.)
Commuting Time (mins)
Mean                       31.1      ($\bar{x}$)
Standard Error             3.1
Median                     26.5      ($M_d$)
Mode                       15        ($M_o$)
Standard Deviation         18.5      ($s$)
Sample Variance            343.25    ($s^2$)
Kurtosis                   -0.6158
Skewness                   0.4871
Range                      65
Minimum                    5         ($x_S$)
Maximum                    70        ($x_L$)
Sum                        1118      ($\sum x$)
Count                      36        ($n$)
Confidence Level(90.0%)    5.2
The Mean entry is the point estimate of the population mean commuting time: $\mu \approx 31.1$ (mins).
The Confidence Level(90.0%) entry is the error bound ($e$) for the 90% interval estimate of the population mean commuting time: $e = 5.2$ (mins).
So, with 90% confidence, $31.1 - 5.2 \le \mu \le 31.1 + 5.2$, i.e. $25.9 \le \mu \le 36.3$ (mins). This is the same result as determined manually in Example 2.
Note: the last unexplained entry in this table, the Standard Error in the second line, is the best approximation we have to the standard deviation of the sampling distribution of sample means for samples of size 36, i.e. $\sigma/\sqrt{n} \approx s/\sqrt{n} = 18.5/\sqrt{36} = 3.1$ (to 1 d.pl.).

Point and Confidence Interval Estimates from MS Excel (cont.)
Point and Interval Estimates of Population Proportions from MS Excel
MS Excel does not provide a specific menu option for obtaining point and interval estimates of population proportions; however, the previously obtained Descriptive Statistics table output can be manipulated to provide such information.
In order to do this the sample data has to be coded such that an observation consistent with the particular characteristic occurring is recorded as a 1 and an observation consistent with the characteristic not occurring is recorded as a 0. If this is done, the sample mean of the 0's and 1's is identical to the sample proportion of 1's (i.e. the sample proportion with the particular characteristic) and the standard error of the mean is extremely close to the standard error of the proportion. Further, for a sample of reasonable size, the critical t value used for constructing an interval estimate of a population mean is very close to the critical z value used for constructing an interval estimate of a population proportion.

Point and Confidence Interval Estimates from MS Excel
Point and Interval Estimates of Population Proportions from MS Excel (cont.)
For example, revisiting Example 3 on slide 23: If a sample of 50 students attending the campus reveals a sample proportion of students that travel by train of 60%, determine a point estimate and a 95% confidence interval estimate of the proportion of all students at the campus that travel by train.
If we record each observation consisting of a student travelling by train as a 1 (and those not as a 0), then the recoded data (with 60% of the sample of size 50, i.e. 30, being train travellers) would consist of 30 1's and 20 0's. For the recoded data the MS Excel Descriptive Statistics table is:

Point and Confidence Interval Estimates from MS Excel
Point and Interval Estimates of Population Proportions from MS Excel (cont.)
Commute by Train
Mean                       0.60
Standard Error             0.07
Median                     1
Mode                       1
Standard Deviation         0.49
Sample Variance            0.2449
Kurtosis                   -1.9005
Skewness                   -0.4210
Range                      1
Minimum                    0
Maximum                    1
Sum                        30
Count                      50
Confidence Level(95.0%)    0.14
So, $\pi \approx p = 0.60$ (point estimate).
And, with 95% confidence, $0.60 - 0.14 \le \pi \le 0.60 + 0.14$, i.e. $0.46 \le \pi \le 0.74$
Note: $\bar{x} = \frac{30}{50} = 0.60 = p$
Note: $\sigma_{\bar{x}} \approx \frac{s}{\sqrt{n}} = \frac{0.49}{\sqrt{50}} = 0.07 \approx \sqrt{\frac{p(1-p)}{n}} \approx \sigma_p$
And $t_{\text{critical}} = t_{0.025} = 2.0096$ (49 dof), with $z_{\text{critical}} = z_{0.025} = 1.96$
So the Confidence Level(95.0%) entry provides us with the error bound ($e$) for the 95% interval estimate of the population proportion of train travellers. This is the same result as determined manually in Example 3.
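The claim that the 0/1 coding reproduces the proportion interval can be checked outside Excel. The sketch below rebuilds the recoded sample (30 ones and 20 zeros) and compares the "mean" route with the proportion formula. The variable names are illustrative, and the critical values 2.0096 and 1.96 are taken from the t tables extract rather than computed.

```python
from math import sqrt
from statistics import fmean, stdev

# Recoded Example 3 sample: 1 = travels by train, 0 = does not.
data = [1] * 30 + [0] * 20
n = len(data)

p = fmean(data)                   # mean of the 0/1 data equals the sample proportion
se_mean = stdev(data) / sqrt(n)   # s / sqrt(n), the Standard Error entry in the table
se_prop = sqrt(p * (1 - p) / n)   # sqrt(p(1 - p) / n), used in the manual formula

t_crit, z_crit = 2.0096, 1.96     # t_0.025 with 49 dof, and z_0.025 (from the tables)
e_from_mean = t_crit * se_mean    # what the Confidence Level(95.0%) entry reports
e_from_prop = z_crit * se_prop    # error bound from Example 3

print(f"p = {p:.2f}")
print(f"s/sqrt(n) = {se_mean:.4f}, sqrt(p(1-p)/n) = {se_prop:.4f}")
print(f"e via t and s = {e_from_mean:.4f}, e via z and p = {e_from_prop:.4f}")
```

The two error bounds differ slightly in the third decimal place (the sample standard deviation divides by n − 1, and t is a little larger than z), but both round to 0.14, which is why the table output agrees with Example 3.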