Topic 10: Introduction to Estimation


June [day and year illegible]
© Joseph C. Watkins

1 Introduction

In the simplest possible terms, the goal of estimation theory is to answer the question: What is that number? What is the length, the reaction rate, the fraction displaying a particular behavior, the temperature, the mean salary, the mean lifetime, the slope and intercept of a line?

The next step is to perform an experiment or a data collection scheme that is well designed to estimate one (or more) numbers. However, before we can embark on such a design, we must learn some principles of estimation to have some understanding of the properties of a good estimator and to present our uncertainty about the estimation procedure. We begin with a definition:

Definition 1. A statistic is a function of the data that does not depend on any unknown quantity.

We have, to this point, seen a variety of statistics.

Example 2.
- sample mean, $\bar{x}$
- sample variance, $s^2$
- sample standard deviation, $s$
- sample median, sample quartiles $Q_1$, $Q_3$, and percentiles
- standardized scores $(x_i - \bar{x})/s$

2 Point Estimates

Here we will limit ourselves to three estimation questions:

1. For a simple random sample $X_1, X_2, \ldots, X_n$ having unknown mean $\mu$, we estimate $\mu$ by $\bar{X}$, the mean of the sample.
2. For Bernoulli trials $X_1, X_2, \ldots, X_n$ having unknown success probability $p$, we estimate $p$ by $\hat{p}$, the sample proportion.
3. For two independent simple random samples $X_1, X_2, \ldots, X_{n_X}$ and $Y_1, Y_2, \ldots, Y_{n_Y}$ having unknown means $\mu_X$ and $\mu_Y$, respectively, we estimate $\mu_X - \mu_Y$ by $\bar{X} - \bar{Y}$.
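The statistics of Example 2 and the sample proportion can be computed directly from data. The following is a minimal sketch using Python's standard `statistics` module; the data values and the Bernoulli outcomes are hypothetical, chosen only for illustration, and the quartiles use that module's default convention, which may differ slightly from other software.

```python
import statistics

# Hypothetical measurements, chosen only for illustration (not from the text).
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]

x_bar = statistics.mean(data)        # sample mean
s2 = statistics.variance(data)       # sample variance (divides by n - 1)
s = statistics.stdev(data)           # sample standard deviation
median = statistics.median(data)     # sample median
q1, q2, q3 = statistics.quantiles(data, n=4)   # quartiles (Python 3.8+)

# Standardized scores (x_i - x_bar) / s; they always sum to zero.
z_scores = [(x - x_bar) / s for x in data]

# Hypothetical Bernoulli trials (1 = success); p_hat is the sample proportion.
trials = [1, 0, 0, 1, 1, 0, 1, 1]
p_hat = sum(trials) / len(trials)

print(f"mean = {x_bar}, variance = {s2:.2f}, sd = {s:.3f}, median = {median}")
print(f"quartiles = ({q1}, {q2}, {q3}), p_hat = {p_hat}")
```

Each quantity is a function of the data alone, which is what makes it a statistic in the sense of Definition 1.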

Introduction to Statistical Methodology    Introduction to Estimation

Notice that we have the expected values

$$E\bar{X} = \mu, \qquad E\hat{p} = p, \qquad \text{and} \qquad E[\bar{X} - \bar{Y}] = \mu_X - \mu_Y.$$

In words, this says that the estimator does not systematically underestimate or overestimate the unknown mean, probability, or difference in means. When such an identity holds, we say that the estimator is unbiased. For example, for the variance $\sigma^2$, we have seen two choices:

$$\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \qquad \text{and} \qquad \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$

The first of these systematically underestimates $\sigma^2$; the second is unbiased.

One criterion for a good estimator is little or no bias. The second is a small variance. For the first two examples above, we have the variances

$$\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} \qquad \text{and} \qquad \mathrm{Var}(\hat{p}) = \frac{p(1-p)}{n}.$$

Thus, the variance of the estimator is proportional to the variance of a single observation and inversely proportional to the number of observations.

For the third example, we have the Pythagorean theorem for variances:

$$\mathrm{Var}(\bar{X} - \bar{Y}) = \mathrm{Var}(\bar{X}) + (-1)^2\,\mathrm{Var}(\bar{Y}) = \mathrm{Var}(\bar{X}) + \mathrm{Var}(\bar{Y}) = \frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}.$$

In most circumstances, the variances are unknown, and thus we replace them by their estimates from the data:

$$\frac{s^2}{n}, \qquad \frac{\hat{p}(1-\hat{p})}{n}, \qquad \text{and} \qquad \frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}.$$

The square root of each of these quantities is known as the standard error:

$$\frac{s}{\sqrt{n}}, \qquad \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \qquad \text{and} \qquad \sqrt{\frac{s_X^2}{n_X} + \frac{s_Y^2}{n_Y}}.$$

Example 3. We return to the study of the smoking habits of 5375 high school children in Tucson in 1967. Here is a two-way table summarizing some of the results.

                    student    student          total    sample        standard
                    smokes     does not smoke            proportion    error
2 parents smoke       400        1380           1780       0.225         0.010
1 parent smokes       416        1823           2239       0.186         0.008
0 parents smoke       188        1168           1356       0.139         0.009
total                1004        4371           5375
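Two claims in this passage can be checked numerically. The sketch below first enumerates every equally likely sample of size 2 drawn with replacement from a hypothetical three-point population {1, 2, 3} (an assumption made purely for illustration) to compare the two variance estimators exactly, and then recomputes the sample proportions and standard errors from the counts in the Tucson table. The function names are illustrative, not part of the original notes.

```python
import math
from fractions import Fraction
from itertools import product

# Part 1: exact bias check on a hypothetical population (illustration only).
population = [1, 2, 3]
n = 2  # sample size
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def expected_variance_estimate(divisor):
    """Average of sum((x_i - x_bar)^2) / divisor over all equally likely samples."""
    samples = list(product(population, repeat=n))
    total = Fraction(0)
    for sample in samples:
        x_bar = Fraction(sum(sample), n)
        total += sum((x - x_bar) ** 2 for x in sample) / divisor
    return total / len(samples)

biased = expected_variance_estimate(n)        # divides by n
unbiased = expected_variance_estimate(n - 1)  # divides by n - 1
print(f"sigma^2 = {sigma2}, E[1/n form] = {biased}, E[1/(n-1) form] = {unbiased}")

# Part 2: sample proportions and standard errors from the Tucson counts.
groups = {
    "2 parents smoke": (400, 1780),   # (students who smoke, group total)
    "1 parent smokes": (416, 2239),
    "0 parents smoke": (188, 1356),
}

def proportion_and_se(successes, total):
    """Return (p_hat, sqrt(p_hat * (1 - p_hat) / total))."""
    p_hat = successes / total
    return p_hat, math.sqrt(p_hat * (1 - p_hat) / total)

results = {name: proportion_and_se(x, m) for name, (x, m) in groups.items()}
for name, (p_hat, se) in results.items():
    print(f"{name}: p_hat = {p_hat:.3f}, SE = {se:.3f}")
```

In the first part, the estimator that divides by $n$ averages to $(n-1)/n$ times $\sigma^2$, while the one that divides by $n-1$ averages to $\sigma^2$ exactly.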

Example 4. For 50 rolls of a die, we have the following summaries.

type        mean     standard deviation
fair        3.340    1.636
weighted    2.740    1.562

Thus the estimate for the mean value on a fair die is 3.340. The estimate for the mean value on the weighted die is 2.740. The standard errors of the estimates are $1.636/\sqrt{50} = 0.231$ for the fair die and $1.562/\sqrt{50} = 0.221$ for the weighted die. In this case (unusually), we know the distributional mean: it is 3.5 for a fair six-sided die and 2.667 for the weighted die. Note that these values are within one standard error of the sample means. The estimate of the difference in the means is $3.340 - 2.740 = 0.600$ with a standard error $\sqrt{1.636^2/50 + 1.562^2/50} = 0.320$. Note here that the mean difference is nearly twice the standard error.

3 Confidence Intervals

In some sense, giving the standard error should be sufficient to describe the estimate and determine the quality of the estimator. However, the typical way to describe an estimation procedure is with a confidence interval. This is a procedure to determine an interval from the data that has a high probability of capturing the true value. If this probability is C%, then this is called a C% confidence interval. Typical values for C% are 95%, 98% and 99%.

Looking at the Tucson data for smoking, we have a sample proportion $\hat{p} = 0.225$ of children who smoke in households in which both parents smoke. We might ask what is the proportion $p$ in the entire population of children who smoke in households in which both parents smoke. We know that there is a 95% probability that $\hat{p}$ is within $z^* = 1.96$ standard units of the population proportion $p$. Reversing this, we have that the population proportion $p$ has a 95% probability of being within 1.96 standard units of $\hat{p}$. In symbols, with a 95% probability, $p$ is somewhere in the interval between

estimate − margin of error   and   estimate + margin of error
estimate − value × standard error   and   estimate + value × standard error
$\hat{p} - z^*\sqrt{\hat{p}(1-\hat{p})/n}$   and   $\hat{p} + z^*\sqrt{\hat{p}(1-\hat{p})/n}$
$0.225 - 1.960 \times 0.010$   and   $0.225 + 1.960 \times 0.010$
0.205   and   0.245

When the confidence interval is for a mean, we need to take into account that we have replaced the distributional variance by the sample variance. Thus, the z-statistic

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

is replaced by the t-statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}.$$

The remarkable discovery by William Gosset is that the distribution of the t statistic can be determined exactly. However, this distribution depends on the number of observations. Thus, we use a table of values for the t-statistic. The so-called degrees of freedom is one fewer than the number of observations. For a 95% confidence interval, the value for 49 degrees of freedom is $t^* = 2.010$. This is slightly larger than the corresponding value 1.960 for the normal distribution. Thus, the 95% confidence interval for the fair die lies between

estimate − margin of error   and   estimate + margin of error
estimate − value × standard error   and   estimate + value × standard error
$\bar{x} - t^* s/\sqrt{n}$   and   $\bar{x} + t^* s/\sqrt{n}$
$3.340 - 2.010 \times 0.231$   and   $3.340 + 2.010 \times 0.231$
2.876   and   3.804

[Figure 1: The density and distribution function for a standard normal random variable (black) and a t random variable with 4 degrees of freedom (red). The variance of the t distribution is $df/(df - 2) = 4/(4 - 2) = 2$, which is higher than the variance of a standard normal. This can be seen in the broader shoulders of the density function or in the more rapid rise in the distribution function away from the mean of 0.]

Example 5. A random sample of lengths of movies, in minutes, during a June weekend was collected.

[The 21 data values are not legible in the source.]

For these data, the mean is $\bar{x} = 108.81$ minutes and the standard deviation is $s = 15.67$ minutes. Thus, the standard error of the mean is $15.67/\sqrt{21} = 3.42$ minutes. For a 95% confidence interval with 20 degrees of freedom, the value is $t^* = 2.086$. Thus the confidence interval for the mean length, in minutes, of movies is

$$108.81 \pm 2.086 \times 3.42 = (101.68,\ 115.94).$$

4 Summary of Standard Confidence Intervals

The confidence interval is an extension of the idea of a point estimate of the parameter to an interval that is likely to contain the true parameter value. A level C confidence interval for a population parameter is an interval computed from the sample data having probability C of producing an interval containing the true parameter value.

For an estimate of a population mean or proportion, a level C confidence interval often has the form

estimate ± $t^*$ × standard error

where $t^*$ is the upper $(1 - C)/2$ critical value for the t distribution with the appropriate number of degrees of freedom. If the number of degrees of freedom is infinite, we use the standard normal distribution to determine the value, usually denoted by $z^*$.

The margin of error $m = t^* \times$ standard error decreases if
- C decreases
- the standard deviation decreases
- $n$ increases

The procedures for finding the confidence interval are summarized in the table below.

procedure        parameter        estimate                 standard error                                                    degrees of freedom
one sample       $\mu$            $\bar{x}$                $s/\sqrt{n}$                                                      $n - 1$
two sample       $\mu_1 - \mu_2$  $\bar{x}_1 - \bar{x}_2$  $\sqrt{s_1^2/n_1 + s_2^2/n_2}$                                    $\min\{n_1, n_2\} - 1$
one proportion   $p$              $\hat{p}$                $\sqrt{\hat{p}(1-\hat{p})/n}$                                     $\infty$ (use $z^*$)
two proportion   $p_1 - p_2$      $\hat{p}_1 - \hat{p}_2$  $\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$  $\infty$ (use $z^*$)
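The common form estimate ± critical value × standard error can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive implementation: the helper name `confidence_interval` is hypothetical, the $t^*$ values (2.010 for 49 degrees of freedom, 2.086 for 20) are copied from tables rather than computed, and the standard errors are rounded the way a hand calculation would round them. The two-proportion case applies the last row of the summary table to two rows of the Tucson data; that comparison is an added illustration.

```python
import math
from statistics import NormalDist

def confidence_interval(estimate, critical_value, standard_error):
    """Return (lower, upper) for estimate ± critical_value × standard_error."""
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin

# z* for 95% confidence: the 0.975 quantile of the standard normal.
z_star = round(NormalDist().inv_cdf(0.975), 3)  # 1.96

# One proportion: Tucson students with two smoking parents (rounded values).
ci_proportion = confidence_interval(0.225, z_star, 0.010)

# One-sample t interval for the fair die (n = 50, so 49 degrees of freedom).
se_die = round(1.636 / math.sqrt(50), 3)        # 0.231
ci_die = confidence_interval(3.340, 2.010, se_die)

# One-sample t interval for the movie lengths (n = 21, 20 degrees of freedom).
se_movies = round(15.67 / math.sqrt(21), 2)     # 3.42
ci_movies = confidence_interval(108.81, 2.086, se_movies)

# Two proportions: two smoking parents versus no smoking parents (Tucson counts).
p1, n1 = 400 / 1780, 1780
p2, n2 = 188 / 1356, 1356
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci_diff = confidence_interval(p1 - p2, z_star, se_diff)

for label, (low, high) in [
    ("proportion", ci_proportion),
    ("fair die mean", ci_die),
    ("movie length mean", ci_movies),
    ("difference in proportions", ci_diff),
]:
    print(f"{label}: ({low:.3f}, {high:.3f})")
```

Because the interval for the difference in proportions lies entirely above zero, the data suggest a higher smoking rate among students whose parents both smoke.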