

Lecture 8

What is a statistical model?

A statistical model for some data is a set of distributions $\{f_\theta : \theta \in \Omega\}$, one of which corresponds to the true unknown distribution that produced the data. The variable θ is called the parameter of the model, and the set Ω is called the parameter space. From the definition of a statistical model, we see that there is a unique value $\theta \in \Omega$ such that $f_\theta$ is the true distribution that generated the data. We refer to this value as the true parameter value.

Example: Suppose we have observations of heights in cm of individuals in a population and we feel that it is reasonable to assume that the distribution of height of the population is normal with some unknown mean and variance. The statistical model in this case is $\{N(\mu, \sigma^2) : (\mu, \sigma^2) \in \Omega\}$ with $\Omega = \mathbb{R} \times (0, \infty)$.

Goals of Statistics:
- Estimate unknown parameters of the underlying probability distribution.
- Measure the errors of these estimates.
- Test whether the data give evidence that the parameters are (or are not) equal to a certain value, or that the probability distribution has a particular form.

Point Estimation

Most statistical procedures involve estimation of the unknown value of the parameter of the statistical model. A point estimator of the parameter θ is a function of the underlying random variables, and so it is a random variable with a distribution function. A point estimate of the parameter θ is a function of the data; it is a statistic. For a given sample an estimate is a number.

Notation: $\hat{\theta}$ denotes a point estimator of θ.

Desirable properties of a point estimator:
- Unbiased
- Consistent
- Minimum variance
- With known probability distribution

Definition: Let $\hat{\theta}$ be a point estimator for a parameter θ. Then $\hat{\theta}$ is an unbiased estimator if $E(\hat{\theta}) = \theta$.

Note: An unbiased estimator for θ may not always exist. Also, an estimator that is unbiased for θ is not necessarily unbiased for g(θ).

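Example (of unbiased estimator): The sample mean $\bar{Y}$ is an unbiased estimator of the population mean μ.

If $E(\hat{\theta}) \neq \theta$, $\hat{\theta}$ is called biased.

Definition: The bias of a point estimator $\hat{\theta}$ is given by $B(\hat{\theta}) = E(\hat{\theta}) - \theta$.

Definition: The mean square error of a point estimator $\hat{\theta}$ is $\mathrm{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$.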
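The definitions of bias and mean square error can be made concrete with an exact calculation. The sketch below is only an illustration: it assumes Y ~ Binomial(n, p) with hypothetical values n = 10 and p = 3/10, and compares the sample proportion Y/n with a hypothetical shrunken estimator (Y + 1)/(n + 2). It also checks the standard identity that mean square error equals variance plus squared bias.

```python
from fractions import Fraction
from math import comb


def exact_moments(estimator, n, p):
    """Exact mean, variance, bias and MSE of estimator(Y) when Y ~ Binomial(n, p)."""
    pmf = [comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(n + 1)]
    values = [estimator(y) for y in range(n + 1)]
    mean = sum(w * v for w, v in zip(pmf, values))
    variance = sum(w * (v - mean) ** 2 for w, v in zip(pmf, values))
    bias = mean - p  # B = E(estimator) - target parameter
    mse = sum(w * (v - p) ** 2 for w, v in zip(pmf, values))
    return {"mean": mean, "variance": variance, "bias": bias, "mse": mse}


# Hypothetical sample size and true proportion, chosen only for illustration.
n, p = 10, Fraction(3, 10)

sample_proportion = exact_moments(lambda y: Fraction(y, n), n, p)
shrunk_proportion = exact_moments(lambda y: Fraction(y + 1, n + 2), n, p)

for name, m in [("Y/n", sample_proportion), ("(Y+1)/(n+2)", shrunk_proportion)]:
    print(name, "bias =", m["bias"], "variance =", m["variance"], "MSE =", m["mse"])
```

In this particular setting the biased estimator has the smaller mean square error, which is why bias alone does not settle which estimator is better.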

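Note: $\mathrm{MSE}(\hat{\theta}) = V(\hat{\theta}) + [B(\hat{\theta})]^2$.

Proof:

Example: Suppose $E(\hat{\theta}_1) = E(\hat{\theta}_2) = \theta$, $V(\hat{\theta}_1) = \sigma_1^2$, $V(\hat{\theta}_2) = \sigma_2^2$. Consider $\hat{\theta}_3 = a\hat{\theta}_1 + (1 - a)\hat{\theta}_2$. (a) Show that $\hat{\theta}_3$ is an unbiased estimator for θ; (b) If $\hat{\theta}_1$ and $\hat{\theta}_2$ are independent, how should the constant a be chosen to minimize the variance of $\hat{\theta}_3$?

Solution: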

Examples of Unbiased Point Estimators

We denote by $\sigma^2_{\hat{\theta}}$ the variance of the sampling distribution of the estimator $\hat{\theta}$; its square root $\sigma_{\hat{\theta}}$ is called the standard error of the estimator.

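Claim: Let $Y_1, \ldots, Y_n$ be a random sample of size n from a population with mean µ and variance $\sigma^2$. Then the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$ is an unbiased estimator of the population variance $\sigma^2$, but $S'^2 = \frac{1}{n}\sum_{i=1}^{n}(Y_i - \bar{Y})^2$ is a biased estimator of $\sigma^2$.

Proof: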
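The effect of the divisor can be checked exactly on a toy case. The sketch below is an illustration rather than a proof: it assumes a hypothetical three-value population in which each value is equally likely, enumerates every sample of size n = 3 drawn with replacement, and computes the exact expectation of the variance estimate with divisor n - 1 and with divisor n.

```python
from fractions import Fraction
from itertools import product


def expected_value(statistic, population, n):
    """Exact E[statistic] over all equally likely samples of size n (with replacement)."""
    samples = list(product(population, repeat=n))
    return sum(Fraction(statistic(s)) for s in samples) / len(samples)


def variance_estimate(sample, divisor):
    """Sum of squared deviations from the sample mean, divided by `divisor`."""
    mean = Fraction(sum(sample), len(sample))
    return sum((y - mean) ** 2 for y in sample) / divisor


population = [1, 2, 6]  # hypothetical population values, each equally likely
n = 3

mu = Fraction(sum(population), len(population))
sigma2 = sum((y - mu) ** 2 for y in population) / len(population)

e_divisor_n_minus_1 = expected_value(lambda s: variance_estimate(s, n - 1), population, n)
e_divisor_n = expected_value(lambda s: variance_estimate(s, n), population, n)

print("population variance:", sigma2)
print("E[estimate], divisor n-1:", e_divisor_n_minus_1)
print("E[estimate], divisor n:  ", e_divisor_n)
```

The divisor n version underestimates the variance on average by the factor (n - 1)/n, which is exactly what dividing by n - 1 corrects.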

Goodness of Point Estimator

Definition: The error of estimation ε is the distance between an estimator and its target parameter, $\varepsilon = |\hat{\theta} - \theta|$.

Suppose $\hat{\theta}$ is an unbiased estimator of θ and has a sampling distribution. Select a number b and consider $P(|\hat{\theta} - \theta| < b)$.

Example: A sample of n = 1000 voters, randomly selected from a city, showed y = 560 in favor of candidate Jones. Estimate p, the fraction of voters in the population favouring Jones, and place a 2-standard-error bound on the error of estimation.

Solution:

Example: (#8.24) Results of a public opinion poll reported on the Internet indicated that 69% of respondents rated the cost of gasoline as a crisis or major problem. The article states that 1001 adults, age 18 or older, were interviewed and that the results have a sampling error of 3%. How was the 3% calculated, and how should it be interpreted? Can we conclude that a majority of the individuals in the 18+ age group felt that the cost of gasoline was a crisis or major problem?

Solution:
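The arithmetic for both examples can be sketched as follows. This assumes the usual estimated standard error of a sample proportion, sqrt(p̂(1 - p̂)/n), with the unknown p replaced by its estimate; the helper name `proportion_bound` is hypothetical.

```python
from math import sqrt


def proportion_bound(successes, n, multiplier=2.0):
    """Point estimate of p and a `multiplier`-standard-error bound on the error."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)  # estimated SE: p replaced by p_hat
    return p_hat, multiplier * standard_error


# Voter example: y = 560 out of n = 1000.
p_hat_jones, bound_jones = proportion_bound(560, 1000)

# Gasoline poll: only the percentage is reported, so work from p_hat directly.
p_hat_gas, n_gas = 0.69, 1001
bound_gas = 2 * sqrt(p_hat_gas * (1 - p_hat_gas) / n_gas)

print(f"Jones: p_hat = {p_hat_jones:.2f}, bound = {bound_jones:.4f}")
print(f"Gasoline: p_hat = {p_hat_gas:.2f}, bound = {bound_gas:.4f}")
print("Lower end of gasoline interval above 0.5:", p_hat_gas - bound_gas > 0.5)
```

The gasoline bound comes out close to 0.03, consistent with the reported 3% sampling error being a 2-standard-error bound.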

Confidence Intervals

A point estimate provides no information about the precision and reliability of estimation. For example, the sample mean $\bar{y}$ is a point estimate of the population mean μ, but because of sampling variability it is virtually never the case that $\bar{y} = \mu$. A point estimate says nothing about how close it might be to μ.

An alternative to reporting a single sensible value for the parameter being estimated is to calculate and report an entire interval of plausible values: a confidence interval (CI).

Properties of the interval:
- It contains the true parameter θ;
- It is relatively narrow.

The upper and lower endpoints of a CI are called the upper and lower confidence limits. The probability that a CI will enclose θ is called the confidence coefficient, denoted by 1 − α.

Definition: A 100(1 − α)% confidence interval for a parameter θ is a random interval $[\hat{\theta}_L, \hat{\theta}_U]$ such that $P[\hat{\theta}_L \le \theta \le \hat{\theta}_U] = 1 - \alpha$ regardless of the value of θ.

A confidence level is a measure of the degree of reliability of a confidence interval. It is denoted as 100(1 − α)%. The most frequently used confidence levels are 90%, 95% and 99%. The higher the confidence level, the more strongly we believe that the true value of the parameter being estimated lies within the interval.

Deriving a Confidence Interval

Suppose $Y_1, \ldots, Y_n$ are a random sample and we observed the data $y_1, \ldots, y_n$, which are the realization of these random variables. We want a CI for some parameter θ.

Pivotal method: To derive this CI we need to find another random variable, typically a function of the estimator of θ, satisfying:
1) It depends on $Y_1, \ldots, Y_n$ and θ.
2) Its probability distribution does not depend on θ or any other unknown parameter.

Such a random variable is called a pivot.

Example: Suppose we are to obtain a single observation Y~Exp(θ). Use Y to form a CI for θ with confidence coefficient 0.90, or 90% confidence level. Solution:
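One way to carry out the pivotal method here is sketched below. It assumes that Exp(θ) is parameterized by its mean θ, so that U = Y/θ has the standard exponential distribution and is a pivot, and it assumes an equal-tailed interval (5% in each tail). The function name `exponential_ci` and the observed value y = 1.0 are hypothetical.

```python
from math import log


def exponential_ci(y, confidence=0.90):
    """Equal-tailed CI for the mean theta of an exponential distribution from one observation."""
    alpha = 1 - confidence
    # U = Y / theta is standard exponential (assumed mean parameterization),
    # so P(U <= u) = 1 - exp(-u) and the quantiles have closed forms.
    a = -log(1 - alpha / 2)  # P(U < a) = alpha / 2
    b = -log(alpha / 2)      # P(U > b) = alpha / 2
    # a <= Y / theta <= b  is equivalent to  Y / b <= theta <= Y / a.
    return y / b, y / a


lower, upper = exponential_ci(y=1.0)  # y = 1.0 is a hypothetical observation
print(f"90% CI for theta: ({lower:.4f}, {upper:.4f})")
```

The interval is very wide because it is based on a single observation; its endpoints scale linearly with the observed y.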

Example: { Show that is a pivotal quantity. Use it to find a 90% lower confidence limit for θ. Solution:

Large-Sample Confidence Intervals

Example: Let $\hat{\theta}$ be a statistic with $\hat{\theta} \sim N(\theta, \sigma^2_{\hat{\theta}})$. Find a confidence interval for θ with confidence coefficient 1 − α.

Solution:
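A sketch of the standard pivotal argument, assuming the statistic $\hat{\theta}$ is normally distributed with mean θ and standard error $\sigma_{\hat{\theta}}$, and writing $z_{\alpha/2}$ (notation assumed here) for the value that cuts off an upper-tail area of α/2 under the standard normal curve:

```latex
\begin{align*}
Z &= \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} \sim N(0, 1), \\
1 - \alpha &= P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) \\
&= P\left(\hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}} \le \theta \le \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}}\right).
\end{align*}
```

The interval $\hat{\theta} \pm z_{\alpha/2}\,\sigma_{\hat{\theta}}$ therefore has confidence coefficient 1 − α. When $\sigma_{\hat{\theta}}$ has to be estimated from the data, the interval is approximate and relies on a large sample.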

Example: (#8.56) In a survey of n = 800 randomly chosen adults, 45% indicated that movies were getting better whereas 43% indicated that movies were getting worse. (a) Find a 98% CI for p, the overall proportion of adults who say that movies are getting better. (b) Does the interval include the value p = 0.50? Do you think that a majority of adults say that movies are getting better? Solution:
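The calculation can be sketched with the standard library, assuming the large-sample interval p̂ ± z·sqrt(p̂(1 - p̂)/n) with the standard error estimated from the sample. The function name `proportion_ci` is hypothetical; `NormalDist().inv_cdf` supplies the normal quantile.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence):
    """Large-sample CI for a proportion, using the estimated standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # upper alpha/2 normal quantile
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(p_hat=0.45, n=800, confidence=0.98)
contains_half = lower <= 0.50 <= upper

print(f"98% CI for p: ({lower:.4f}, {upper:.4f})")
print("Interval contains 0.50:", contains_half)
```

Since the whole interval lies below 0.50, these data do not support the claim that a majority of adults say movies are getting better.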

Width and Precision of CI: The precision of an interval is conveyed by the width of the interval. If the confidence level is high and the resulting interval is quite narrow, the interval is more precise (i.e., our knowledge of the value of the parameter is reasonably precise). A very wide CI implies that there is a great deal of uncertainty concerning the value of the parameter we are estimating.

Note: Confidence intervals do not need to be central; any a and b that solve $P\left(a \le \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \le b\right) = 1 - \alpha$ define a 100(1 − α)% CI for the population mean μ.

Example: The National Student Loan Survey collected data about the amount of money that borrowers owe. The survey selected a random sample of 1280 borrowers who began repayment of their loans between four and six months prior to the study. The mean debt for the selected borrowers was $18,900 and the standard deviation was $49,000. Find a 95% CI for the mean debt for all borrowers.

Solution:
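A minimal sketch of the computation, assuming the large-sample interval ȳ ± z·s/√n with the sample standard deviation standing in for σ (reasonable here because n = 1280 is large). The function name `mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci(y_bar, s, n, confidence):
    """Large-sample CI for a population mean, with s used in place of sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * s / sqrt(n)
    return y_bar - margin, y_bar + margin


lower, upper = mean_ci(y_bar=18_900, s=49_000, n=1280, confidence=0.95)
print(f"95% CI for mean debt: (${lower:,.0f}, ${upper:,.0f})")
```

Even though the standard deviation is much larger than the mean, the large sample keeps the margin of error to roughly $2,700.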

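Interval Estimation of Variability

In many cases we will be interested in making inference about the population variance $\sigma^2$.

Theorem: Let $Y_1, \ldots, Y_n$ be a random sample from a normal distribution with mean μ and variance $\sigma^2$. Then $\frac{(n-1)S^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(Y_i - \bar{Y})^2 \sim \chi^2_{n-1}$.

Proof: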

Now let's derive a CI for $\sigma^2$:
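A sketch of the derivation, assuming a normal sample so that $(n-1)S^2/\sigma^2$ is chi-square with n − 1 degrees of freedom, and writing $\chi^2_{p}$ (notation assumed here) for the value that cuts off an upper-tail area of p under that distribution. The equal-tailed choice of the two quantiles is an assumption; other splits of α are possible.

```latex
\begin{align*}
1 - \alpha &= P\left(\chi^2_{1-\alpha/2} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\alpha/2}\right) \\
&= P\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}}\right).
\end{align*}
```

So a 100(1 − α)% CI for $\sigma^2$ is $\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2}},\ \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}}\right)$. Note that the larger quantile gives the lower limit.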

Example: An experimenter wanted to check the variability of measurements obtained by using equipment designed to measure the volume of an audio source. Three independent measurements recorded by this equipment for the same sound were 4.1, 5.2, and 10.2. Estimate $\sigma^2$ with confidence coefficient 0.90.

Solution:
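The numbers can be worked through as follows, assuming the measurements are normally distributed and using the equal-tailed chi-square interval for the variance. With n = 3 there are 2 degrees of freedom, and a chi-square distribution with 2 degrees of freedom is an exponential distribution with mean 2, so its quantiles have a closed form and no table or extra library is needed. That shortcut applies only to this degrees-of-freedom value.

```python
from math import log
from statistics import variance

measurements = [4.1, 5.2, 10.2]
n = len(measurements)
s2 = variance(measurements)  # sample variance with divisor n - 1

confidence = 0.90
alpha = 1 - confidence

# Chi-square with 2 df has P(X > x) = exp(-x / 2), so x = -2 * ln(upper-tail area).
chi2_upper = -2 * log(alpha / 2)      # upper-tail area alpha/2
chi2_lower = -2 * log(1 - alpha / 2)  # upper-tail area 1 - alpha/2

lower = (n - 1) * s2 / chi2_upper
upper = (n - 1) * s2 / chi2_lower

print(f"s^2 = {s2:.2f}")
print(f"90% CI for sigma^2: ({lower:.2f}, {upper:.2f})")
```

The interval is extremely wide, which reflects how little three observations say about a variance.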

The t distribution

Definition: Let Z be a standard normal random variable and let X be an independent chi-squared random variable with n degrees of freedom. The random variable $T = \frac{Z}{\sqrt{X/n}}$ is said to follow a t distribution with n degrees of freedom.

Theorem: Let $Y_1, \ldots, Y_n$ be a random sample from a normal distribution with mean μ and variance $\sigma^2$. Then $\frac{\bar{Y} - \mu}{S/\sqrt{n}} \sim t_{n-1}$.

Proof:

CI for μ when σ is unknown

Suppose $Y_1, \ldots, Y_n$ are a random sample from a normal distribution with mean μ and variance $\sigma^2$, where both μ and σ are unknown. If σ is unknown we can estimate it by S and use the t distribution with n − 1 degrees of freedom. A 100(1 − α)% confidence interval for μ in this case is $\bar{y} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$.

Example: A manufacturer of gunpowder has developed a new powder, which was tested in 8 shells. The resulting muzzle velocities (ft/sec) were: 3005 3925 2935 2965 2995 3005 2939 2905. Find a 95% CI for the true average velocity for shells of this type. Assume that the velocities are approximately normally distributed.

Solution:
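A sketch of the t-interval calculation, using the eight velocities exactly as they are listed in the example. The critical value 2.365 is the standard t-table entry for an upper-tail area of 0.025 with 7 degrees of freedom; it is hard-coded here because the Python standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

# Muzzle velocities in ft/sec, copied as listed in the example.
velocities = [3005, 3925, 2935, 2965, 2995, 3005, 2939, 2905]

n = len(velocities)
y_bar = mean(velocities)
s = stdev(velocities)  # sample standard deviation, divisor n - 1

T_CRIT = 2.365  # t-table value: upper-tail area 0.025, n - 1 = 7 degrees of freedom
margin = T_CRIT * s / sqrt(n)
lower, upper = y_bar - margin, y_bar + margin

print(f"mean = {y_bar:.2f}, s = {s:.2f}")
print(f"95% CI for the mean velocity: ({lower:.1f}, {upper:.1f})")
```

With only 8 observations the t critical value is noticeably larger than the normal value 1.96, so the interval is wider than a large-sample interval would be.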