Confidence Intervals and Hypothesis Tests


STA 281 Fall 2011

1 Background

The central limit theorem provides a very powerful tool for determining the distribution of sample means for large sample sizes. In particular, if X_1, ..., X_n are independent and identically distributed (iid) with mean E[X_i] = µ and variance V[X_i] = σ² AND n is large, then

X̄ ≈ N(µ, σ²/n)

For the remainder of this handout, all sample sizes should be assumed to be greater than 30 so this result holds. A special case of this result is that if X_1, ..., X_n ~ Bern(p), then

p̂ ≈ N(p, p(1-p)/n)

Results for two samples may be obtained using the formulas for linear combinations of normal distributions. Thus, if X_1, ..., X_{n_X} are iid with mean µ_X and variance σ²_X while Y_1, ..., Y_{n_Y} are iid with mean µ_Y and variance σ²_Y (and of course the X and Y samples are independent), then

X̄ - Ȳ ≈ N(µ_X - µ_Y, σ²_X/n_X + σ²_Y/n_Y)

This also has a special case for proportions. If X_1, ..., X_{n_X} ~ Bern(p_X) and Y_1, ..., Y_{n_Y} ~ Bern(p_Y), then

p̂_X - p̂_Y ≈ N(p_X - p_Y, p_X(1-p_X)/n_X + p_Y(1-p_Y)/n_Y)

These formulas are the four fundamental results that motivate all of the confidence interval and hypothesis testing theory we will investigate in this course.

2 What are Confidence Intervals and Hypothesis Tests?

Inference is the use of data to draw conclusions about population parameters. Probability theory assumes we have X_1, ..., X_n ~ Bern(0.4) and then specifies the likelihood of generating 0 through n successes. Thus probability theory assumes we know the parameter p and specifies how our data should appear. Inference is concerned with the reverse problem. We already have X_1, ..., X_n ~ Bern(p), but we don't know p. Our goal is to use the data to determine p.

The first thing to note is that we will NEVER be able to determine p exactly with only a finite amount of data. Suppose n = 1000 and we observe that X_1, ..., X_n ~ Bern(p) produced 800 successes. What is p? Unfortunately, no value of p in (0,1) can be completely excluded based on this data. It is possible to see the observed data when p = 0.01 (not likely, but possible). For any value of p in (0,1), the observed value is possible. Thus, we are forced to make probabilistic statements about p. Intuitively, while p = 0.01 cannot be excluded, our observed data (800 successes in 1000 trials) is so unlikely when p = 0.01 that for all practical purposes we can exclude p = 0.01. These are the kinds of inferences we will pursue.
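To make "possible but so unlikely" concrete, here is a short sketch (not part of the original handout) that evaluates the binomial log-likelihood of 800 successes in 1000 trials under the two candidate values p = 0.01 and p = 0.8; the computation is done in log space because the raw probability under p = 0.01 underflows a float.

```python
import math

def log_binom_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), computed via lgamma
    to avoid underflow at extreme parameter values."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

# Observed: 800 successes in 1000 trials.
ll_small = log_binom_pmf(800, 1000, 0.01)  # log-likelihood under p = 0.01
ll_mle = log_binom_pmf(800, 1000, 0.8)     # log-likelihood under p = 0.8

print(ll_small)  # roughly -3190: possible, but astronomically unlikely
print(ll_mle)    # roughly -3.5: p = 0.8 explains the data far better
```

The gap of several thousand log units is exactly why, "for all practical purposes," p = 0.01 can be excluded even though it is not strictly impossible.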

We will focus on making two kinds of inferences, confidence intervals and hypothesis tests, in several scenarios. Not coincidentally, these scenarios correspond to the situations where we applied the central limit theorem in Section 1. Specifically, we will make inferences on means for one or two samples, and on proportions for one or two samples. The two kinds of inferences correspond to two common questions asked in scientific experiments. The first, confidence intervals, answers the question "I have no idea what µ (or p) is; how do I use the data to estimate it?" The second, hypothesis tests, answers the question "I have a specific value of µ (or p) in mind. Is the data consistent with that particular value?"

3 Point Estimates (our best guess)

Fundamental to answering both these questions is the notion of a point estimate. A point estimate takes the observed data and produces a single value (or guess) of the parameter. Returning to our example where we had 1000 Bernoulli trials and observed 800 successes, we have already established that we are not pleased with p = 0.01. If we had to guess a single number, what would we guess? The most common choice is p̂, the sample proportion of successes, which in this example is 800/1000 = 0.8. This guess is justified by the central limit theorem result, which states that the expected value of p̂ is p. While p̂ may not be equal to p in any particular situation, p̂ has a distribution that is centered around the true value. Thus, if we are estimating a proportion p, we estimate it with p̂. For a mean µ, use X̄. These extend to the two-sample case, so the difference of two proportions p_X - p_Y is estimated by p̂_X - p̂_Y, and the difference of two means µ_X - µ_Y is estimated by X̄ - Ȳ. Not coincidentally, the center of the distributions of all these guesses is the quantity we are trying to guess.

4 Confidence Intervals

Our best guess is a good start for inference, but it isn't ideal. Specifically, our best guess is basically guaranteed to be wrong. If X_1, ..., X_n ~ N(0,1), then X̄ ~ N(0, 1/n), which is a continuous distribution. Although the distribution of X̄ is centered at µ = 0, the probability that X̄ will exactly equal 0 is 0. OK, that doesn't sound great, but it's not terrible. While X̄ might not be exactly correct, its key advantage is that it should be close to µ, and the larger the sample size, the closer to µ it should be (this can be observed by noting that the variance of X̄, σ²/n, tends to 0 as n increases). In fact, the central limit theorem allows us to quantify just how close our point estimate should be to the correct answer. In general, a confidence interval is

(best guess) ± z_{α/2} × (standard deviation of the best guess)

Thus, for each situation, the only thing to do is find the best guess, and then use the central limit theorem to compute the standard deviation of that best guess.
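As an illustrative sketch (not from the handout; the population values µ = 5, σ = 2 and the sample size are made up), a simulation shows that intervals of the form X̄ ± 1.96 σ/√n cover the true mean in roughly 95% of repeated samples:

```python
import random

random.seed(0)

mu, sigma, n = 5.0, 2.0, 50   # hypothetical population and sample size
z = 1.96                      # z_{alpha/2} for a 95% interval
trials = 2000

covered = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    half_width = z * sigma / n ** 0.5      # sigma treated as known here
    if xbar - half_width <= mu <= xbar + half_width:
        covered += 1

print(covered / trials)  # close to 0.95
```

Any single interval either contains µ or it does not; the 95% refers to the long-run fraction of intervals, over repeated experiments, that succeed.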

4.1 Formulas

4.1.1 Single Proportion

We have X_1, ..., X_n ~ Bern(p). The best guess of p is p̂. Looking at the central limit theorem result, the variance of p̂ is p(1-p)/n. This is an obvious difficulty, since p is unknown (it is what we are trying to estimate!). However, it turns out that p̂ is a sufficiently good guess of p that we can replace p with p̂ in the variance, resulting in the confidence interval

p̂ ± z_{α/2} √(p̂(1-p̂)/n)

4.1.2 Single Mean

We have X_1, ..., X_n iid with mean µ = E[X_i] and variance σ² = V[X_i]. The best guess of µ is X̄, which has variance σ²/n, resulting in the interval

X̄ ± z_{α/2} √(σ²/n)

If σ² is unknown, then it must be estimated from the data. It turns out that s², defined as

s² = (1/(n-1)) Σ (X_i - X̄)²

is a reasonable guess of σ², and thus should be used in place of σ² when necessary.

4.1.3 Difference between two proportions

We have X_1, ..., X_{n_X} ~ Bern(p_X) and Y_1, ..., Y_{n_Y} ~ Bern(p_Y), and are interested in estimating p_X - p_Y. The best guess of p_X - p_Y is p̂_X - p̂_Y. The variance of this best guess depends on the unknown quantities p_X and p_Y, but as with a single proportion these can be replaced with p̂_X and p̂_Y in the variance, resulting in the interval

(p̂_X - p̂_Y) ± z_{α/2} √(p̂_X(1-p̂_X)/n_X + p̂_Y(1-p̂_Y)/n_Y)

4.1.4 Difference between two means

We have X_1, ..., X_{n_X} iid with mean µ_X and variance σ²_X, and Y_1, ..., Y_{n_Y} iid with mean µ_Y and variance σ²_Y, and are interested in estimating µ_X - µ_Y. The best guess of µ_X - µ_Y is X̄ - Ȳ. Using the central limit theorem to find the variance of this best guess, we find the confidence interval is

(X̄ - Ȳ) ± z_{α/2} √(σ²_X/n_X + σ²_Y/n_Y)

As with estimating a single mean, replace σ²_X with s²_X and σ²_Y with s²_Y as necessary.
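The one-sample intervals of Sections 4.1.1 and 4.1.2 take only a few lines to compute; this sketch is not part of the handout, and the counts below reuse the 800-of-1000 example while the data list is hypothetical.

```python
import math

Z_95 = 1.96  # z_{alpha/2} for a 95% interval

def proportion_ci(successes, n, z=Z_95):
    """p-hat +/- z * sqrt(p-hat(1-p-hat)/n), the single-proportion interval."""
    p_hat = successes / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

def mean_ci(data, z=Z_95):
    """x-bar +/- z * sqrt(s^2/n), estimating sigma^2 by s^2 with the
    n-1 divisor as in Section 4.1.2."""
    n = len(data)
    xbar = sum(data) / n
    s2 = sum((x - xbar) ** 2 for x in data) / (n - 1)
    half = z * math.sqrt(s2 / n)
    return xbar - half, xbar + half

lo, hi = proportion_ci(800, 1000)
print(round(lo, 3), round(hi, 3))  # 0.775 0.825
```

So with 800 successes in 1000 trials, values of p outside roughly (0.775, 0.825) are implausible at the 95% level, which formalizes the earlier rejection of p = 0.01.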

5 Hypothesis Tests

When we have a specific value of the parameter in mind and want to verify whether that parameter value is reasonable for the data, we use a hypothesis test. The specific value of the parameter we have in mind is recorded in the null hypothesis, H0, which might state p = 0.2, or µ = 5, or µ_X - µ_Y = 3. The point is that a specific value of the parameter is chosen. A hypothesis test is conducted by observing the difference between our best guess of the parameter and the null value (the value specified in the null hypothesis). This difference must then be standardized. The standardization consists of finding the standard deviation of the best guess under the assumption that H0 is true. This results in the test statistic

z = (best guess - null value) / (standard deviation of the best guess under H0)

The test statistic merely measures how many standard deviations the best guess is from the null value. If the best guess is too far away, the null hypothesis is rejected; otherwise the null hypothesis is accepted. "Too far away" in this context is determined by z_{α/2}. We reject H0 if z > z_{α/2} or if z < -z_{α/2}. Otherwise we do not reject H0. Note that when H0 is true, we have constructed a procedure that rejects H0 with probability α. Thus, we can control the probability of falsely rejecting H0.

5.1 Formulas

5.1.1 Single Proportion

Suppose we have X_1, ..., X_n ~ Bern(p) and are testing H0: p = p0. Our best guess of p is p̂. When H0 is true, p̂ ≈ N(p0, p0(1-p0)/n), thus the test statistic is

z = (p̂ - p0) / √(p0(1-p0)/n)

5.1.2 Single Mean

Suppose we have X_1, ..., X_n iid with mean µ and variance σ², and we are testing H0: µ = µ0. The best guess of µ is X̄, which under the null hypothesis is distributed N(µ0, σ²/n). Thus the test statistic is

z = (X̄ - µ0) / √(σ²/n)

As with confidence intervals, replace σ² with s² if the variance is unknown.
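A minimal sketch (not from the handout) of the two one-sample test statistics; the null value p0 = 0.75 paired with the 800-of-1000 data is a hypothetical choice for illustration.

```python
import math

def z_test_proportion(successes, n, p0):
    """Test statistic for H0: p = p0:
    z = (p-hat - p0) / sqrt(p0 (1 - p0) / n)."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def z_test_mean(data, mu0, sigma2=None):
    """Test statistic for H0: mu = mu0; falls back to s^2 when the
    population variance sigma2 is not supplied."""
    n = len(data)
    xbar = sum(data) / n
    if sigma2 is None:
        sigma2 = sum((x - xbar) ** 2 for x in data) / (n - 1)
    return (xbar - mu0) / math.sqrt(sigma2 / n)

z = z_test_proportion(800, 1000, 0.75)  # hypothetical null value p0 = 0.75
print(round(z, 2))    # 3.65
print(abs(z) > 1.96)  # True: reject H0 at alpha = 0.05
```

Note that the proportion test standardizes with p0, not p̂: the standard deviation is computed under the assumption that H0 is true.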

5.1.3 Difference between two proportions

We have X_1, ..., X_{n_X} ~ Bern(p_X) and Y_1, ..., Y_{n_Y} ~ Bern(p_Y), and are interested in testing H0: p_X - p_Y = 0. There is one trick to this. We have to compute the standard deviation of p̂_X - p̂_Y under the assumption that p_X - p_Y = 0. We cannot just plug p̂_X and p̂_Y into the usual variance formula, because it may not be true that p̂_X - p̂_Y = 0; under H0 the two proportions share a single common value, and the variance should be computed using an estimate of that common value. We use

z = (p̂_X - p̂_Y) / √(p̂(1-p̂)(1/n_X + 1/n_Y))

where p̂ is the pooled proportion, the total number of successes in both samples divided by n_X + n_Y.

5.1.4 Difference between two means

We have X_1, ..., X_{n_X} iid with mean µ_X and variance σ²_X, and Y_1, ..., Y_{n_Y} iid with mean µ_Y and variance σ²_Y, and are interested in testing H0: µ_X - µ_Y = d0 against H1: µ_X - µ_Y ≠ d0. The most common instance of this occurs when d0 = 0, when the null hypothesis simplifies to H0: µ_X = µ_Y. Our best guess of µ_X - µ_Y is X̄ - Ȳ, and thus the test statistic is

z = (X̄ - Ȳ - d0) / √(σ²_X/n_X + σ²_Y/n_Y)

where σ²_X and σ²_Y should be replaced with s²_X and s²_Y as necessary.
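The pooled two-proportion test can be sketched as follows (not part of the handout; the success counts are hypothetical):

```python
import math

def z_test_two_proportions(x_succ, n_x, y_succ, n_y):
    """Test statistic for H0: p_X - p_Y = 0, standardized with the
    pooled estimate p-hat = (x_succ + y_succ) / (n_x + n_y)."""
    p_x, p_y = x_succ / n_x, y_succ / n_y
    p_pool = (x_succ + y_succ) / (n_x + n_y)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_x + 1 / n_y))
    return (p_x - p_y) / se

z = z_test_two_proportions(60, 100, 45, 100)  # hypothetical counts
print(round(z, 2))    # 2.12
print(abs(z) > 1.96)  # True: reject H0 at alpha = 0.05
```

Pooling is used only for the test, where H0 forces the two proportions to be equal; the confidence interval of Section 4.1.3 keeps the two sample proportions separate in the variance.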