Quantitative Introduction to Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction to Risk and Uncertainty in Business, Module 5: Hypothesis Testing. M. Vidyasagar, Cecil & Ida Green Chair, The University of Texas at Dallas. Email: M.Vidyasagar@utdallas.edu. October 13, 2012

Outline
1 Hypothesis Testing
2 Hoeffding's Inequalities
3 K-S (Kolmogorov-Smirnov) Tests: Objectives and Statements
4 The t-Test
5 The Chi-Squared Test

Hypothesis Testing: Basic Idea Null hypothesis: what we believe in the absence of further evidence, e.g. that a coin is fair, with heads and tails equally likely. Think: null hypothesis = default assumption. There are two kinds of testing: There is only the null hypothesis, and we accept or reject it. There is a null as well as an alternative hypothesis, and we choose one or the other. The second kind of testing is easier: we choose whichever hypothesis is more likely given the data. The first kind of testing is harder.

Choosing Between Alternatives: Example We are given a coin. The null hypothesis is that the coin is fair with equal probabilities of heads and tails. Call it H 0. The alternative hypothesis is that the coin is biased with the probability of heads equal to 0.7. Call it H 1. Suppose we toss the coin 20 times and 12 heads result. Which hypothesis should we accept?

Choosing Between Alternatives: Example (Cont'd) Let n = 20 (number of coin tosses), k = 12 (number of heads), p_0 = 0.5 (probability of heads under hypothesis H_0) and p_1 = 0.7 (probability of heads under hypothesis H_1). The likelihood of the observed outcome under each hypothesis is

L_0 = (20 choose 12) p_0^12 (1 − p_0)^8 = 0.1201,  L_1 = (20 choose 12) p_1^12 (1 − p_1)^8 = 0.1144.

So we accept hypothesis H_0, that the coin is fair, but only because the alternative hypothesis is even less likely!

Connection to MLE We choose the hypothesis that the coin is fair only because the alternative hypothesis is even more unlikely! So what is the value of p that maximizes

L = (20 choose 12) p^12 (1 − p)^8 ?

Answer: p_MLE = 12/20 = 0.6, the fraction of heads observed. With MLE (maximum likelihood estimation), we need not choose between two competing hypotheses: MLE gives the most likely values of the parameters!
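
A minimal Python sketch of this comparison (n, k, p_0 and p_1 are the values from the example above):

from math import comb

n, k = 20, 12          # tosses and observed heads
p0, p1 = 0.5, 0.7      # probability of heads under H0 and H1

def binom_lik(p, n, k):
    """Likelihood of observing k heads in n tosses when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

L0 = binom_lik(p0, n, k)   # ≈ 0.1201
L1 = binom_lik(p1, n, k)   # ≈ 0.1144
print(L0, L1)              # H0 wins, but only barely

p_mle = k / n              # maximum likelihood estimate of p, here 0.6
print(p_mle)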

Estimating Probabilities of Binary Outcomes Suppose an event has only two outcomes, e.g. a coin toss. Let p be the true but unknown probability of success, e.g. that the coin comes up heads. After n trials, suppose k successes result. Then p̂ := k/n is called the empirical probability of success. As we have seen, it is also the maximum likelihood estimate of p. Question: How close is the empirical probability p̂ to the true but unknown probability p? Hoeffding's inequalities answer this question.

Hoeffding's Inequalities: Statements Let ε > 0 be any specified accuracy. Then

Pr{p̂ − p ≥ ε} ≤ exp(−2nε²),
Pr{p − p̂ ≥ ε} ≤ exp(−2nε²),
Pr{|p̂ − p| ≤ ε} ≥ 1 − 2 exp(−2nε²).

Hoeffding's Inequalities: Interpretation Interpretations of Hoeffding's inequalities: With confidence 1 − 2 exp(−2nε²), we can say that the true but unknown probability p lies in the interval (p̂ − ε, p̂ + ε). As we increase ε, the term δ := 2 exp(−2nε²) decreases, and we can be more sure of our interval. The widely used 95% confidence interval corresponds to δ = 0.05. The one-sided inequalities have similar interpretations.

An Example of Applying Hoeffding's Inequality Suppose we toss a coin 1000 times and it comes up heads 552 times. How sure can we be that the coin is biased? Here n = 1000, k = 552, p̂ = 0.552. If p > 0.5 then we can say that the coin is biased. So let ε = p̂ − 0.5 = 0.052 and compute

δ = exp(−2nε²) ≈ 0.0045.

So with confidence 1 − δ ≈ 0.9955 we can say that p > 0.5. In other words, we can be about 99.55% sure that the coin is biased. Using the two-sided Hoeffding inequality, we can be about 99.1% sure that p ∈ (0.500, 0.604).
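
As a sanity check, here is a small Python sketch reproducing these numbers (the 1000 tosses and 552 heads come from the example above):

from math import exp

n = 1000
p_hat = 552 / n            # empirical probability of heads, 0.552
eps = p_hat - 0.5          # distance from the fair-coin value, 0.052

delta_one_sided = exp(-2 * n * eps**2)       # ≈ 0.0045
delta_two_sided = 2 * exp(-2 * n * eps**2)   # ≈ 0.0090

print(1 - delta_one_sided)   # ≈ 0.9955: confidence that p > 0.5
print(1 - delta_two_sided)   # ≈ 0.9910: confidence that p lies in (0.500, 0.604)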

Another Example An opinion poll of 750 voters (ignoring "don't knows") shows that 387 will vote for candidate A and 363 will vote for candidate B. How sure can we be that candidate A will win? Let p denote the true but unknown fraction of voters who will vote for A, and p̂ = 387/750 = 0.516 its empirical estimate. If p < 0.5 then A will lose. So the accuracy is ε = p̂ − 0.5 = 0.016 and the number of samples is n = 750. The one-sided bound gives

δ = exp(−2nε²) ≈ 0.6811,

so we can be only 1 − δ ≈ 32% sure that A will win. In other words, the election cannot be called with any confidence on the basis of such a small margin of preference.
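
The same two lines of arithmetic applied to the poll (variable names are illustrative):

from math import exp

n, votes_A = 750, 387
eps = votes_A / n - 0.5            # 0.016: margin over the 50% threshold
delta = exp(-2 * n * eps**2)       # ≈ 0.681
print(1 - delta)                   # ≈ 0.32: confidence that candidate A leads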

Relating Confidence, Accuracy and Number of Samples For the two-sided Hoeffding inequality, the quantity δ associated with n samples and accuracy ε is given by

δ = 2 exp(−2nε²).

We can turn this around and ask: given an empirical estimate p̂ based on n samples, what accuracy corresponds to a given value of δ? Solving the above equation for ε in terms of δ and n gives

ε(n, δ) = ( (1/(2n)) log(2/δ) )^{1/2}.

So with confidence 1 − δ we can say that the true but unknown probability p lies in the interval [p̂ − ε(n, δ), p̂ + ε(n, δ)].
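
A one-line helper for this inversion (a sketch; the 95% example below is an added illustration, not from the slides):

from math import log, sqrt

def accuracy(n, delta):
    """Half-width of the two-sided Hoeffding interval at confidence 1 - delta."""
    return sqrt(log(2 / delta) / (2 * n))

# e.g. 1000 samples at 95% confidence (delta = 0.05):
eps = accuracy(1000, 0.05)    # ≈ 0.043
print(eps)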

Hoeffding's Inequality for More Than Two Outcomes Suppose a random experiment has more than two possible outcomes, e.g. rolling a six-sided die. Say there are k outcomes, and in n trials the i-th outcome appears n_i times (so that Σ_{i=1}^{k} n_i = n). We can define

p̂_i = n_i / n,  i = 1, ..., k,

and as we have seen, these are the maximum likelihood estimates of the individual probabilities. Question: How good are these estimates?

Hoeffding's Inequality for More Than Two Outcomes (Cont'd) Fact: For any sample size n and any accuracy ε, it is the case that

Pr{ max_i |p̂_i − p_i| > ε } ≤ 2k exp(−2nε²).

So with confidence 1 − 2k exp(−2nε²), we can assert that every empirical probability p̂_i is within ε of its true value.

More Than Two Outcomes: Example Suppose we roll a six-sided die 1,000 times and the outcomes 1 through 6 occur with the following empirical frequencies: p̂_1 = 0.169, p̂_2 = 0.165, p̂_3 = 0.166, p̂_4 = 0.165, p̂_5 = 0.167, p̂_6 = 0.168. With what confidence can we say that the die is not fair, that is, that p_i ≠ 1/6 for some i?

More Than Two Outcomes: Example (Cont'd) Suppose that in fact the true probability is p_i = 1/6 for all i. Then

max_i |p̂_i − p_i| = |p̂_1 − 1/6| ≈ 0.0023.

Take ε = 0.0023, n = 1000 and compute

δ = 2k exp(−2nε²) = 12 exp(−2nε²) ≈ 11.87 (!)

How can a probability be greater than one? Note: this δ is just an upper bound for Pr{max_i |p̂_i − p_i| > ε}, so it can be larger than one. So we cannot rule out the possibility that the die is fair (which is quite different from saying that it is fair).
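
A short sketch of this bound for the die example (the frequencies are those listed above):

from math import exp

n = 1000
p_hat = [0.169, 0.165, 0.166, 0.165, 0.167, 0.168]   # empirical frequencies
k = len(p_hat)

eps = max(abs(p - 1/6) for p in p_hat)    # ≈ 0.0023: worst deviation from fairness
delta = 2 * k * exp(-2 * n * eps**2)      # ≈ 11.87: an upper bound on a probability
print(eps, min(delta, 1.0))               # the bound is vacuous, so fairness cannot be ruled out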

K-S Tests: Problem Formulations There are two widely used tests. They should be called the Kolmogorov test and the Smirnov test, respectively; unfortunately, the erroneous names one-sample K-S test and two-sample K-S test have become popular. Kolmogorov Test, or One-Sample K-S Test: We have a set of samples, and we have a candidate probability distribution. Question: How well does the distribution fit the set of samples? Smirnov Test, or Two-Sample K-S Test: We have two sets of samples, say x_1, ..., x_n and y_1, ..., y_m. Question: How sure are we that both sets of samples came from the same (but unknown) distribution?

Empirical Distributions Suppose X is a random variable for which we have generated n i.i.d. samples, call them x_1, ..., x_n. Then we define the empirical distribution of X, based on these observations, as follows:

Φ̂(a) = (1/n) Σ_{i=1}^{n} I{x_i ≤ a},

where I{·} denotes the indicator function: I{x_i ≤ a} = 1 if x_i ≤ a and 0 otherwise. So Φ̂(a) is just the fraction of the n samples that are ≤ a. The diagram on the next slide illustrates this.
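
A direct translation of this definition into Python (a sketch; the sample list is an assumed example):

def empirical_cdf(samples):
    """Return Phi_hat, i.e. the fraction of samples that are <= a."""
    n = len(samples)
    def phi_hat(a):
        return sum(1 for x in samples if x <= a) / n
    return phi_hat

phi = empirical_cdf([2.1, 0.4, 1.7, 3.0, 0.9])
print(phi(1.0))   # 0.4: two of the five samples are <= 1.0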

Empirical Distribution Depicted Note: The diagram shows the samples occurring in increasing order, but they can be in any order. (Source: http://www.aiaccess.net/english/glossaries/glosmod/e gm distribution function.htm)

Glivenko-Cantelli Lemma Theorem: As n → ∞, the empirical distribution Φ̂(·) approaches the true distribution Φ(·). Specifically, if we define the Kolmogorov-Smirnov distance

d_n = max_u |Φ̂(u) − Φ(u)|,

then d_n → 0 as n → ∞. At what rate does the convergence take place?

One-Sample Kolmogorov-Smirnov Statistic Fix a confidence level δ > 0 (usually δ is taken as 0.05 or 0.02). Define the threshold

θ(n, δ) = ( (1/(2n)) log(2/δ) )^{1/2}.

Then with probability 1 − δ, we can say that

d_n := max_u |Φ̂(u) − Φ(u)| ≤ θ(n, δ).

One-Sample Kolmogorov-Smirnov Test Given samples x_1, ..., x_n, fit them with some distribution F(·) (e.g. Gaussian). Compute the K-S statistic

d_n = max_u |Φ̂(u) − F(u)|.

Compare d_n with the threshold θ(n, δ). If d_n > θ(n, δ), we reject the null hypothesis at level δ. In other words, if d_n > θ(n, δ), then we are 1 − δ sure that the data was not generated by the distribution F(·).
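
A sketch of this test in Python, using the threshold from the previous slide (the standard-normal candidate and the data list are assumptions for illustration; the supremum is evaluated at the sorted sample points, where it is attained):

from math import log, sqrt, erf

def ks_statistic(samples, F):
    """Sup-distance between the empirical CDF of the samples and the candidate CDF F."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # the empirical CDF jumps from (i-1)/n to i/n at x
        d = max(d, abs(i / n - F(x)), abs((i - 1) / n - F(x)))
    return d

def threshold(n, delta=0.05):
    return sqrt(log(2 / delta) / (2 * n))

F = lambda x: 0.5 * (1 + erf(x / sqrt(2)))            # candidate model: standard normal CDF

data = [0.3, -1.2, 0.8, 1.5, -0.4, 0.1, 2.2, -0.9]    # assumed example data
d_n = ks_statistic(data, F)
print(d_n, threshold(len(data)), d_n > threshold(len(data)))   # reject the fit if True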

The t-Test: Motivation The Student t-test is used to test the null hypothesis that two sets of samples have the same mean, under the assumption that they have the same variance. The test has broad applicability even if the equal-variance assumption is not satisfied. Problem: We are given two sets of samples x_1, ..., x_{m_1} and x_{m_1+1}, ..., x_{m_1+m_2}. Determine whether the two sets of samples arise from distributions with the same mean. Application: Most commonly used in quality control.

The t-Test: Theory Let x̄_1, x̄_2 denote the means of the two sample classes, that is,

x̄_1 = (1/m_1) Σ_{i=1}^{m_1} x_i,  x̄_2 = (1/m_2) Σ_{i=1}^{m_2} x_{m_1+i}.

Let S_1, S_2 denote the unbiased estimates of the standard deviations of the two samples, that is,

S_1² = (1/(m_1 − 1)) Σ_{i=1}^{m_1} (x_i − x̄_1)²,
S_2² = (1/(m_2 − 1)) Σ_{i=1}^{m_2} (x_{m_1+i} − x̄_2)².

The t-Test: Theory (Cont'd) Now define the pooled standard deviation S_12 by

S_12² = ((m_1 − 1) S_1² + (m_2 − 1) S_2²) / (m_1 + m_2 − 2).

Then the quantity

t = (x̄_1 − x̄_2) / ( S_12 √(1/m_1 + 1/m_2) )

follows the t distribution with m_1 + m_2 − 2 degrees of freedom. As the number of degrees of freedom becomes large, the t distribution approaches the normal distribution. The next slide shows the density of the t distribution for various degrees of freedom.
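
A sketch of the pooled two-sample t statistic following the formulas above (the two sample lists are assumed examples; for comparison, scipy.stats.ttest_ind with equal_var=True computes the same statistic):

from statistics import mean, stdev   # stdev uses the (n - 1) denominator, matching S_1 and S_2
from math import sqrt

def pooled_t(x, y):
    """Two-sample t statistic under the equal-variance assumption."""
    m1, m2 = len(x), len(y)
    s1, s2 = stdev(x), stdev(y)
    s12 = sqrt(((m1 - 1) * s1**2 + (m2 - 1) * s2**2) / (m1 + m2 - 2))
    t = (mean(x) - mean(y)) / (s12 * sqrt(1 / m1 + 1 / m2))
    return t, m1 + m2 - 2             # statistic and degrees of freedom

x = [10.1, 9.8, 10.3, 10.0, 9.9]      # assumed example: measurements from two production batches
y = [10.4, 10.6, 10.2, 10.5]
print(pooled_t(x, y))                 # compare |t| with a t-table at m1 + m2 - 2 d.o.f.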

Density of the t Distribution

The Chi-Squared Test: Motivation The t-test is used to determine whether two samples have the same mean; the chi-squared test is used to determine whether two samples have the same variance. The application is again to quality control.

The Chi-Squared Test: Theory Given two sets of samples, say x_1, ..., x_{m_1} and x_{m_1+1}, ..., x_{m_1+m_2} (where usually m_2 ≪ m_1), compute the unbiased variance estimate V_1 of the larger (first) sample,

V_1 = (1/(m_1 − 1)) Σ_{i=1}^{m_1} (x_i − x̄_1)²,

and the sum of squares of the smaller (second) sample,

S² = Σ_{i=1}^{m_2} (x_{m_1+i} − x̄_2)² = (m_2 − 1) V_2.

Then the ratio S²/V_1 follows the chi-squared (χ²) distribution with m_2 − 1 degrees of freedom.

Distribution Function of the Chi-Squared Variable

Density Function of the Chi-Squared Variable

Application of the Chi-Squared Test Note that a χ² random variable is always nonnegative. So, given some level δ (usually δ = 0.05), we determine the acceptance interval

x_l = Φ⁻¹_{χ², m_2−1}(δ),  x_u = Φ⁻¹_{χ², m_2−1}(1 − δ),

where Φ_{χ², m_2−1} denotes the CDF of the χ² distribution with m_2 − 1 degrees of freedom. If the test statistic S²/V_1 lies in the interval [x_l, x_u], then we accept the null hypothesis that both samples have the same variance.
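
A sketch of this acceptance check in Python (the two sample lists are assumed examples; scipy.stats.chi2.ppf supplies the inverse χ² CDF):

from statistics import mean, variance    # variance uses the (n - 1) denominator, matching V_1
from scipy.stats import chi2

def chi_squared_variance_test(x_large, x_small, delta=0.05):
    """Accept 'same variance' if S^2 / V_1 falls inside [x_l, x_u]."""
    V1 = variance(x_large)                               # unbiased variance of the larger sample
    xbar2 = mean(x_small)
    S2 = sum((xi - xbar2)**2 for xi in x_small)          # sum of squares of the smaller sample
    dof = len(x_small) - 1
    stat = S2 / V1
    x_l, x_u = chi2.ppf(delta, dof), chi2.ppf(1 - delta, dof)
    return x_l <= stat <= x_u, stat, (x_l, x_u)

x_large = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 9.7]  # assumed reference batch
x_small = [10.4, 9.6, 10.5, 9.9]                          # assumed new batch
print(chi_squared_variance_test(x_large, x_small))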