Recall: The Neyman-Pearson Lemma

Neyman-Pearson Lemma: Let $\Theta = \{\theta_0, \theta_1\}$, and let $F_{\theta_0}(x)$ be the cdf of the random vector $X$ under hypothesis $H_0$ and $F_{\theta_1}(x)$ be its cdf under hypothesis $H_1$. Assume that the cdfs $F_{\theta_i}(x)$ have corresponding pdfs or pmfs $f_{\theta_i}(x)$, $i = 0, 1$. Then a test of the form

$$\varphi(x) = \begin{cases} 1, & \text{for } f_{\theta_1}(x) > k\, f_{\theta_0}(x), \\ \gamma, & \text{for } f_{\theta_1}(x) = k\, f_{\theta_0}(x), \\ 0, & \text{for } f_{\theta_1}(x) < k\, f_{\theta_0}(x), \end{cases}$$

for some $k \geq 0$ and some $0 \leq \gamma \leq 1$, is the most powerful test of size $\alpha$ for testing hypothesis $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$.

Choosing the Threshold for the Neyman-Pearson Test

To choose a threshold $k$ and randomization parameter $\gamma$ that produce an N-P test of the above form with the desired size $\alpha$, we note that

$$\alpha = \mathrm{E}_{\theta_0}[\varphi(X)] = P_{\theta_0}(\{f_{\theta_1}(X) > k\, f_{\theta_0}(X)\}) + \gamma\, P_{\theta_0}(\{f_{\theta_1}(X) = k\, f_{\theta_0}(X)\}).$$
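Stated as an algorithm, the lemma's randomized test is short. The following is a minimal Python sketch (not part of the original notes; the callables f0 and f1 and the constants k and gamma are placeholders to be supplied by the user):

    def neyman_pearson_phi(x, f0, f1, k, gamma):
        """Randomized N-P test: probability of deciding H1 given observation x.

        f0, f1 : callables evaluating the pdf (or pmf) under H0 and H1
        k      : likelihood-ratio threshold, k >= 0
        gamma  : randomization probability on the boundary, 0 <= gamma <= 1
        """
        if f1(x) > k * f0(x):
            return 1.0      # decide H1
        elif f1(x) == k * f0(x):
            return gamma    # decide H1 with probability gamma
        else:
            return 0.0      # decide H0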

$$\alpha = P_{\theta_0}(\{f_{\theta_1}(X) > k\, f_{\theta_0}(X)\}) + \gamma\, P_{\theta_0}(\{f_{\theta_1}(X) = k\, f_{\theta_0}(X)\}).$$

If there exists a threshold $k_0$ such that

$$P_{\theta_0}(\{f_{\theta_1}(X) > k_0\, f_{\theta_0}(X)\}) = \alpha,$$

then we can take $k = k_0$ and $\gamma = 0$ (the randomization term is then zero) and achieve a test of size $\alpha$.

If no such $k_0$ exists, we instead find a $k_0$ such that

$$P_{\theta_0}(\{f_{\theta_1}(X) > k_0\, f_{\theta_0}(X)\}) < \alpha < P_{\theta_0}(\{f_{\theta_1}(X) \geq k_0\, f_{\theta_0}(X)\}),$$

i.e., inclusion of the boundary event $\{f_{\theta_1}(X) = k_0\, f_{\theta_0}(X)\}$ results in strict bracketing of $\alpha$. This can only occur when $P_{\theta_0}(\{f_{\theta_1}(X) = k_0\, f_{\theta_0}(X)\}) > 0$. In this case, we select $k = k_0$ such that the bracketing occurs and then solve for $\gamma$ to achieve a size-$\alpha$ test. The resulting value of $\gamma$ is

$$\gamma = \frac{\alpha - P_{\theta_0}(\{f_{\theta_1}(X) > k_0\, f_{\theta_0}(X)\})}{P_{\theta_0}(\{f_{\theta_1}(X) = k_0\, f_{\theta_0}(X)\})}.$$

The Likelihood Ratio Test

We can rewrite the Neyman-Pearson decision rule in terms of the likelihood ratio

$$L(x) = \frac{f_{\theta_1}(x)}{f_{\theta_0}(x)}.$$

The Neyman-Pearson test then takes the form

$$\varphi(x) = \begin{cases} 1, & \text{for } L(x) > k, \\ \gamma, & \text{for } L(x) = k, \\ 0, & \text{for } L(x) < k. \end{cases}$$

If there is a $k_0$ such that

$$P_{\theta_0}(\{L(X) > k_0\}) = \alpha,$$

take $k = k_0$ (and $\gamma = 0$). If not, then find a $k_0$ such that

$$P_{\theta_0}(\{L(X) > k_0\}) < \alpha < P_{\theta_0}(\{L(X) \geq k_0\})$$

and take $k = k_0$ and

$$\gamma = \frac{\alpha - P_{\theta_0}(\{L(X) > k_0\})}{P_{\theta_0}(\{L(X) = k_0\})}.$$
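This bracketing recipe is mechanical enough to code directly. A minimal sketch for a likelihood ratio taking finitely many distinct values (illustrative, not from the original notes):

    import numpy as np

    def np_threshold(lr_values, p0, alpha):
        """Return (k0, gamma) giving a randomized LR test of exact size alpha.

        lr_values : distinct possible values of L(X) (discrete case)
        p0        : P_theta0(L(X) = value) for each listed value
        """
        lr = np.asarray(lr_values, dtype=float)
        p = np.asarray(p0, dtype=float)
        order = np.argsort(lr)                  # candidate thresholds, ascending
        lr, p = lr[order], p[order]
        for i, k0 in enumerate(lr):
            p_gt = p[i + 1:].sum()              # P_theta0(L > k0)
            p_eq = p[i]                         # P_theta0(L = k0)
            if p_gt <= alpha <= p_gt + p_eq:    # bracketing of alpha
                gamma = 0.0 if p_eq == 0 else (alpha - p_gt) / p_eq
                return k0, gamma
        raise ValueError("alpha outside [0, 1]")

    # Example: np_threshold([0.5, 1.0, 2.0, 4.0], [0.4, 0.3, 0.2, 0.1], 0.15)
    # returns (2.0, 0.25), since P0(L > 2) = 0.1 < 0.15 < 0.3 = P0(L >= 2).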

Because $L(X)$ is a function of the random vector $X$, it is itself a scalar random variable, and it takes on only nonnegative values. If $P_{\theta_0}(\{L(X) = k\}) = 0$, then the threshold $k$ achieving false-alarm probability $\alpha$ can be found by solving

$$\alpha = P_{\theta_0}(\{L(X) > k\}) = \int_k^{\infty} f_{L,\theta_0}(l)\, dl$$

for $k$, where $f_{L,\theta_0}(l)$ is the density function of $L(X)$ under $H_0$.

We will find it convenient to use the log-likelihood ratio

$$l(X) = \log L(X).$$

Because $\log(\cdot)$ is a monotonically increasing function on $(0, \infty)$, the most powerful test of size $\alpha$ equivalent to the likelihood ratio test takes the form

$$\varphi(x) = \begin{cases} 1, & \text{for } l(x) > l_0, \\ \gamma, & \text{for } l(x) = l_0, \\ 0, & \text{for } l(x) < l_0, \end{cases}$$

where the threshold is $l_0 = \log k$. Working with $l(X)$ often yields simpler results than working with $L(X)$.

Notation: $L(X) \underset{H_0}{\overset{H_1}{\gtrless}} k$ or $l(X) \underset{H_0}{\overset{H_1}{\gtrless}} l_0$.
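When the distribution of $l(X)$ under $H_0$ is known and continuous, the size equation can be inverted directly with a survival function. A SciPy sketch (the normal law for $l(X)$ is an illustrative assumption; it does hold in Example 1 below, where $l(X)$ is linear in $X$):

    import math
    from scipy.stats import norm

    alpha = 0.05
    m0, s0 = -0.125, 0.5            # illustrative: l(X) ~ N(m0, s0^2) under H0
    l0 = norm(m0, s0).isf(alpha)    # solves P_theta0(l(X) > l0) = alpha
    k = math.exp(l0)                # equivalent threshold on L(X), since l0 = log k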

Example 1

Let $X$ be a Gaussian random variable:

Under $H_0$: $X \sim \mathcal{N}(0, \sigma^2)$.
Under $H_1$: $X \sim \mathcal{N}(\mu, \sigma^2)$, with $\mu > 0$.

Find the most powerful test of size $\alpha$, and determine an expression for the power $\beta$ as a function of $\alpha$, $\mu$, and $\sigma$.

The likelihood ratio is

$$L(X) = \frac{f_1(X)}{f_0(X)} = \frac{\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(X-\mu)^2}{2\sigma^2}\right\}}{\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{X^2}{2\sigma^2}\right\}} = \exp\left\{\frac{2X\mu - \mu^2}{2\sigma^2}\right\} \underset{H_0}{\overset{H_1}{\gtrless}} k.$$

The log-likelihood ratio is

$$l(X) = \frac{2\mu X - \mu^2}{2\sigma^2} \underset{H_0}{\overset{H_1}{\gtrless}} l_0,$$

which (since $\mu > 0$) simplifies to

$$X \underset{H_0}{\overset{H_1}{\gtrless}} \frac{\sigma^2}{\mu}\left(\frac{\mu^2}{2\sigma^2} + l_0\right) = \frac{\mu}{2} + \frac{\sigma^2 l_0}{\mu}.$$

So our test is of the form

$$\varphi(x) = \begin{cases} 1, & \text{for } x > \lambda, \\ 0, & \text{for } x < \lambda, \end{cases} \qquad \text{where } \lambda = \frac{\mu}{2} + \frac{\sigma^2 l_0}{\mu}.$$

(Because $X$ is continuous, $P(X = \lambda) = 0$ and no randomization is needed.)

$$\varphi(x) = \begin{cases} 1, & \text{for } x > \lambda, \\ 0, & \text{for } x < \lambda. \end{cases}$$

Under $H_0$, $X \sim \mathcal{N}(0, \sigma^2)$, so

$$\alpha = \mathrm{E}_{H_0}[\varphi(X)] = \int_{\lambda}^{\infty} f_0(x)\, dx = 1 - \Phi\left(\frac{\lambda}{\sigma}\right).$$

Solving for the $\lambda$ achieving a size-$\alpha$ test:

$$\lambda_\alpha = \sigma\, \Phi^{-1}(1 - \alpha).$$

Under $H_1$, $X \sim \mathcal{N}(\mu, \sigma^2)$, so

$$\beta = \mathrm{E}_{H_1}[\varphi(X)] = \int_{\lambda_\alpha}^{\infty} f_1(x)\, dx = 1 - \Phi\left(\frac{\lambda_\alpha - \mu}{\sigma}\right) = 1 - \Phi\left(\Phi^{-1}(1 - \alpha) - \frac{\mu}{\sigma}\right).$$
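A quick numerical check of these closed forms (the parameter values are illustrative), comparing the analytic size and power with Monte Carlo estimates:

    import numpy as np
    from scipy.stats import norm

    mu, sigma, alpha = 1.0, 2.0, 0.1
    lam = sigma * norm.isf(alpha)          # lambda_alpha = sigma * Phi^{-1}(1 - alpha)
    beta = norm.sf((lam - mu) / sigma)     # 1 - Phi(Phi^{-1}(1 - alpha) - mu/sigma)

    rng = np.random.default_rng(0)
    x0 = rng.normal(0.0, sigma, 200_000)   # observations under H0
    x1 = rng.normal(mu, sigma, 200_000)    # observations under H1
    print(alpha, (x0 > lam).mean())        # empirical false-alarm rate, ~ alpha
    print(beta, (x1 > lam).mean())         # empirical detection rate, ~ beta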

Example 2

The photon count $N$ observed by a laser radar is a Poisson random variable:

Under $H_0$: $N \sim \text{Poisson}(\lambda_0)$.
Under $H_1$: $N \sim \text{Poisson}(\lambda_1)$.

We assume $\lambda_1 > \lambda_0$. We have probability mass functions (pmfs)

$$p_0(n) = \frac{\lambda_0^n e^{-\lambda_0}}{n!}, \qquad p_1(n) = \frac{\lambda_1^n e^{-\lambda_1}}{n!}, \qquad n = 0, 1, 2, \ldots$$

The likelihood ratio is

$$L(N) = \frac{p_1(N)}{p_0(N)} = \frac{\lambda_1^N e^{-\lambda_1}/N!}{\lambda_0^N e^{-\lambda_0}/N!} = \left(\frac{\lambda_1}{\lambda_0}\right)^N e^{-(\lambda_1 - \lambda_0)} \underset{H_0}{\overset{H_1}{\gtrless}} k.$$

The log-likelihood ratio is

$$l(N) = \ln L(N) = N \ln\left(\frac{\lambda_1}{\lambda_0}\right) - (\lambda_1 - \lambda_0) \underset{H_0}{\overset{H_1}{\gtrless}} l_0 = \ln k.$$

Hence, since $\lambda_1 > \lambda_0$ implies $\ln(\lambda_1/\lambda_0) > 0$, we can express the test in the form

$$N \underset{H_0}{\overset{H_1}{\gtrless}} \frac{l_0 + (\lambda_1 - \lambda_0)}{\ln(\lambda_1/\lambda_0)} = \eta.$$

Because $N$ is discrete, $P_0(\{N = \eta\})$ may be positive, so randomization is needed. The most powerful test of size $\alpha$ is

$$\varphi(n) = \begin{cases} 1, & \text{for } n > \eta, \\ \gamma, & \text{for } n = \eta, \\ 0, & \text{for } n < \eta. \end{cases}$$

The power $\beta$ of this size-$\alpha$ test is

$$\beta = \mathrm{E}_{H_1}[\varphi(N)] = P_1(\{N > \eta_\alpha\}) + \gamma_\alpha\, P_1(\{N = \eta_\alpha\}) = 1 - \sum_{n=0}^{\eta_\alpha} \frac{\lambda_1^n e^{-\lambda_1}}{n!} + \gamma_\alpha\, \frac{\lambda_1^{\eta_\alpha} e^{-\lambda_1}}{\eta_\alpha!},$$

where the integer threshold $\eta_\alpha$ and randomization parameter $\gamma_\alpha$ are chosen to achieve size $\alpha$.
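The whole Poisson recipe (integer threshold, randomization, power) can be carried out numerically. A SciPy sketch with illustrative values of $\lambda_0$, $\lambda_1$, and $\alpha$:

    from scipy.stats import poisson

    lam0, lam1, alpha = 5.0, 9.0, 0.05
    eta = 0                                # smallest integer with P0(N > eta) <= alpha
    while poisson.sf(eta, lam0) > alpha:
        eta += 1
    p_gt = poisson.sf(eta, lam0)           # P0(N > eta) <= alpha
    p_eq = poisson.pmf(eta, lam0)          # P0(N = eta); bracketing: alpha <= p_gt + p_eq
    gamma = (alpha - p_gt) / p_eq          # randomization giving exact size alpha
    beta = poisson.sf(eta, lam1) + gamma * poisson.pmf(eta, lam1)
    print(eta, gamma, beta)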

The Receiver Operating Characteristic (ROC)

The performance of a binary test is characterized by the pair $(\alpha, \beta)$. For a likelihood ratio test, we achieve different pairs $(\alpha(k), \beta(k))$ for different threshold values $k$. The locus of points $\{(\alpha(k), \beta(k)) : k \in (0, \infty)\}$ specifies all achievable pairs $(\alpha, \beta)$ that can be obtained by varying the threshold $k$. Such a curve is called a receiver operating characteristic (ROC).

[Figure: a typical ROC, plotting $\beta = P_D$ against $\alpha = P_{FA}$; the curve bows above the chance line $\beta = \alpha$.]

Properties of the ROCs of Likelihood Ratio Tests

1. All continuous likelihood ratio tests have ROCs that are convex downward (concave).

2. Points on the chance line ($\beta = \alpha$) can be achieved without observing any data by picking hypothesis $H_1$ at random with probability $\alpha$ (i.e., flipping a biased coin with probability $\alpha$ of coming up heads).
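A sketch tracing the ROC of the Gaussian detector from Example 1 by sweeping the threshold (parameters illustrative); each threshold value gives one operating point $(\alpha, \beta)$:

    import numpy as np
    from scipy.stats import norm

    mu, sigma = 1.0, 1.0
    lam = np.linspace(-5.0, 8.0, 400)      # sweep of thresholds on X (equivalently, of k)
    alpha = norm.sf(lam / sigma)           # P_FA = P0(X > lambda), H0: N(0, sigma^2)
    beta = norm.sf((lam - mu) / sigma)     # P_D  = P1(X > lambda), H1: N(mu, sigma^2)
    # (alpha, beta) traces the ROC from (1, 1) down to (0, 0); plotting beta
    # against alpha shows a curve bowed above the chance line beta = alpha.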

Properties of the ROCs of Likelihood Ratio Tests (continued)

3. All continuous likelihood ratio tests have ROCs that lie above the chance line. This is a consequence of property 1, because the points $(\alpha, \beta) = (0, 0)$ and $(\alpha, \beta) = (1, 1)$ are contained on all ROCs.

4. The slope of the ROC at any particular point is given by the threshold $k$ achieving that operating point, i.e.,

$$\frac{d\beta}{d\alpha} = \frac{d\beta/dk}{d\alpha/dk} = \frac{f_{L,\theta_1}(k)}{f_{L,\theta_0}(k)} = k \geq 0.$$
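Property 4 is easy to check numerically for the Gaussian example: the operating point produced by a threshold $\lambda$ on $X$ corresponds to the LR threshold $k = L(\lambda)$, and a finite-difference slope of the ROC should match it (illustrative parameters):

    import numpy as np
    from scipy.stats import norm

    mu, sigma, lam = 1.0, 1.0, 0.7
    k = np.exp((2 * mu * lam - mu**2) / (2 * sigma**2))  # k = L(lambda) at this point
    h = 1e-5                                             # finite-difference step in lambda
    dalpha = norm.sf((lam + h) / sigma) - norm.sf((lam - h) / sigma)
    dbeta = norm.sf((lam + h - mu) / sigma) - norm.sf((lam - h - mu) / sigma)
    print(dbeta / dalpha, k)                             # ROC slope dbeta/dalpha ~ k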