Detection Theory. Composite tests

Chapter 5: Correction. I had claimed that the above, which is the most general case, was captured by the below. This is not correct, since the argument is limited to the case where C2 − C1 is positive semi-definite. The slides have been corrected.

Chapter 6: UMP – Uniformly most powerful tests. Consider the case where the value of A is unknown, but assume A > 0. UMP: a test that is optimal no matter the value of A (a concept similar to the MVU estimator). Strategy to obtain a UMP test: 1. Design the test as if A were known. 2. Show that the test does not need knowledge of the value of A.

Chapter 6: UMP – Uniformly most powerful tests. Step 1: Design the test as if A were known. With H0: x[n] = w[n] versus H1: x[n] = A + w[n], where w[n] is white Gaussian noise with variance σ², the NP test decides H1 if

L(x) = p(x; H1) / p(x; H0) = exp( −(1/(2σ²)) Σ_n [ (x[n] − A)² − x²[n] ] ) > γ.

Cancel multiplicative constants, remove the exp by taking the logarithm, and cancel the x²[n] terms:

(A/σ²) Σ_n x[n] − N A²/(2σ²) > ln γ.

Manipulate a bit: scale by σ²/(NA) (valid since A > 0) to get: decide H1 if

T(x) = (1/N) Σ_n x[n] > σ² ln γ / (NA) + A/2 = γ'.

The test statistic T(x) does not depend on A. The threshold γ' seems to depend on A, but it does not, as the next step shows.

Chapter 6: UMP – Uniformly most powerful tests. Step 2: Show that the test does not need knowledge of the value of A. Under H0, T(x) ~ N(0, σ²/N), so

P_FA = Pr{ T(x) > γ'; H0 } = Q( γ' / √(σ²/N) ),

which gives γ' = √(σ²/N) Q⁻¹(P_FA). The threshold is set by the chosen P_FA alone and does not depend on A.

Chapter 6: UMP – Uniformly most powerful tests. Compute P_D. Under H1, T(x) ~ N(A, σ²/N), so

P_D = Pr{ T(x) > γ'; H1 } = Q( (γ' − A) / √(σ²/N) ) = Q( Q⁻¹(P_FA) − √(N A²/σ²) ).

The performance depends on A, but the test itself does not.

Chapter 6: UMP – Uniformly most powerful tests. Recap: a test is UMP if, for all possible values of the unknown parameter(s), it maximizes P_D for a given P_FA.
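A minimal numerical sketch of the one-sided test above, assuming the DC-level-in-WGN setup from the slides; the parameter values and variable names are illustrative, not from the lecture:

```python
import numpy as np
from scipy.stats import norm

# Illustrative parameters (assumed, not from the slides)
N, sigma2, A_true, P_FA = 100, 1.0, 0.3, 0.01

# Threshold from P_FA alone: gamma' = sqrt(sigma2/N) * Q^{-1}(P_FA),
# where Q^{-1}(p) = norm.isf(p). No knowledge of A is needed here.
gamma = np.sqrt(sigma2 / N) * norm.isf(P_FA)

# Theoretical P_D = Q( Q^{-1}(P_FA) - sqrt(N A^2 / sigma2) )
P_D_theory = norm.sf(norm.isf(P_FA) - np.sqrt(N * A_true**2 / sigma2))

# Monte Carlo check: T(x) = sample mean, decide H1 if T(x) > gamma'
rng = np.random.default_rng(0)
x = A_true + np.sqrt(sigma2) * rng.standard_normal((10000, N))
P_D_mc = np.mean(x.mean(axis=1) > gamma)

print(f"gamma' = {gamma:.4f}, P_D theory = {P_D_theory:.4f}, P_D MC = {P_D_mc:.4f}")
```

The Monte Carlo estimate should agree with the closed-form P_D, illustrating that the detector itself never uses A even though its performance depends on A.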

Chapter 6: One-sided vs. two-sided. Consider now A < 0. The same steps as before. Step 1: design the test as if A were known; the derivation is identical up to the point where we divide by A. With A < 0 this flips the inequality, so the test becomes: decide H1 if

T(x) = (1/N) Σ_n x[n] < γ'.

Chapter 6: One-sided vs. two-sided. This causes problems, since the test cannot be implemented without knowing the sign of A: For A > 0, decide H1 if T(x) > γ' (a UMP test exists for this one-sided problem). For A < 0, decide H1 if T(x) < γ'. When the sign of A is unknown (the two-sided problem), no UMP test exists. An educated guess is to decide H1 if |T(x)| > γ'', as sketched below. This will turn out to be well motivated by the GLRT, which comes shortly.
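A sketch of the two-sided "educated guess", under the same illustrative setup as before; the factor 2 in the threshold comes from the two tails of T(x) under H0:

```python
import numpy as np
from scipy.stats import norm

N, sigma2, P_FA = 100, 1.0, 0.01
rng = np.random.default_rng(1)

# Two-sided threshold: P_FA = 2 Q( gamma'' / sqrt(sigma2/N) )
gamma2 = np.sqrt(sigma2 / N) * norm.isf(P_FA / 2)

# The same test handles A = +0.3 and A = -0.3: no sign knowledge needed
for A in (+0.3, -0.3):
    x = A + np.sqrt(sigma2) * rng.standard_normal((10000, N))
    P_D = np.mean(np.abs(x.mean(axis=1)) > gamma2)
    print(f"A = {A:+.1f}: P_D = {P_D:.4f}")
```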

Chapter 6: Karlin-Rubin theorem – a sufficient condition for UMP. If the likelihood ratio is monotonically increasing in a statistic T(x), and the hypotheses are one-sided (H0: θ ≤ θ0 versus H1: θ > θ0), then the test "decide H1 if T(x) > γ" is UMP. This follows directly from the Neyman-Pearson theorem.

Chapter 6: Karlin-Rubin theorem – a sufficient condition for UMP. Application: the exponential family,

p(x; θ) = g(θ) h(x) exp( p(θ) T(x) ).

Likelihood ratio:

L(x) = p(x; θ1) / p(x; θ0) = [g(θ1)/g(θ0)] exp( [p(θ1) − p(θ0)] T(x) ),

so if p(θ) is increasing, the log-likelihood ratio is monotonic in T(x) and Karlin-Rubin applies. In our case (DC level), we have p(θ) = θ/σ².
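As a check, here is the DC-level likelihood written out in exponential-family form (a standard manipulation, reconstructed rather than copied from the slides):

```latex
p(\mathbf{x};\theta)
  = \underbrace{(2\pi\sigma^2)^{-N/2}\, e^{-N\theta^2/(2\sigma^2)}}_{g(\theta)}
    \;\underbrace{e^{-\sum_n x^2[n]/(2\sigma^2)}}_{h(\mathbf{x})}
    \;\exp\!\Big(\underbrace{\tfrac{\theta}{\sigma^2}}_{p(\theta)}
               \underbrace{\textstyle\sum_n x[n]}_{T(\mathbf{x})}\Big)
```

Since p(θ) = θ/σ² is increasing in θ, the likelihood ratio is increasing in T(x) = Σ_n x[n], and Karlin-Rubin recovers the one-sided UMP test derived above.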

Chapter 6: Composite testing – the Bayesian approach. With likelihoods containing unknown parameters, we can integrate away the unknowns using a prior:

p(x; Hi) = ∫ p(x | θi; Hi) p(θi) dθi.

A case that is very common and fully tractable is the linear model x = Hθ + w with known matrix H and a Gaussian prior on θ. If the prior is unknown, use a non-informative one (see the estimation theory book).
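A minimal numerical sketch of "integrating away" an unknown DC level A. The Gaussian prior N(0, σ_A²) and all values are assumptions for illustration; the linear-model case on the slides has a closed form, but here plain quadrature makes the marginalization explicit:

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad

N, sigma2, sigma2_A = 20, 1.0, 1.0
rng = np.random.default_rng(2)
x = 0.5 + rng.standard_normal(N)           # data drawn under H1 with A = 0.5

def lik(A):
    # p(x | A) for the DC-level model
    return np.prod(norm.pdf(x, loc=A, scale=np.sqrt(sigma2)))

# p(x; H1) = integral of p(x|A) p(A) dA, with assumed prior p(A) = N(0, sigma2_A)
p1, _ = quad(lambda A: lik(A) * norm.pdf(A, scale=np.sqrt(sigma2_A)), -10, 10)
p0 = lik(0.0)                              # p(x; H0): no signal, A = 0

print(f"Bayesian likelihood ratio = {p1 / p0:.3f}")   # compare with a threshold
```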

Chapter 6: GLRT – finite data records. The generalized likelihood ratio test is a heuristic for finite data records, but can be proven optimal asymptotically in the size of the data record. Decide H1 if

L_G(x) = p(x; θ̂1, H1) / p(x; θ̂0, H0) > γ,

where θ̂1 is the MLE of θ under H1 and θ̂0 is the MLE of θ under H0.

Chapter 6: GLRT – finite data records. Example: non-coherent detection. The GLRT replaces the unknown H with its ML estimate under each hypothesis.

Chapter 6: GLRT – finite data records. Example: the DC level with unknown A. The GLRT is

L_G(x) = p(x; Â, H1) / p(x; H0) > γ.

But from estimation theory, we have that the MLE of A is

Â = (1/N) Σ_n x[n] = x̄.

Thus, taking logs and simplifying gives

2 ln L_G(x) = N Â² / σ².

Thus, decide H1 if |x̄| > γ', which is exactly the educated guess from before.
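A sketch of the resulting GLRT statistic for the DC-level example (illustrative values; the statistic 2 ln L_G(x) = N x̄²/σ² is the simplification just derived):

```python
import numpy as np

N, sigma2 = 100, 1.0
rng = np.random.default_rng(3)
x = 0.3 + np.sqrt(sigma2) * rng.standard_normal(N)   # data under H1

A_mle = x.mean()                  # MLE of A under H1
T = N * A_mle**2 / sigma2         # 2 ln L_G(x) after logs and simplification

# Equivalent form of the test: decide H1 if |x_bar| exceeds a threshold
print(f"2 ln L_G = {T:.2f}, |x_bar| = {abs(A_mle):.3f}")
```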

Chapter 6: GLRT – large data records. "Large" in this case does not mean that we use Szegö's theorem and the Fourier transform; instead, we consider large N with independent measurements. Two assumptions: 1. The signal is weak. This means that A is not enormous, which is reasonable, since otherwise the problem is simple. 2. The MLE attains its asymptotic form, from estimation theory (recalled below).
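Assumption 2 refers to the standard asymptotic result from estimation theory:

```latex
\hat{\theta} \;\xrightarrow{a}\; \mathcal{N}\!\big(\theta,\ \mathbf{I}^{-1}(\theta)\big)
```

That is, for large N with independent measurements the MLE is asymptotically Gaussian, unbiased, and attains the CRLB.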

Chapter 6: GLRT – large data records. Theorem setup. Partition the parameter vector as θ = [θ_r, θ_s], where θ_r (r parameters) is the parameter vector to be detected and differs between H0 and H1, while θ_s collects nuisance parameters that are equal under H0 and H1 (e.g., the noise variance). Hypotheses to test for:

H0: θ_r = θ_r0, θ_s versus H1: θ_r ≠ θ_r0, θ_s.

Definition of the GLRT:

L_G(x) = p(x; θ̂_r1, θ̂_s1, H1) / p(x; θ_r0, θ̂_s0, H0).

Note that the MLEs of θ_s differ under H0 and H1.

Chapter 6: GLRT – large data records. Theorem statement. Asymptotically (large N, weak signal),

2 ln L_G(x) ~ χ²_r under H0 (a chi-squared variable with r degrees of freedom),
2 ln L_G(x) ~ χ'²_r(λ) under H1 (a non-central chi-squared variable with r degrees of freedom),

with non-centrality parameter

λ = (θ_r1 − θ_r0)ᵀ [ I_θrθr(θ_r1, θ_s) − I_θrθs(θ_r1, θ_s) I⁻¹_θsθs(θ_r1, θ_s) I_θsθr(θ_r1, θ_s) ] (θ_r1 − θ_r0),

where θ_r1 is the true value of θ_r under H1, θ_s is the true value of the nuisance parameters (the same under H1 and H0), and I(θ) is the Fisher information matrix. The Fisher information does not depend on H0 or H1; think of it like this: given x, what is the Fisher information for (θ_r, θ_s)?

Chapter 6: GLRT – large data records. With no nuisance parameters, the term I_θrθs I⁻¹_θsθs I_θsθr cancels, leaving λ = (θ_r1 − θ_r0)ᵀ I_θrθr (θ_r1 − θ_r0). Since the Fisher information is positive definite, λ is degraded by the nuisance term. A larger λ separates the two pdfs more, and thus gives a better P_D for a given P_FA. So, not surprisingly, nuisance parameters degrade our detection capability.
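For the DC-level example (no nuisance parameters, r = 1), the non-centrality parameter works out to the familiar deflection, the same quantity √(N A²/σ²) that appeared in the P_D expression earlier:

```latex
\lambda = (\theta_{r1}-\theta_{r0})^T\, \mathbf{I}_{\theta_r\theta_r}\, (\theta_{r1}-\theta_{r0})
        = A \cdot \frac{N}{\sigma^2} \cdot A = \frac{NA^2}{\sigma^2}
```

Here I_θrθr = N/σ² is the Fisher information for the mean of N independent Gaussian samples.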

Chapter 6: GLRT – large data records. Note: even in the no-nuisance case, the test is still difficult to carry out, since it is still given by L_G(x) = p(x; θ̂_r1, θ̂_s1, H1) / p(x; θ_r0, θ̂_s0, H0), and we need to find the MLEs.
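A Monte Carlo sketch checking the asymptotic χ² result for the DC-level GLRT (r = 1, no nuisance parameters; all values illustrative):

```python
import numpy as np
from scipy.stats import chi2, ncx2

N, sigma2, A, trials = 1000, 1.0, 0.1, 5000
rng = np.random.default_rng(4)

def glrt_stat(x):
    return N * x.mean()**2 / sigma2       # 2 ln L_G(x) for the DC-level model

# Under H0: 2 ln L_G ~ chi2(1). Under H1: ~ ncx2(1, lam), lam = N A^2 / sigma2.
T0 = np.array([glrt_stat(np.sqrt(sigma2) * rng.standard_normal(N)) for _ in range(trials)])
T1 = np.array([glrt_stat(A + np.sqrt(sigma2) * rng.standard_normal(N)) for _ in range(trials)])

lam = N * A**2 / sigma2
print(f"H0: MC mean = {T0.mean():.2f}, chi2(1) mean = {chi2(1).mean():.2f}")
print(f"H1: MC mean = {T1.mean():.2f}, ncx2(1, {lam:.0f}) mean = {ncx2(1, lam).mean():.2f}")
```

The empirical means of the GLRT statistic should match the central and non-central chi-squared means (1 and 1 + λ respectively), in line with the theorem.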