Detection and Estimation Theory


ESE 524 Detection and Estimation Theory
Joseph A. O'Sullivan
Samuel C. Sachs Professor
Electronic Systems and Signals Research Laboratory
Electrical and Systems Engineering, Washington University
2 Urbauer Hall, 34-935-473 (Lynda answers), jao@wustl.edu

Other stuff
Website. Problem set. Fridays occasionally? Probably I will miss Feb. 2 and 9; possible make-up days on Feb. 6, 3, 2.

Finite-Dimensional Binary Detection Theory
Problem: given a measurement of a random vector r, decide which of two models it is drawn from.
Source → System → Decision: the source produces H_0 or H_1, the system produces the measurement r, and the decision rule declares H_0 or H_1.
The source produces H_0 or H_1 with prior probabilities P_0 and P_1 = 1 - P_0. These probabilities may not be known.
A random measurement vector results. The transition densities (the system), p_{r|H_i}(R | H_i), are often assumed known.
A decision rule is a partition of the measurement space into two sets Z_0 and Z_1 corresponding to the decisions H_0 and H_1.

Bayes Criterion
Assumptions:
- The source prior probabilities P_0 and P_1 are known.
- The transition densities are known.
- The relative costs of the decision outcomes are known: C_ij is the cost of deciding H_i when H_j is true; C_01 - C_11 > 0 is the relative cost of a miss, and C_10 - C_00 > 0 is the relative cost of a false alarm.
Bayes decision criterion: pick the decision rule to minimize the average risk (cost)
R = C_00 P_0 (1 - P_F) + C_01 P_1 (1 - P_D) + C_10 P_0 P_F + C_11 P_1 P_D
  = (C_01 - C_11) P_1 (1 - P_D) + (C_10 - C_00) P_0 P_F + C_00 P_0 + C_11 P_1
  = C_M P_1 P_M + C_F P_0 P_F + C_00 P_0 + C_11 P_1,
using the simplified notation C_M = C_01 - C_11, C_F = C_10 - C_00, and P_M = 1 - P_D.
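
As a quick numerical check of the algebra above, the four-term risk and the simplified form can be evaluated side by side. This is a minimal sketch; the costs, priors, and operating point below are arbitrary illustrative values, not numbers from the lecture.

% Numerical check: four-term Bayes risk vs. simplified form.
% All numbers below are arbitrary illustrative choices.
C00 = 0; C11 = 0; C01 = 10; C10 = 1;    % costs C_ij of deciding H_i when H_j is true
P0  = 0.7; P1 = 1 - P0;                 % prior probabilities
PF  = 0.05; PD = 0.80;                  % operating point of some fixed decision rule

R_full = C00*P0*(1-PF) + C01*P1*(1-PD) + C10*P0*PF + C11*P1*PD;

CM = C01 - C11;                         % relative cost of a miss
CF = C10 - C00;                         % relative cost of a false alarm
R_simplified = CM*P1*(1-PD) + CF*P0*PF + C00*P0 + C11*P1;

disp([R_full, R_simplified])            % the two values agree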

Results from last lecture
Theorem: the likelihood ratio test is optimal. Corollary: the log-likelihood ratio test is optimal. For proofs, see the last lecture (summary below) and the text.
Dropping the constant terms, the risk to be minimized is
R = C_M P_1 P_M + C_F P_0 P_F.
The measurement space is divided into two regions Z_0 and Z_1 (decide H_0 or H_1). Integrals of the conditional densities determine the error probabilities; each is an integral over the "other" or wrong region:
P_M = ∫_{Z_0} p_{r|H_1}(R | H_1) dR,   P_F = ∫_{Z_1} p_{r|H_0}(R | H_0) dR,
so that
R = C_M P_1 ∫_{Z_0} p_{r|H_1}(R | H_1) dR + C_F P_0 ∫_{Z_1} p_{r|H_0}(R | H_0) dR.
Minimizing over the partition gives the test: decide H_1 when
C_M P_1 p_{r|H_1}(R | H_1) > C_F P_0 p_{r|H_0}(R | H_0),
and decide H_0 otherwise.

Decision Rules

Criterion                               Transition densities   Priors P_0, P_1   Costs C_ij
Bayes: minimize expected risk (cost)    Yes                    Yes               Yes
Minimum probability of error            Yes                    Yes               No
Minimax                                 Yes                    No                Yes
Neyman-Pearson                          Yes                    No                No

R = C_M P_1 P_M + C_F P_0 P_F.
The entire space is divided into two regions Z_0 and Z_1 (decide H_0 or H_1). The integrals of the probability density functions determine the probabilities of error; integrate over the "other" or wrong region:
P_M = ∫_{Z_0} p_{r|H_1}(R | H_1) dR,   P_F = ∫_{Z_1} p_{r|H_0}(R | H_0) dR.

Bayes Criterion: Derivation of the Likelihood Ratio Test
Compare C_M P_1 p_{r|H_1}(R | H_1) and C_F P_0 p_{r|H_0}(R | H_0). There are many equivalent tests, including the likelihood ratio test and the log-likelihood ratio test (LRT and LLRT). Writing ≷ for "decide H_1 if greater, H_0 if smaller":
C_M P_1 p_{r|H_1}(R | H_1) ≷ C_F P_0 p_{r|H_0}(R | H_0)
Λ(R) = p_{r|H_1}(R | H_1) / p_{r|H_0}(R | H_0) ≷ C_F P_0 / (C_M P_1) = η
ln Λ(R) ≷ ln[ C_F P_0 / (C_M P_1) ] = ln η = γ
η = C_F P_0 / (C_M P_1) = (C_10 - C_00) P_0 / [ (C_01 - C_11) P_1 ]
The likelihood ratio is a nonnegative-valued function of the observed data. Evaluated at a random realization of the data, the likelihood ratio is a random variable. Comparing the likelihood ratio to a threshold is the optimal test. The probabilities of error can be computed in terms of the probability density functions of either the likelihood ratio or the log-likelihood ratio.
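
The sketch below illustrates the equivalence of the LRT and LLRT decisions on simulated data. The two Gaussian densities with different means, and all numerical values, are assumptions chosen for illustration; they are not specified on this slide.

% LRT vs. LLRT on simulated data (illustrative Gaussian mean-shift example;
% all models and numerical values are assumptions, not from the slide).
P0 = 0.6; P1 = 1 - P0; CM = 2; CF = 1;  % priors and relative costs
eta = CF*P0/(CM*P1);                    % Bayes threshold
m0 = 0; m1 = 1; sig = 1;                % N(m0,sig^2) under H0, N(m1,sig^2) under H1
gauss = @(x,m) exp(-(x-m).^2/(2*sig^2))/sqrt(2*pi*sig^2);

R = m0 + sig*randn(1,1e5);              % data drawn under H0
lam = gauss(R,m1)./gauss(R,m0);                     % likelihood ratio
llr = (m1-m0)/sig^2*R - (m1^2-m0^2)/(2*sig^2);      % log-likelihood ratio, closed form

dec_LRT  = lam > eta;                   % decide H1 when the ratio exceeds eta
dec_LLRT = llr > log(eta);              % decide H1 when the log-ratio exceeds log(eta)
isequal(dec_LRT, dec_LLRT)              % the two tests make identical decisions
PF_hat = mean(dec_LRT)                  % empirical false-alarm probability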

Optimal Decision Rules
Likelihood ratio test, with the Bayes threshold
η = C_F P_0 / (C_M P_1) = (C_10 - C_00) P_0 / [ (C_01 - C_11) P_1 ]:
Λ(R) = p_{r|H_1}(R | H_1) / p_{r|H_0}(R | H_0) ≷ η,   l(R) = ln Λ(R) ≷ ln η = γ.
Computation of the error probabilities in terms of the density of the likelihood ratio Λ, or of the log-likelihood ratio L = l(R):
P_M = ∫_0^η p_{Λ|H_1}(Λ | H_1) dΛ,   P_F = ∫_η^∞ p_{Λ|H_0}(Λ | H_0) dΛ,
P_M = ∫_{-∞}^{ln η} p_{L|H_1}(L | H_1) dL,   P_F = ∫_{ln η}^∞ p_{L|H_0}(L | H_0) dL.
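
For the same assumed Gaussian mean-shift example, the log-likelihood ratio is itself Gaussian under each hypothesis, so these integrals have closed forms. The sketch below, again an illustration rather than anything taken from the slide, evaluates P_F and P_M with the complementary error function.

% Closed-form P_F and P_M for the assumed Gaussian mean-shift example, where
% l(R) = (m1-m0)/sig^2*R - (m1^2-m0^2)/(2*sig^2) is Gaussian under each hypothesis.
m0 = 0; m1 = 1; sig = 1; eta = 1.5; gam = log(eta);
d2 = (m1-m0)^2/sig^2;                   % deflection (SNR-like quantity)
% Under H0: l ~ N(-d2/2, d2).  Under H1: l ~ N(+d2/2, d2).
Q  = @(x) 0.5*erfc(x/sqrt(2));          % Gaussian tail probability
PF = Q( (gam + d2/2)/sqrt(d2) )         % P( l > gamma | H0 )
PM = 1 - Q( (gam - d2/2)/sqrt(d2) )     % P( l < gamma | H1 )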

Minimum Probability of Error
With C_00 = C_11 = 0 and C_M = C_F = 1, the Bayes risk reduces to the probability of error
P_e = P_0 P_F + P_1 P_M,
and the threshold becomes η = P_0 / P_1. The test is still a likelihood ratio (or log-likelihood ratio) test.

Minimum Probability of Error
Alternative view of the decision rule: compute the posterior probabilities of the hypotheses and select the most likely hypothesis. Starting from the likelihood ratio test with threshold P_0/P_1:
p_{r|H_1}(R | H_1) / p_{r|H_0}(R | H_0) ≷ P_0 / P_1
p_{r|H_1}(R | H_1) P_1 ≷ p_{r|H_0}(R | H_0) P_0
p_{r|H_1}(R | H_1) P_1 / g(R) ≷ p_{r|H_0}(R | H_0) P_0 / g(R)   for any g(R) > 0.
Choosing g(R) = p_r(R) = p_{r|H_0}(R | H_0) P_0 + p_{r|H_1}(R | H_1) P_1 gives the posterior comparison
P(H_1 | R) ≷ P(H_0 | R).
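
A small sketch of this posterior (MAP) view, using the same assumed Gaussian mean-shift example as above; it checks that comparing posteriors reproduces the likelihood-ratio decisions with threshold P_0/P_1.

% MAP rule via posteriors vs. LRT with threshold P0/P1
% (same assumed Gaussian example; all values are illustrative).
P0 = 0.6; P1 = 1 - P0; m0 = 0; m1 = 1; sig = 1;
gauss = @(x,m) exp(-(x-m).^2/(2*sig^2))/sqrt(2*pi*sig^2);

R = m1 + sig*randn(1,1e5);              % data drawn under H1
pR = gauss(R,m0)*P0 + gauss(R,m1)*P1;   % mixture density g(R) = p_r(R)
post1 = gauss(R,m1)*P1./pR;             % P(H1 | R)
post0 = gauss(R,m0)*P0./pR;             % P(H0 | R)

dec_MAP = post1 > post0;                % select the most likely hypothesis
dec_LRT = gauss(R,m1)./gauss(R,m0) > P0/P1;
isequal(dec_MAP, dec_LRT)               % identical decisions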

Gaussian Example
Signal plus noise, or noise only:
H_0: r_n = w_n,          n = 1, 2, ..., N
H_1: r_n = s_n + w_n,    n = 1, 2, ..., N,
where the w_n are i.i.d. N(0, σ²) and the signal samples s_n are known.
Assume the costs and prior probabilities are known. Find the optimal decision rule.
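
The conclusion stated on the next slide follows from a short log-likelihood ratio calculation, sketched here for completeness (a standard step the slide leaves implicit):
l(R) = ln[ p_{r|H_1}(R | H_1) / p_{r|H_0}(R | H_0) ] = Σ_n [ -(R_n - s_n)² + R_n² ] / (2σ²)
     = (1/σ²) Σ_n s_n R_n - (1/(2σ²)) Σ_n s_n².
Since the second term is a known constant, the test reduces to comparing the correlation statistic T(R) = Σ_n s_n R_n to a threshold.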

Gaussian Example (continued)
Signal plus noise, or noise only:
H_0: r_n = w_n,          n = 1, 2, ..., N
H_1: r_n = s_n + w_n,    n = 1, 2, ..., N,
where the w_n are i.i.d. N(0, σ²) and the signal samples s_n are known. Assume the costs and prior probabilities are known. Find the optimal decision rule.
The optimal decision rule is equivalent to:
- compute the correlation between the signal s_n and the data R_n;
- compare it to a threshold.
The sufficient statistic is also called (for this case) a matched filter. Performance is determined by the signal-to-noise ratio (SNR), the ratio of the signal energy to the noise power.
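
A minimal simulation of this correlation (matched-filter) detector; the particular signal, noise level, and threshold below are arbitrary choices for illustration, not values from the lecture.

% Correlation / matched-filter detector for a known signal in white
% Gaussian noise (all numerical values are illustrative assumptions).
N = 32; sigma = 1;
s = cos(2*pi*(0:N-1)'/8);               % known signal samples s_n
Es = s'*s;                              % signal energy
gam = Es/2;                             % example threshold (minimum-P_e rule, equal priors)

trials = 1e4;
W  = sigma*randn(N,trials);             % noise-only data (H0)
R1 = s + sigma*randn(N,trials);         % signal-plus-noise data (H1)

T0 = s'*W;                              % correlation statistic under H0
T1 = s'*R1;                             % correlation statistic under H1
PF_hat = mean(T0 > gam)                 % empirical false-alarm probability
PD_hat = mean(T1 > gam)                 % empirical detection probability
% Theory: T ~ N(0, sigma^2*Es) under H0 and N(Es, sigma^2*Es) under H1,
% so performance depends only on the SNR Es/sigma^2.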

(Somewhat) Practical Example
Given an image, find the parts of the image that are different.
Model: Gaussian data under either hypothesis; under H_1 the variance is greater than under H_0.
Example: image data. The background represents the null hypothesis; model it as Gaussian.

(Somewhat) Practical Example
[Figures: Histogram of Image; Histogram of NormSq.]
For each pixel x_ij, compute the normalized squared deviation from the background mean,
normsq_ij = (x_ij - μ)² / (2σ²),
and threshold it: I_ij = x_ij if normsq_ij > γ, and I_ij = μ otherwise (γ = 3 in the code below).

Matlab Code

threshold=3;
im=imread('passengersudsonplaneap.jpg','jpg');
im=sum(double(im),3);                      % collapse RGB to a single intensity image
figure; imagesc(im); colormap gray; axis off
[s1,s2]=size(im);
x=im(1:200,1:800);                         % background patch used to fit the null model
                                           % (exact crop indices garbled in the transcription)
size(x)
x=reshape(x,1,numel(x));
[hx,ix]=hist(x,50);
figure, plot(ix,hx); title('Histogram of Image')
mu=mean(x);                                % background mean
sigma=std(x);                              % background standard deviation
normimage=(im-mu).^2/(2*sigma^2);          % normalized squared deviation for every pixel
normimage2=reshape(normimage,1,numel(normimage));
[hx,ix]=hist(normimage2,50);
figure, plot(ix,hx); title('Histogram of NormSq')
indexx=find(normimage2>threshold);         % pixels declared "different" from the background
im2=reshape(im,1,numel(im));
imagethresh=mu*ones(size(im2));            % default every pixel to the background mean
imagethresh(indexx)=im2(indexx);           % keep the original value at detected pixels
imagethresh=reshape(imagethresh,s1,s2);
figure; imagesc(imagethresh); colormap gray; axis off

Minimax Decision Rule
Find the decision rule that minimizes the maximum Bayes risk over all possible priors:
min over decision rules Z_1 of max over P_1 of R.

Criterion                               Transition densities   Priors P_0, P_1   Costs C_ij
Bayes: minimize expected risk (cost)    Yes                    Yes               Yes
Minimum probability of error            Yes                    Yes               No
Minimax                                 Yes                    No                Yes
Neyman-Pearson                          Yes                    No                No

Minimax Decision Rule: Analysis
For any fixed decision rule, the risk is linear in P_1, so the maximum over P_1 is achieved at an end point. To make that end point as low as possible, the risk should be constant with respect to P_1. To minimize that constant value, the risk should equal the minimum (Bayes) risk at some prior P_1*; at that value of the prior, the best decision rule is a likelihood ratio test.
R = C_M P_1 P_M + C_F P_0 P_F,
P_M = ∫_{Z_0} p_{r|H_1}(R | H_1) dR,   P_F = ∫_{Z_1} p_{r|H_0}(R | H_0) dR.
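
The "constant in P_1" condition can be made explicit; the short derivation below is a standard step the slide leaves implicit. For a fixed decision rule (fixed P_M and P_F), with P_0 = 1 - P_1,
R(P_1) = C_M P_1 P_M + C_F (1 - P_1) P_F,   dR/dP_1 = C_M P_M - C_F P_F.
The risk is independent of P_1 exactly when C_M P_M = C_F P_F, the minimax equation; the minimax rule is the Bayes (likelihood ratio) test designed for the prior P_1* at which this equality holds.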

Minimax Decision Rule: Analysis
The Bayes risk is a concave function of the prior (it always lies below its tangents). The minimax point is achieved either at an end point or at an interior point of the Bayes risk curve where the tangent has zero slope.
[Figure: minimum Bayes risk versus the prior P_1.]

Minimax Decision Rule: Example
Exponential densities under the two hypotheses:
H_0: p_{x|H_0}(X | H_0) = 5 e^{-5X}, X ≥ 0
H_1: p_{x|H_1}(X | H_1) = e^{-X},   X ≥ 0.
The log-likelihood ratio is l = 4X - ln 5, so the test reduces to comparing X to a threshold γ' = (ln η + ln 5)/4. Then
P_F = ∫_{γ'}^∞ 5 e^{-5X} dX = e^{-5γ'},
P_M = ∫_0^{γ'} e^{-X} dX = 1 - e^{-γ'},
P_D = 1 - P_M = e^{-γ'} = P_F^{0.2},
and the minimax equation (with C_00 = C_11 = 0) is C_M P_M = C_F P_F.
[Figures: ROC, P_D versus P_F; Bayes risk versus the prior P_1 with tangent lines.]

Matlab Code

pf=0:.01:1;
pd=pf.^.2;                              % ROC for the exponential example
eta=.2*(pf.^(-.8));                     % eta = dP_D/dP_F, the slope of the ROC
figure
plot(pf,pd); xlabel('P_F'); ylabel('P_D')
cm=10; cf=10;                           % costs (original values garbled in the transcription)
pstar=1./(1+cm*eta/cf);                 % prior P_1 at which each threshold eta is the Bayes rule
riskoptimal=cm*(1-pd).*pstar+cf*pf.*(1-pstar);   % minimum Bayes risk versus the prior
figure
plot(pstar,riskoptimal,'b'), hold on
p=0:.01:1;
% Risk of three fixed rules (fixed ROC operating points) as a function of the prior;
% the exact indices used in the lecture were garbled, so these are representative choices.
r =cm*(1-pd(10))*p+cf*pf(10)*(1-p); plot(p,r,'r'), hold on
r2=cm*(1-pd(20))*p+cf*pf(20)*(1-p); plot(p,r2,'g'), hold on
r3=cm*(1-pd(30))*p+cf*pf(30)*(1-p); plot(p,r3,'c')
xlabel('P_1'); ylabel('Risk')

Neyman-Pearson Decision Rule
Minimize P_M subject to P_F ≤ α.
Variational approach: minimize
F = P_M + η (P_F - α) = ∫_{Z_0} p_{r|H_1}(R | H_1) dR + η [ ∫_{Z_1} p_{r|H_0}(R | H_0) dR - α ].
The upper bound P_F = α is usually achieved. The result is a likelihood ratio test; what is the threshold?

Criterion                               Transition densities   Priors P_0, P_1   Costs C_ij
Bayes: minimize expected risk (cost)    Yes                    Yes               Yes
Minimum probability of error            Yes                    Yes               No
Minimax                                 Yes                    No                Yes
Neyman-Pearson                          Yes                    No                No

Neyman-Pearson Decision Rule (continued)
Minimize P_M subject to P_F ≤ α. Variational approach:
F = P_M + η (P_F - α) = ∫_{Z_0} p_{r|H_1}(R | H_1) dR + η [ ∫_{Z_1} p_{r|H_0}(R | H_0) dR - α ].
The upper bound is usually achieved, and the result is a likelihood ratio test. To find the threshold:
- Plot the ROC, P_D versus P_F, for the family of likelihood ratio tests.
- Draw a vertical line where P_F = α and find the corresponding P_D.
- At that point, the threshold equals the slope of the ROC:
η = dP_D / dP_F = (dP_D/dη) / (dP_F/dη).
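
For the exponential example above, where P_D = P_F^{0.2}, the Neyman-Pearson operating point and threshold can be read off directly. The sketch below works through one assumed value of α; the constraint level is an illustrative choice.

% Neyman-Pearson design for the exponential example, where P_D = P_F^0.2.
% The constraint level alpha is an assumed illustrative value.
alpha  = 0.1;                           % allowed false-alarm probability
PD     = alpha^0.2                      % detection probability at P_F = alpha
eta    = 0.2*alpha^(-0.8)               % LR threshold = slope dP_D/dP_F at that point
gammap = -log(alpha)/5                  % equivalent threshold gamma' on X, since P_F = exp(-5*gamma')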

Summary
Several decision rules were considered. The likelihood (and log-likelihood) ratio test is optimal. The receiver operating characteristic (ROC) plots the probability of detection versus the probability of false alarm with the threshold as a parameter; it describes all possible optimal performance.
- Neyman-Pearson is a point on the ROC (P_F = α).
- Minimax is a point on the ROC (C_F P_F = C_M P_M).
- Minimum probability of error is a point on the ROC (slope η = (1 - P_1)/P_1 = P_0/P_1).