ESE 524 Detection and Estimation Theory
Joseph A. O'Sullivan
Samuel C. Sachs Professor
Electronic Systems and Signals Research Laboratory
Electrical and Systems Engineering, Washington University
2 Urbauer Hall, 314-935-4173 (Lynda answers)
jao@wustl.edu
Other Stuff
- Website
- Problem Set
- Fridays occasionally? Probably.
- I will miss Feb. 2, 9.
- Possible make-up days on Feb. 6, 3, 2.
Finite Dimensional Binary Detection Theory
Problem: given a measurement of a random vector r, decide which of two models (hypotheses) it is drawn from.

  Source -> System -> Decision: H_0 or H_1

- The source produces H_0 or H_1 with prior probabilities P_0 and P_1 = 1 - P_0. These probabilities may not be known.
- A random measurement vector results.
- The transition probabilities (system), p_{r|H_i}(R|H_i), are often assumed known.
- A decision rule is a partition of the measurement space into two sets, Z_0 and Z_1, corresponding to decisions H_0 and H_1.
Bayes Criterion
Assumptions:
- Source prior probabilities P_0 and P_1 are known.
- Transition densities are known.
- Relative costs of decision outcomes are known: C_{ij} is the cost of deciding H_i when H_j is true.
  - C_{01} - C_{11} > 0 is the relative cost of a miss.
  - C_{10} - C_{00} > 0 is the relative cost of a false alarm.
Bayes decision criterion: pick the decision rule to minimize the average risk (cost)

  R = C_{00} P_0 (1 - P_F) + C_{01} P_1 (1 - P_D) + C_{10} P_0 P_F + C_{11} P_1 P_D
    = (C_{01} - C_{11}) P_1 (1 - P_D) + (C_{10} - C_{00}) P_0 P_F + C_{11} P_1 + C_{00} P_0
    = C_M P_1 P_M + C_F P_0 P_F + C_{11} P_1 + C_{00} P_0

Simplified notation: C_M = C_{01} - C_{11}, C_F = C_{10} - C_{00}, P_M = 1 - P_D.
Results from Last Lecture
- Theorem: the likelihood ratio test is optimal.
- Corollary: the log-likelihood ratio test is optimal.
- For proofs, see the last lecture (summary below) and the text.

  R = C_M P_1 P_M + C_F P_0 P_F  (plus fixed costs)

- The measurement space is divided into two regions: decide H_0 or decide H_1.
- Integrals of the pdfs determine the probabilities of error; each integrates over the other ("wrong") region:

  P_M = \int_{Z_0} p_{r|H_1}(R|H_1) \, dR,   P_F = \int_{Z_1} p_{r|H_0}(R|H_0) \, dR

- The optimal rule decides H_1 when

  C_M P_1 p_{r|H_1}(R|H_1) > C_F P_0 p_{r|H_0}(R|H_0)

and decides H_0 otherwise.
Decision Rules

  Criterion                                 | Transition densities | Priors P_0 & P_1 | Costs C_{ij}
  Bayes: minimize expected risk (cost)      | Yes                  | Yes              | Yes
  Minimum probability of error              | Yes                  | Yes              | No
  Minimax                                   | Yes                  | No               | Yes
  Neyman-Pearson                            | Yes                  | No               | No

  R = C_M P_1 P_M + C_F P_0 P_F

The entire space is divided into two regions: decide H_0 or decide H_1. The integrals of the probability density functions determine the probabilities of error; integrate over the other ("wrong") region:

  P_M = \int_{Z_0} p_{r|H_1}(R|H_1) \, dR,   P_F = \int_{Z_1} p_{r|H_0}(R|H_0) \, dR
Bayes Criterion: Derivation of the Likelihood Ratio Test
Compare C_M P_1 p_{r|H_1}(R|H_1) and C_F P_0 p_{r|H_0}(R|H_0). There are many equivalent tests, including the likelihood ratio and log-likelihood ratio tests (LRT and LLRT):

  C_M P_1 p_{r|H_1}(R|H_1) \gtrless C_F P_0 p_{r|H_0}(R|H_0)

  \Lambda(R) = \frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)} \gtrless \frac{C_F P_0}{C_M P_1} = \eta

  \ln \Lambda(R) \gtrless \ln \eta = \gamma,   where   \eta = \frac{C_F P_0}{C_M P_1} = \frac{(C_{10} - C_{00}) P_0}{(C_{01} - C_{11}) P_1}

- The likelihood ratio is a nonnegative-valued function of the observed data. Evaluated at a random realization of the data, the likelihood ratio is a random variable.
- Comparing the likelihood ratio to a threshold is the optimal test.
- The probabilities of error can be computed in terms of the probability density functions of either the likelihood ratio or the log-likelihood ratio.
Optimal Decision Rules
Likelihood ratio test and threshold for the Bayes rule:

  \Lambda(R) = \frac{p_{r|H_1}(R|H_1)}{p_{r|H_0}(R|H_0)} \gtrless \eta,   l(R) = \ln \Lambda(R) \gtrless \ln \eta = \gamma,   \eta = \frac{C_F P_0}{C_M P_1} = \frac{(C_{10} - C_{00}) P_0}{(C_{01} - C_{11}) P_1}

Computation of the error probabilities from the densities of \Lambda or of L = \ln \Lambda:

  P_F = \int_{\eta}^{\infty} p_{\Lambda|H_0}(\Lambda|H_0) \, d\Lambda,   P_M = \int_{0}^{\eta} p_{\Lambda|H_1}(\Lambda|H_1) \, d\Lambda

  P_F = \int_{\ln \eta}^{\infty} p_{L|H_0}(L|H_0) \, dL,   P_M = \int_{-\infty}^{\ln \eta} p_{L|H_1}(L|H_1) \, dL
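As a concrete check of these formulas, the error probabilities can be estimated by simulating the log-likelihood ratio under each hypothesis. The sketch below is a minimal example, not from the slides: it assumes a scalar Gaussian shift model (r ~ N(0,1) under H_0, r ~ N(m,1) under H_1) and made-up costs and priors.

  % Minimal Monte Carlo sketch: estimate P_F and P_D for a likelihood
  % ratio test. Assumed model (illustrative, not from the slides):
  % r ~ N(0,1) under H0, r ~ N(m,1) under H1.
  m = 1;                       % assumed signal mean under H1
  C_M = 1; C_F = 1;            % assumed relative costs
  P0 = 0.5; P1 = 0.5;          % assumed priors
  eta = (C_F*P0)/(C_M*P1);     % Bayes threshold
  gamma = log(eta);            % log-likelihood ratio threshold
  K = 1e6;                     % number of Monte Carlo trials
  r0 = randn(K,1);             % data under H0
  r1 = m + randn(K,1);         % data under H1
  % log-likelihood ratio for this Gaussian model: l(r) = m*r - m^2/2
  l0 = m*r0 - m^2/2;
  l1 = m*r1 - m^2/2;
  Pf = mean(l0 > gamma);       % estimated false alarm probability
  Pd = mean(l1 > gamma);       % estimated detection probability
  fprintf('P_F = %.4f, P_D = %.4f\n', Pf, Pd);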
Minimum Probability of Error
With C_M = C_F = 1 (zero cost for correct decisions, unit cost for errors), the Bayes risk reduces to the probability of error:

  P_e = P_0 P_F + P_1 P_M,   \eta = \frac{P_0}{P_1}

The optimal test is still a log-likelihood ratio test.
Minimum Probability of Error
Alternative view of the decision rule: compute the posterior probabilities of the hypotheses and select the most likely hypothesis.

  p_{r|H_1}(R|H_1) P_1 \gtrless p_{r|H_0}(R|H_0) P_0

Dividing both sides by any g(R) > 0 leaves the decision unchanged. In particular, dividing by

  p_r(R) = p_{r|H_0}(R|H_0) P_0 + p_{r|H_1}(R|H_1) P_1

gives the equivalent posterior (MAP) test

  P(H_1|R) \gtrless P(H_0|R).
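A minimal sketch of this posterior view, reusing the assumed scalar Gaussian model from the earlier sketch (m = 1, unit variance, equal priors); the rule picks the hypothesis with the larger posterior, and the marginal p_r(R) plays the role of the positive g(R).

  % Sketch of the MAP rule: compute both posteriors and pick the larger.
  % Assumed model (illustrative, not from the slides).
  m = 1; P0 = 0.5; P1 = 0.5;
  R = 0.8;                            % a single observed sample
  p0 = exp(-R.^2/2)/sqrt(2*pi);       % p(R|H0), N(0,1) density
  p1 = exp(-(R-m).^2/2)/sqrt(2*pi);   % p(R|H1), N(m,1) density
  pR = p0*P0 + p1*P1;                 % marginal p(R): a positive g(R)
  post0 = p0*P0/pR;                   % P(H0|R)
  post1 = p1*P1/pR;                   % P(H1|R)
  decideH1 = post1 > post0;           % same decision as the LRT with eta = P0/P1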
Gaussian Example
Signal plus noise, or just noise:

  H_0: r_n = w_n,           n = 1, 2, ..., N
  H_1: r_n = s_n + w_n,     n = 1, 2, ..., N

with w_n i.i.d. N(0, \sigma^2) and the signal s_n known for each n.
- Assume costs and prior probabilities are known.
- Find the optimal decision rule.
Gaussian Example (continued)
Same model: H_0: r_n = w_n; H_1: r_n = s_n + w_n, n = 1, 2, ..., N, with w_n i.i.d. N(0, \sigma^2) and s_n known. Assume costs and priors are known.
The optimal decision rule is equivalent to:
- Compute the correlation between the signal s_n and the data R_n.
- Compare it to a threshold (see the sketch below).
The sufficient statistic is also called (for this case) a matched filter. Performance is determined by the signal-to-noise ratio (SNR), the ratio of the signal energy to the noise power.
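A minimal matched-filter sketch under stated assumptions: the sinusoidal signal, unit noise variance, and equal costs and priors (so eta = 1) are made-up illustrative values, not from the slides. Expanding the log-likelihood ratio for this model gives l(R) = (1/sigma^2) * sum s_n R_n - E_s/(2 sigma^2), so the test reduces to comparing the correlation to a threshold.

  % Minimal matched-filter sketch (assumed values, not from the slides).
  N = 64; sigma = 1;                  % assumed record length and noise level
  n = (1:N)';
  s = sin(2*pi*n/16);                 % assumed known signal s_n
  Es = s'*s;                          % signal energy
  eta = 1;                            % LRT threshold for equal costs and priors
  gamma = sigma^2*log(eta) + Es/2;    % equivalent threshold on the correlation
  R = s + sigma*randn(N,1);           % simulated data under H1
  T = s'*R;                           % sufficient statistic: correlate s with R
  decideH1 = T > gamma;
  snr = Es/sigma^2;                   % SNR = signal energy / noise power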
(Somewhat) Practical Example
- Given an image, find the parts of the image that are different.
- Model: Gaussian data under either hypothesis. Under H_1, the variance is greater than under H_0.
- Example: image data. The background represents the null hypothesis, modeled as Gaussian.
(Somewhat) Practical Example 8 istogram of Image 6 4 2 8 6 4 2 ( x ) 2 ij μ normsq = 2 2 σ { 3 I, normsq> ij = γ, otherwise 2 6 x 4 istogram of NormSq 5 4 5 2 25 3 35 4 45 5 55 6 5 J. A. O'S. ESE 5 524, 2 Lecture 25 3, 3/2/9 35 4
Matlab Code

  % Note: some digits were lost in extraction; the patch bounds and bin
  % counts below are reconstructed (assumed) values.
  threshold=3;
  im=imread('passengersHudsonplaneap.jpg','jpg');
  im=sum(double(im),3);                    % collapse RGB to one channel
  figure; imagesc(im); colormap gray; axis off
  [s1,s2]=size(im);
  x=im(1:200,1:800);                       % background patch (null hypothesis)
  size(x)
  x=reshape(x,1,200*800);
  [hx,ix]=hist(x,50);
  figure, plot(ix,hx); title('Histogram of Image')
  mu=mean(x); sigma=std(x);                % H0 parameters from the background
  normimage=(im-mu).^2/(2*sigma^2);        % normalized squared deviation
  normimage2=reshape(normimage,1,numel(normimage));
  [hx,ix]=hist(normimage2,50);
  figure, plot(ix,hx); title('Histogram of NormSq')
  [xi,indexx]=find(normimage2>threshold);  % pixels that reject H0
  im2=reshape(im,1,numel(im));
  imagethresh=mu*ones(size(im2));          % background gray everywhere ...
  imagethresh(indexx)=im2(indexx);         % ... except where H1 is declared
  imagethresh=reshape(imagethresh,s1,s2);
  figure; imagesc(imagethresh); colormap gray; axis off
Minimax Decision Rule
Find the decision rule that minimizes the maximum Bayes risk over all possible priors:

  \min_{Z} \max_{P_1} R

  Criterion                                 | Transition densities | Priors P_0 & P_1 | Costs C_{ij}
  Bayes: minimize expected risk (cost)      | Yes                  | Yes              | Yes
  Minimum probability of error              | Yes                  | Yes              | No
  Minimax                                   | Yes                  | No               | Yes
  Neyman-Pearson                            | Yes                  | No               | No
Minimax Decision Rule: Analysis
- For any fixed decision rule, the risk is linear in P_1.
- The maximum over P_1 is achieved at an end point.
- To make that end point as low as possible, the risk should be constant with respect to P_1.
- To minimize that constant value, the risk should achieve the minimum Bayes risk at some P_1^*. At that value of the prior, the best decision rule is a likelihood ratio test.

  R = C_M P_1 P_M + C_F P_0 P_F,   P_M = \int_{Z_0} p_{r|H_1}(R|H_1) \, dR,   P_F = \int_{Z_1} p_{r|H_0}(R|H_0) \, dR
Minimax Decision Rule: Analysis (continued)
- The Bayes risk is concave in the prior (it is always below its tangent).
- The minimax point is achieved at an end point, or at an interior point on the Bayes risk curve where the tangent is horizontal.
[Figure: minimum Bayes risk versus the prior P_1]
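To make the "horizontal tangent" condition explicit (a standard step, filled in here for completeness), write the risk of a fixed test as a function of the prior P_1 and differentiate:

  R(P_1) = C_{00} (1 - P_1)(1 - P_F) + C_{10} (1 - P_1) P_F + C_{01} P_1 P_M + C_{11} P_1 (1 - P_M)

  \frac{dR}{dP_1} = (C_{11} - C_{00}) + (C_{01} - C_{11}) P_M - (C_{10} - C_{00}) P_F

Setting dR/dP_1 = 0 with C_{00} = C_{11} = 0 gives the minimax (equalizer) condition C_M P_M = C_F P_F, the condition quoted in the summary slide.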
Minimax Decision Rule: Example

  H_0: p_{x|H_0}(X|H_0) = 5 e^{-5X},  X \ge 0
  H_1: p_{x|H_1}(X|H_1) = e^{-X},  X \ge 0

The log-likelihood ratio is l(X) = 4X - \ln 5, so the test compares X to a threshold \gamma':

  P_F = \int_{\gamma'}^{\infty} 5 e^{-5X} \, dX = e^{-5\gamma'},   P_D = \int_{\gamma'}^{\infty} e^{-X} \, dX = e^{-\gamma'}

Hence the ROC is P_D = P_F^{0.2} and P_M = 1 - P_F^{0.2}; the minimax point satisfies C_M P_M = C_F P_F.

[Figure: ROC, P_D versus P_F]
[Figure: Risk versus the prior P_1, with risk lines for fixed decision rules]

Matlab Code

  pf=0:.01:1;
  pd=pf.^.2;                         % ROC for this example
  eta=.2*(pf.^(-.8));                % eta = dP_D/dP_F
  figure
  plot(pf,pd); xlabel('P_F'); ylabel('P_D')
  cm=1; cf=1;                        % costs (assumed; original values garbled)
  pstar=1./(1+cm*eta/cf);            % prior P_1 at which each threshold is Bayes-optimal
  riskoptimal=cm*(1-pd).*pstar+cf*pf.*(1-pstar);   % minimum Bayes risk curve
  figure
  plot(pstar,riskoptimal,'b'), hold on
  p=0:.01:1;
  r=cm*(1-pd(1))*p+cf*pf(1)*(1-p);   % risk lines for fixed decision rules
  plot(p,r,'r'), hold on
  r2=cm*(1-pd(2))*p+cf*pf(2)*(1-p);
  plot(p,r2,'g'), hold on
  r3=cm*(1-pd(3))*p+cf*pf(3)*(1-p);
  plot(p,r3,'c')
  xlabel('P_1'); ylabel('Risk')
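For this ROC, the minimax point with the unit costs assumed above solves the equalizer condition 1 - P_F^{0.2} = P_F. A quick numerical check (fzero is standard MATLAB):

  % Numerical check of the minimax point for the ROC P_D = P_F^0.2,
  % with the assumed unit costs C_M = C_F = 1 from the code above:
  % the equalizer condition C_M*P_M = C_F*P_F becomes 1 - P_F^0.2 = P_F.
  pf_minimax = fzero(@(pf) 1 - pf.^0.2 - pf, [0.01 0.99]);
  pd_minimax = pf_minimax^0.2;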
Neyman-Pearson Decision Rule
- Minimize P_M subject to P_F \le \alpha.
- Variational approach; the upper bound is usually achieved.
- The result is a likelihood ratio test; but what is the threshold?

  F = P_M + \eta (P_F - \alpha) = \int_{Z_0} p_{r|H_1}(R|H_1) \, dR + \eta \left[ \int_{Z_1} p_{r|H_0}(R|H_0) \, dR - \alpha \right]

  Criterion                                 | Transition densities | Priors P_0 & P_1 | Costs C_{ij}
  Bayes: minimize expected risk (cost)      | Yes                  | Yes              | Yes
  Minimum probability of error              | Yes                  | Yes              | No
  Minimax                                   | Yes                  | No               | Yes
  Neyman-Pearson                            | Yes                  | No               | No
Neyman-Pearson Decision Rule (continued)
Minimize P_M subject to P_F \le \alpha. To set the threshold:
- Plot the ROC: P_D versus P_F for the family of likelihood ratio tests.
- Draw a vertical line where P_F = \alpha.
- Find the corresponding P_D.
At that point, the threshold equals the derivative (slope) of the ROC:

  \eta = \frac{dP_D / d\eta}{dP_F / d\eta} = \frac{dP_D}{dP_F}
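A sketch of this recipe for the ROC used in the minimax example (P_D = P_F^{0.2}); the constraint level alpha = 0.1 is an assumed illustrative value, not from the slides.

  % Neyman-Pearson operating point on the example ROC P_D = P_F^0.2
  % (alpha is an assumed illustrative value).
  alpha = 0.1;                 % false-alarm constraint P_F <= alpha
  pd_np = alpha^0.2;           % P_D at the NP operating point
  eta_np = 0.2*alpha^(-0.8);   % threshold = ROC slope dP_D/dP_F at P_F = alpha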
Summary
- Several decision rules; the likelihood (and log-likelihood) ratio test is optimal.
- The receiver operating characteristic (ROC) plots the probability of detection versus the probability of false alarm, with the threshold as a parameter; it captures all possible optimal performance.
- Neyman-Pearson is a point on the ROC (P_F = \alpha).
- Minimax is a point on the ROC (P_F C_F = P_M C_M).
- Minimum probability of error is a point on the ROC (slope \eta = (1 - P_1)/P_1 = P_0/P_1).