A GENERAL CLASS OF LOWER BOUNDS ON THE PROBABILITY OF ERROR IN MULTIPLE HYPOTHESIS TESTING

Tirza Routtenberg and Joseph Tabrikian
Department of Electrical and Computer Engineering
Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
Email: {tirzar,joseph}@ee.bgu.ac.il

ABSTRACT

In this paper, a new class of lower bounds on the probability of error for M-ary hypothesis tests is proposed. Computation of the minimum probability of error, which is attained by the maximum a-posteriori probability (MAP) criterion, is usually not tractable. The new class is derived using Hölder's inequality. The bounds in this class are continuous and differentiable functions of the conditional error probability, and they provide good prediction of the minimum probability of error in multiple hypothesis testing. It is shown that for the binary hypothesis testing problem this bound asymptotically coincides with the optimum probability of error provided by the MAP criterion. The bound is compared with other existing lower bounds in several typical detection and classification problems in terms of tightness and computational complexity.

Index Terms— MAP, detection, lower bounds, hypothesis testing, probability of error

1. INTRODUCTION

Lower bounds on the probability of error are of great importance in system design and performance analysis in many applications, such as signal detection, communications, and classification. It is well known that the minimum probability of error is attained by the maximum a-posteriori probability (MAP) criterion; however, its probability of error is often difficult to calculate and usually not tractable. In such cases, lower bounds on the probability of error are useful for performance analysis, feasibility studies, and system design. These bounds can also be useful for the derivation of analytical expressions for the Ziv-Zakai family of bounds for parameter estimation [1]. One of the difficulties in computation of the Ziv-Zakai bounds is that they involve an expression for the minimum probability of error of a binary hypothesis problem. Analytic expressions for lower bounds on the probability of error may therefore simplify the calculation of the bound.

Several lower bounds on the probability of error have been presented in the literature. The bounds can be divided into binary hypothesis bounds [2, 3] and general bounds for multiple hypotheses [4, 5, 6]. The lower bounds presented in [4] and [7] are based on the Fano [8] and Shannon inequalities, respectively. The relations between entropy and error probability have been used to derive the bounds in [5, 9]. Lower bounds on the Bayes risk which utilize distance measures between statistical distributions [2, 3, 6] can also be used as lower bounds on the probability of error. The lower bounds in [2, 3] are based on the Bhattacharyya distance and have closed-form expressions for many commonly used distributions, but their tightness is unsatisfactory in most cases. Devijver [6] introduced another bound in terms of the Bayesian distance. This bound is tighter than the Bhattacharyya bound and is appropriate also for multiple hypothesis testing.

Practical and useful lower bounds on the probability of error are expected to be computationally simple, tight, and appropriate for general multi-hypothesis problems. In this paper, a new class of lower bounds with the aforementioned desired properties is derived using Hölder's inequality.
The bounds in this class are simpler to compute than the optimum probability of error provided by the MAP criterion, and they provide good prediction of the minimum probability of error in multiple hypothesis testing. The tightest lower bound within this class is derived. It is shown that some existing lower bounds [5] can be derived from this family. In addition, for the binary hypothesis testing problem this bound asymptotically coincides with the optimum probability of error provided by the MAP criterion. The bound is compared with other existing lower bounds.

The paper is organized as follows. The basic idea of the bounding problem is presented in Section 2. A brief review of existing lower bounds on the probability of error is presented in Section 3. The new class of bounds is derived in Section 4. The performance of the proposed bounds is evaluated for various examples in Section 5. Finally, our conclusions appear in Section 6.

2. PROBLEM STATEMENT

Consider an M-ary hypothesis testing problem, in which the hypotheses are $\theta_i$, $i = 1,\dots,M$, with corresponding a-priori probabilities $P(\theta_i)$, $i = 1,\dots,M$, and the random observation vector is x. Let $P(\theta_i|x)$, $i = 1,\dots,M$ denote the conditional probability of $\theta_i$ given x, and let $f(x|\theta_i)$ and $f(x,\theta_i)$ denote the conditional and joint probability density functions (pdf) of x and $\theta_i$, $i = 1,\dots,M$, respectively. The probability of error of the decision problem is denoted by $P_e$. It is well known that the minimum average probability of error, obtained by the MAP criterion, is given by [9]

$$P_e^{\min} = 1 - E\left[\max_{i=1,\dots,M} P(\theta_i|x)\right]. \qquad (1)$$

However, the minimum probability of error in (1) is often difficult to calculate and usually not tractable. Therefore, computable and tight lower bounds on the probability of error are useful for performance analysis and system design.

3. REVIEW OF EXISTING LOWER BOUNDS

In this section, some existing lower bounds on the minimum probability of error are presented. The following bounds have been derived especially for binary hypothesis testing. Most of the binary hypothesis testing bounds are based on divergence measures of the difference between two probability distributions, known as f-divergences or Ali-Silvey distances [10].

In [2], the divergence and two Bhattacharyya-based lower bounds were proposed. The divergence lower bound is

$$B^{(\mathrm{div})} = \frac{1}{8}\, e^{-J/2}, \qquad (2)$$

where $J = E_1[\log L(x)] - E_2[\log L(x)]$, $L(x) = \frac{f_{x|\theta}(x|\theta_1)}{f_{x|\theta}(x|\theta_2)}$ is the likelihood ratio function, and $E_i[\log L(x)] = \int_x \log L(x)\, f_{x|\theta_i}(x|\theta_i)\, dx$, $i = 1, 2$. A simple Bhattacharyya-based lower bound is

$$B_1^{(\mathrm{BLB})} = \frac{E^2\left[\sqrt{P(\theta_1|x)P(\theta_2|x)}\right]}{8\,P(\theta_1)P(\theta_2)}. \qquad (3)$$

This bound is always tighter than the divergence lower bound. The second Bhattacharyya-based bound on $P_e$ is

$$B_2^{(\mathrm{BLB})} = 0.5 - 0.5\sqrt{1 - 4E^2\left[\sqrt{P(\theta_1|x)P(\theta_2|x)}\right]}. \qquad (4)$$

Another f-divergence bound is proposed in [3]:

$$B^{(f)} = \frac{1}{2}\left[1 - \left(1 - 4E\left[P(\theta_1|x)P(\theta_2|x)\right]\right)^{L}\right], \qquad (5)$$

where $L \le 1/2$. For $L = 1/2$, this bound can also be obtained by applying Jensen's inequality to the MAP probability of error. The harmonic lower bound was proposed in [5]:

$$B^{(\mathrm{HLB})} = E\left[P(\theta_1|x)\,P(\theta_2|x)\right]. \qquad (6)$$

The Gaussian-sinusoidal lower bound [11] is given by

$$B^{(\mathrm{Gauss\text{-}sin})} = 0.395\, E\left[\sin\!\left(\pi P(\theta_1|x)\right)\exp\left\{-\alpha\left(P(\theta_1|x)-0.5\right)^2\right\}\right], \qquad (7)$$

where $\alpha = 1.8063$. Although this bound is tight, it is usually not tractable. An arbitrarily tight lower bound [12] is given by

$$B^{(\mathrm{ATLB})} = \frac{1}{\alpha}\, E\left[\log\frac{1+e^{-\alpha}}{e^{-\alpha P(\theta_1|x)}+e^{-\alpha P(\theta_2|x)}}\right] \qquad (8)$$

for any $\alpha > 0$. By selecting high enough values of $\alpha$, this lower bound can be made arbitrarily close to $P_e^{\min}$. However, this bound is, in general, difficult to evaluate.

For multiple hypothesis testing problems, the following lower bounds have been proposed. In [6], Devijver derived the following bounds using the conditional Bayesian distance:

$$B_1^{(\mathrm{Bayes})} = \frac{M-1}{M}\, E\left[1 - \sqrt{\frac{M B_{\theta|x}-1}{M-1}}\right] \qquad (9)$$

and

$$B_2^{(\mathrm{Bayes})} = 1 - E\left[\sqrt{B_{\theta|x}}\right], \qquad (10)$$

where $B_{\theta|x} = \sum_{i=1}^{M} P^2(\theta_i|x)$ stands for the conditional Bayesian distance. In [6], it is analytically shown that for the binary case the Bayesian distance lower bound in (9) is always tighter than the Bhattacharyya bound in (4). The bound in (10) is tighter than the bound [5, 6]

$$B^{(\mathrm{Bayes3})} = 1 - \sqrt{E\left[\sum_{i=1}^{M} P^2(\theta_i|x)\right]}. \qquad (11)$$

The bound

$$B^{(\mathrm{quad})} = \frac{1}{2}\left(1 - E\left[B_{\theta|x}\right]\right) \qquad (12)$$

was proposed in [13] and [14] in the context of Vajda's quadratic entropy and the quadratic mutual information, respectively. In [6], it is claimed that $B^{(\mathrm{quad})} \le B_2^{(\mathrm{Bayes})} \le B_1^{(\mathrm{Bayes})}$. The bound $B^{(\mathrm{quad})}$ can be interpreted as an M-ary extension of the harmonic mean bound presented in (6).
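Most of the binary bounds reviewed above are simple functions of the conditional probability $P(\theta_1|x)$, so they can be compared pointwise. The following Python sketch (not part of the paper) evaluates the harmonic, Gaussian-sinusoidal, and arbitrarily tight bounds of (6)-(8) as reconstructed above, and checks that each stays below the conditional MAP error $\min\{p, 1-p\}$; the constants 0.395 and 1.8063 follow (7).

```python
import numpy as np

def conditional_error(p):
    """Conditional MAP probability of error: min(p, 1 - p)."""
    return np.minimum(p, 1.0 - p)

def harmonic_bound(p):
    """Harmonic lower bound (6): P(theta_1|x) * P(theta_2|x)."""
    return p * (1.0 - p)

def gauss_sin_bound(p, c=0.395, alpha=1.8063):
    """Gaussian-sinusoidal lower bound (7)."""
    return c * np.sin(np.pi * p) * np.exp(-alpha * (p - 0.5) ** 2)

def atlb_bound(p, alpha=5.0):
    """Arbitrarily tight lower bound (8); tightens as alpha grows."""
    return (1.0 / alpha) * np.log((1.0 + np.exp(-alpha)) /
                                  (np.exp(-alpha * p) + np.exp(-alpha * (1.0 - p))))

p = np.linspace(0.01, 0.99, 99)
for bound in (harmonic_bound, gauss_sin_bound, atlb_bound):
    # every reviewed bound must lie below the conditional MAP error
    assert np.all(bound(p) <= conditional_error(p) + 1e-12), bound.__name__
```

Plotting these curves against $\min\{p, 1-p\}$ reproduces the qualitative picture of Fig. 1 below.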

4. A NEW CLASS OF BOUNDS ON THE PROBABILITY OF ERROR

4.1. Derivation of the new class of bounds

Consider an M-ary hypothesis testing problem with detector $\hat\theta = \hat\theta(x)$. Let

$$u(x,\theta) = 1_{\{\hat\theta \ne \theta\}} = \begin{cases} 1, & \hat\theta \ne \theta \\ 0, & \hat\theta = \theta \end{cases}, \qquad (13)$$

where $\theta$ is the true hypothesis and $1_A$ is the indicator function of the event A. Then, according to Hölder's inequality [15], for $p, q \ge 1$ with $\frac{1}{p} + \frac{1}{q} = 1$,

$$E^{\frac{1}{p}}\left[|u(x,\theta)|^p\right]\, E^{\frac{1}{q}}\left[|v(x,\theta)|^q\right] \ge E\left[u(x,\theta)\,v(x,\theta)\right] \qquad (14)$$

for an arbitrary scalar function $v(x,\theta)$. It can be seen that

$$E\left[|u(x,\theta)|^p\right] = E\left[u(x,\theta)\right] = P_e, \qquad (15)$$

where $p \ge 1$. By substituting (15) into (14), one obtains the following lower bound on the probability of error:

$$P_e \ge \frac{E^p\left[u(x,\theta)\,v(x,\theta)\right]}{E^{\frac{p}{q}}\left[|v(x,\theta)|^q\right]}. \qquad (16)$$

Using (13), the expectation term in the numerator of (16) can be rewritten as

$$E\left[u(x,\theta)\,v(x,\theta)\right] = E\big[E\left[u(x,\theta)\,v(x,\theta)\,|\,x\right]\big] = E\left[v(x,\theta)\right] - E\big[P(\hat\theta|x)\,v(x,\hat\theta)\big]. \qquad (17)$$

It can be shown that in order to obtain a valid bound which is independent of the detector $\hat\theta$, $v(x,\theta)$ should be structured as follows:

$$v(x,\theta) = \frac{\zeta(x)}{P(\theta|x)}, \qquad (18)$$

where $\zeta(\cdot)$ is an arbitrary function. With no loss of generality, $\zeta(\cdot)$ can be chosen to be a nonnegative function. By substituting (18) in (17), one obtains

$$E\left[u(x,\theta)\,v(x,\theta)\right] = E\Bigg[\sum_{\substack{i=1 \\ \theta_i \ne \hat\theta}}^{M} \zeta(x)\,\frac{P(\theta_i|x)}{P(\theta_i|x)}\Bigg] = (M-1)\,E\left[\zeta(x)\right]. \qquad (19)$$

Using (18), it can be seen that

$$E\left[|v(x,\theta)|^q\right] = E\left[\zeta^q(x)\,g(x)\right], \qquad (20)$$

where $g(x) = \sum_{i=1}^{M}\left(P(\theta_i|x)\right)^{1-q}$. By substitution of (19) and (20) into (16), the bound can be rewritten as

$$P_e \ge \frac{(M-1)^p\, E^p\left[\zeta(x)\right]}{E^{\frac{p}{q}}\left[\zeta^q(x)\,g(x)\right]}. \qquad (21)$$

Maximization of (21) with respect to (w.r.t.) $\zeta(\cdot)$ results in

$$\zeta(x) = c\, g^{\frac{1}{1-q}}(x), \qquad (22)$$

and by substituting (22) in (21), the attained lower bound is

$$P_e \ge (M-1)^{\frac{q}{q-1}}\, E\left[\left(\sum_{i=1}^{M}\left(P(\theta_i|x)\right)^{1-q}\right)^{\frac{1}{1-q}}\right] \qquad (23)$$

for all $q > 1$.

4.2. Binary hypothesis testing

In the binary hypothesis problem with the hypotheses $\theta_1$ and $\theta_2$, the lower bound in (23) is

$$P_e \ge E\left[\frac{P(\theta_1|x)\,P(\theta_2|x)}{\left(P^{q-1}(\theta_1|x) + P^{q-1}(\theta_2|x)\right)^{\frac{1}{q-1}}}\right]. \qquad (24)$$

4.2.1. Asymptotic properties

It can be seen that the bound in (24) becomes tighter by increasing q, and for $q \to \infty$ the bound is

$$E\left[\min\left\{P(\theta_1|x),\, P(\theta_2|x)\right\}\right] = 1 - E\left[\max_{i=1,2}\left\{P(\theta_i|x)\right\}\right], \qquad (25)$$

which is the optimal bound for the binary hypothesis test, presented in (1).

4.2.2. The harmonic lower bound

For q = 2, the bound can be written in the following simple form:

$$E_x\left[P(\theta_1|x)\,P(\theta_2|x)\right], \qquad (26)$$

which is identical to the harmonic lower bound in (6) and to $B^{(\mathrm{quad})}$ for the binary case in (12). Thus, the binary lower bound in [5] can be interpreted as a special case of our general M-hypothesis bound, presented in (23).

4.2.3. Relation to upper bounds on the minimum probability of error

In [5], an upper bound on the probability of error of the MAP estimator for binary hypothesis testing is derived using the negative power mean inequalities. According to this paper,

$$P_e \le 2^{\frac{1}{q-1}}\, E_x\left[\left(\sum_{i=1}^{2} P^{1-q}(\theta_i|x)\right)^{\frac{1}{1-q}}\right] \qquad (27)$$

for any $q > 1$. It can be seen that this upper bound can be obtained by multiplying the proposed lower bound in (23) by $2^{\frac{1}{q-1}}$. The factor $2^{\frac{1}{q-1}}$ controls the tightness between the upper and lower bounds on the probability of error for binary hypothesis testing.
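The general bound (23) is straightforward to evaluate once the posterior probabilities are available. As a rough numerical illustration (not taken from the paper), the following Python sketch draws synthetic posterior vectors from a Dirichlet distribution — an arbitrary modeling assumption used only to generate test data — and checks that the bound of (23) stays below the exact minimum probability of error (1) and tightens as q grows.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4
# Assumed synthetic model: random posterior vectors P(theta_1|x),...,P(theta_M|x)
# drawn from a flat Dirichlet distribution over the probability simplex.
post = rng.dirichlet(np.ones(M), size=100_000)

def min_prob_error(post):
    """Exact minimum probability of error (1): 1 - E[max_i P(theta_i|x)]."""
    return 1.0 - post.max(axis=1).mean()

def proposed_bound(post, q):
    """Proposed lower bound (23): (M-1)^(q/(q-1)) * E[(sum_i P_i^(1-q))^(1/(1-q))]."""
    M = post.shape[1]
    inner = (post ** (1.0 - q)).sum(axis=1) ** (1.0 / (1.0 - q))
    return (M - 1) ** (q / (q - 1.0)) * inner.mean()

pe_min = min_prob_error(post)
for q in (2.0, 5.0, 20.0):
    b = proposed_bound(post, q)
    print(f"q = {q:5.1f}: bound = {b:.4f}  (minimum P_e = {pe_min:.4f})")
    assert b <= pe_min + 1e-9   # the bound never exceeds the exact minimum error
```

In this experiment the printed bound increases toward the exact minimum error as q grows, in line with the asymptotic property of Section 4.2.1.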

4.2.4. Bounds comparison

Figure 1 depicts the new lower bound for the binary hypothesis problem against the conditional probability $P(\theta_1|x)$, for different values of the parameter q. The new bounds are compared to the bounds $B^{(\mathrm{Gauss\text{-}sin})}$, $B^{(\mathrm{ATLB})}$ with $\alpha = 5$, and $B^{(\mathrm{Bayes3})}$, presented in (7), (8), and (11), respectively. It can be seen that the bound in (24) becomes tighter as q grows, and that for q = 10 the new bound is tighter than the other lower bounds almost everywhere.

Fig. 1. The proposed lower bounds for q = 2, 5, 10 and other existing bounds as a function of the conditional probability $P(\theta_1|x)$ for binary hypothesis testing.

5. EXAMPLES

In this section, two examples are presented to evaluate the performance of the new lower bounds on the minimum probability of error derived in this paper.

5.1. Binary hypothesis problem

Consider the following binary hypothesis testing problem:

$$\theta_1:\; f(x|\theta_1) = \lambda_1 e^{-\lambda_1 x}\, u(x), \qquad \theta_2:\; f(x|\theta_2) = \lambda_2 e^{-\lambda_2 x}\, u(x), \qquad (28)$$

where $u(\cdot)$ denotes the unit step function, with $P(\theta_1) = P(\theta_2) = 1/2$ and $\lambda_2 = 0.5$. For this problem, the proposed bounds with q = 2 and q = 3 can be expressed in closed form in terms of the Gauss hypergeometric function $_2F_1(\cdot,\cdot;\cdot;\cdot)$ [16]. Several bounds on the probability of error and the minimum probability of error obtained by the MAP detector are presented in Fig. 2 as a function of the distribution parameter $\lambda = \lambda_1$. The bounds in this figure are $B_1^{(\mathrm{BLB})}$, $B_2^{(\mathrm{BLB})}$, $B_1^{(\mathrm{Bayes})}$, $B^{(\mathrm{Bayes3})}$, and the new lower bounds with q = 2 and q = 3, presented in (3), (4), (9), (11), and (24), respectively. It can be seen that for $\lambda \ge 0.8$ the proposed bound with q = 3 is tighter than the Bhattacharyya lower bounds and is close to the minimum probability of error obtained by the MAP decision rule. The proposed bound with q = 2 is tighter than the $B_1^{(\mathrm{BLB})}$ lower bound everywhere and tighter than the other bounds in some specific regions. In addition, the upper and lower bounds for q = 2, 3, obtained by (27) and (24), respectively, are presented in Fig. 3 as a function of the distribution parameter $\lambda$.

Fig. 2. Comparison of the different bounds and the exact minimum probability of error as a function of $\lambda$ for two equally-likely exponentially distributed hypotheses.

5.2. Multiple hypothesis problem

Consider the following multiple hypothesis testing problem:

$$\theta_1:\; f(x|\theta_1) = \tfrac{4}{3}\cos^2(x/2)\, e^{-x} u(x), \qquad \theta_2:\; f(x|\theta_2) = 4\sin^2(x/2)\, e^{-x} u(x), \qquad \theta_3:\; f(x|\theta_3) = \tfrac{5}{2}\sin^2(x)\, e^{-x} u(x), \qquad (29)$$

with $P(\theta_1) = 15/28$, $P(\theta_2) = 5/28$, and $P(\theta_3) = 8/28$. In this problem, the exact probability of error of the MAP estimator is difficult to compute, and the bounds $B_1^{(\mathrm{Bayes})}$, $B_2^{(\mathrm{Bayes})}$, $B^{(\mathrm{Bayes3})}$, and $B^{(\mathrm{quad})}$ are not tractable. The proposed bound with q = 2 is computable and is equal to

$$B_{q=2} = \frac{20}{7}\int_0^{\infty} e^{-x}\left[\frac{1}{\cos^2(x/2)} + \frac{1}{\sin^2(x/2)} + \frac{1}{\sin^2(x)}\right]^{-1} dx = \frac{2}{35}\Big[e^{-x}\left(\cos(2x) - 2\sin(2x) - 5\right)\Big]_0^{\infty} = \frac{8}{35} \approx 0.2286.$$
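The q = 2 computation above can be verified numerically. The sketch below (an illustration, not from the paper) integrates the expression for $B_{q=2}$ on a finite grid, assuming the densities and priors exactly as stated in (29); the truncation point and grid size are arbitrary implementation choices.

```python
import numpy as np

# Numerical sanity check of the q = 2 bound for the three-hypothesis example,
# assuming the densities and priors exactly as stated in (29).
x = np.linspace(1e-6, 60.0, 600_001)   # integration grid; truncation at x = 60 is arbitrary
dx = x[1] - x[0]

priors = np.array([15/28, 5/28, 8/28])
f1 = (4/3) * np.cos(x/2)**2 * np.exp(-x)
f2 = 4.0   * np.sin(x/2)**2 * np.exp(-x)
f3 = (5/2) * np.sin(x)**2   * np.exp(-x)
joint = np.vstack([priors[0]*f1, priors[1]*f2, priors[2]*f3])   # P(theta_i) f(x|theta_i)

# B_{q=2} = (M-1)^2 * integral of [ sum_i 1/(P(theta_i) f(x|theta_i)) ]^{-1} dx
integrand = 1.0 / (1.0 / joint).sum(axis=0)
b_q2 = 4.0 * integrand.sum() * dx
print(b_q2, 8/35)   # both are approximately 0.2286
```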

Fig. 3. Comparison of the upper and lower bounds and the exact minimum probability of error as a function of $\lambda$ for two equally-likely exponential distribution hypotheses.

This example demonstrates the simplicity of the proposed bound with q = 2, while the other bounds are intractable.

6. CONCLUSION

In this paper, a new class of lower bounds on the probability of error in multiple hypothesis testing was presented. These new bounds maintain the desirable properties of continuity, differentiability, and symmetry. In the binary case, the proposed class depends on a parameter q, which in the limit of infinity provides the minimum attainable probability of error, provided by the MAP detector. It is shown that this class of bounds generalizes some existing bounds for binary hypothesis tests. It was shown via examples that the proposed bounds outperform other existing bounds in terms of tightness and simplicity of calculation.

7. REFERENCES

[1] D. Chazan, M. Zakai, and J. Ziv, "Improved lower bounds on signal parameter estimation," IEEE Trans. Inform. Theory, vol. IT-21, no. 1, pp. 90-93, 1975.

[2] T. Kailath, "The divergence and Bhattacharyya distance measures in signal selection," IEEE Trans. Commun. Technol., vol. COM-15, no. 1, pp. 52-60, 1967.

[3] J. van Tilburg and D. E. Boekee, "Divergence bounds on key equivocation and error probability in cryptanalysis," Advances in Cryptology - CRYPTO '85, vol. 218, pp. 489-513, 1986.

[4] T. S. Han and S. Verdú, "Generalizing the Fano inequality," IEEE Trans. Inform. Theory, vol. 40, no. 4, pp. 1247-1251, 1994.

[5] N. Santhi and A. Vardy, "On an improvement over Rényi's equivocation bound," Proc. 44th Allerton Conference on Communications, Control and Computing, pp. 1118-1124, 2006.

[6] P. A. Devijver, "On a new class of bounds on Bayes risk in multihypothesis pattern recognition," IEEE Trans. Comput., vol. C-23, pp. 70-80, 1974.

[7] C. E. Shannon, "Certain results in coding theory for noisy channels," Inform. Contr., vol. 1, pp. 6-25, 1957.

[8] R. M. Fano, Class notes for Transmission of Information, course 6.574, MIT, Cambridge, MA, 1952.

[9] M. Feder and N. Merhav, "Relations between entropy and error probability," IEEE Trans. Inform. Theory, vol. 40, no. 1, pp. 259-266, 1994.

[10] H. V. Poor and J. B. Thomas, "Applications of Ali-Silvey distance measures in the design of generalized quantizers for binary decision systems," IEEE Trans. Commun., vol. COM-25, no. 9, pp. 893-900, 1977.

[11] W. A. Hashlamoun, Applications of Distance Measures and Probability of Error Bounds to Distributed Detection Systems, Ph.D. thesis, Syracuse University, Syracuse, NY, 1991.

[12] H. Avi-Itzhak and T. Diep, "Arbitrarily tight upper and lower bounds on the Bayesian probability of error," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 1, pp. 89-91, 1996.

[13] I. Vajda, "Bounds of the minimal error probability on checking a finite or countable number of hypotheses," Inform. Transmis. Problems, vol. 4, pp. 9-19, 1968.

[14] G. T. Toussaint, Feature Evaluation Criteria and Contextual Decoding Algorithms in Statistical Pattern Recognition, Ph.D. thesis, University of British Columbia, Vancouver, Canada, 1972.

[15] G. H. Hardy, J. E. Littlewood, and G. Pólya, Inequalities, 2nd ed., Cambridge Univ. Press, Cambridge, U.K., 1988.

[16] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, Dover, New York, 1964.