Applications of Information Geometry to Hypothesis Testing and Signal Detection

CMCAA 2016
Yongqiang Cheng
National University of Defense Technology
July 2016

Outline
1. Principles of Information Geometry
2. Geometry of Hypothesis Testing
3. Matrix CFAR Detection on Manifold of Symmetric Positive-Definite Matrices
4. Geometry of Matrix CFAR Detection

1. Principles of Information Geometry
Important problems in statistics
Distribution (likelihood): $p(x \mid \theta)$, where $x$ is a vector of data and $\theta$ is a vector of unknowns.
1. How much does the data $x$ tell about the unknown $\theta$?
2. How good is an estimator $\hat{\theta}$?
3. How to measure the difference between two distributions?
4. What is the structure of a statistical model specified by a family of distributions?

What is information geometry?
[Diagram relating data, distributions and the manifold to data processing, statistics and information geometry]

Information geometry is the study of intrinsic properties of manifolds of probability distributions by way of differential geometry. The main tenet of information geometry is that many important structures in information theory and statistics can be treated as structures in differential geometry by regarding a space of probabilities as a differentiable manifold endowed with a Riemannian metric and a family of affine connections.
Relationships with other subjects: information theory, statistics, probability theory, physics, differential geometry, systems theory, Riemannian geometry.

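Statistical manifold: $S = \{ p(x \mid \theta) : x \in \mathbb{R}^n,\ \theta \in \Theta \}$
Riemannian metric: $G(\theta) = [g_{ij}(\theta)]$, $g_{ij}(\theta) = E\left[ \frac{\partial \log p(x \mid \theta)}{\partial \theta_i} \frac{\partial \log p(x \mid \theta)}{\partial \theta_j} \right]$
Affine connections: $\Gamma_{ijm}(\theta) = E\left[ \partial_i \partial_j l(x, \theta)\, \partial_m l(x, \theta) \right] + \frac{1 - \alpha}{2} E\left[ \partial_i l(x, \theta)\, \partial_j l(x, \theta)\, \partial_m l(x, \theta) \right]$, where $l(x, \theta) = \log p(x \mid \theta)$
Distance and geodesic: $ds^2 = \sum_{i,j} g_{ij}(\theta)\, d\theta_i\, d\theta_j = d\theta^T G(\theta)\, d\theta$
Curvatures: $R^{l}_{ijk} = \partial_j \Gamma^{l}_{ik} - \partial_k \Gamma^{l}_{ij} + \Gamma^{l}_{js} \Gamma^{s}_{ik} - \Gamma^{l}_{ks} \Gamma^{s}_{ij}$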
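As a rough numerical illustration of the Riemannian metric on this slide, the sketch below estimates the Fisher information matrix of a univariate Gaussian with $\theta = (\mu, \sigma)$ by Monte Carlo and compares it with the known closed form $\mathrm{diag}(1/\sigma^2, 2/\sigma^2)$. The Gaussian example, the function names and the sample size are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

def score_gaussian(x, mu, sigma):
    """Gradient of log p(x; mu, sigma) with respect to (mu, sigma)."""
    d_mu = (x - mu) / sigma**2
    d_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3
    return np.stack([d_mu, d_sigma], axis=-1)

def fisher_monte_carlo(mu, sigma, n_samples=200_000, seed=0):
    """Estimate G(theta) = E[score score^T] from samples of p(x; mu, sigma)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=n_samples)
    s = score_gaussian(x, mu, sigma)
    return s.T @ s / n_samples

def fisher_exact(mu, sigma):
    """Closed-form Fisher information of N(mu, sigma^2) in (mu, sigma) coordinates."""
    return np.diag([1.0 / sigma**2, 2.0 / sigma**2])

G_hat = fisher_monte_carlo(1.0, 2.0)  # sample-based estimate
G = fisher_exact(1.0, 2.0)            # analytic reference
```

The estimate should approach the analytic matrix as the number of samples grows.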

2. Geometry of Hypothesis Testing
1) Start from target detection
[Figure: a target detection scene and the two densities $p(x \mid \mathcal{H}_0)$ and $p(x \mid \mathcal{H}_1)$ plotted against $x$, posing detection as hypothesis testing]

2) Likelihood ratio test
Principle: partition the observation space into decision regions.
Basic method: the detector decides $\mathcal{H}_1$ if the likelihood ratio exceeds a threshold $\gamma$:
$L(x) = \frac{p(x \mid \mathcal{H}_1)}{p(x \mid \mathcal{H}_0)} \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \gamma$
Essentials of signal detection: discrimination between two distributions of the same family with different parameters.
[Figure: the densities $p(x \mid \mathcal{H}_0)$ and $p(x \mid \mathcal{H}_1)$ and the likelihood ratio $L(x)$ plotted against $x$]
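A minimal sketch of the likelihood ratio test described on this slide, assuming i.i.d. Gaussian observations with known standard deviation and two candidate means. The Gaussian model, the function names and the default threshold are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def log_likelihood_ratio(x, mu0, mu1, sigma):
    """ln L(x) = sum_i [ln p(x_i | H1) - ln p(x_i | H0)] for Gaussian data.

    The common normalizing constants cancel, so only the exponents are needed.
    """
    x = np.asarray(x, dtype=float)
    ll1 = -0.5 * ((x - mu1) / sigma) ** 2
    ll0 = -0.5 * ((x - mu0) / sigma) ** 2
    return float(np.sum(ll1 - ll0))

def lrt_decide(x, mu0, mu1, sigma, log_threshold=0.0):
    """Return 1 (target present) if ln L(x) exceeds the log-threshold, else 0."""
    return int(log_likelihood_ratio(x, mu0, mu1, sigma) > log_threshold)

rng = np.random.default_rng(0)
x_h0 = rng.normal(0.0, 1.0, size=50)  # data drawn under H0
x_h1 = rng.normal(1.0, 1.0, size=50)  # data drawn under H1
decision_h0 = lrt_decide(x_h0, 0.0, 1.0, 1.0)
decision_h1 = lrt_decide(x_h1, 0.0, 1.0, 1.0)
```

Working with the log-likelihood ratio avoids numerical underflow when many samples are multiplied together.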

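3) Geometric interpretation of hypothesis testing
Family of distributions: $p(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2} \right\}$
Statistical manifold: $H = \{ (\mu, \sigma) \in \mathbb{R}^2 : \sigma > 0 \}$
[Figure: two pairs of distributions, A, B and C, D, shown as densities and as points on the manifold, with distances $d_F(A, B)$ and $d_F(C, D)$]
Consider hypothesis testing from a geometric viewpoint.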
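To make the geometric viewpoint concrete, the sketch below evaluates the Fisher (geodesic) distance between two univariate Gaussians, using the standard closed form for the half-plane with metric $ds^2 = (d\mu^2 + 2\,d\sigma^2)/\sigma^2$. It assumes that $d_F$ on the slide denotes this Fisher distance. The specific points chosen for A, B, C, D are hypothetical.

```python
import math

def fisher_distance_gaussian(mu1, sigma1, mu2, sigma2):
    """Geodesic distance between N(mu1, sigma1^2) and N(mu2, sigma2^2)
    under the Fisher metric ds^2 = (dmu^2 + 2 dsigma^2) / sigma^2."""
    num = 0.5 * (mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2
    return math.sqrt(2.0) * math.acosh(1.0 + num / (2.0 * sigma1 * sigma2))

# Hypothetical pairs with the same mean separation but different spreads.
d_ab = fisher_distance_gaussian(0.0, 0.5, 1.0, 0.5)  # narrow densities
d_cd = fisher_distance_gaussian(0.0, 2.0, 1.0, 2.0)  # wide densities
```

Both pairs are equally far apart in the Euclidean $(\mu, \sigma)$ plane, yet the narrow pair is farther apart on the manifold, which matches the intuition that it is easier to discriminate.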

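Equivalence between LRT and Kullback-Leibler divergence
Suppose $x_1, x_2, \dots, x_N$ are i.i.d. observations from a distribution $q(x)$, and there are two models (hypotheses) for $q(x)$, denoted by $p_0(x)$ and $p_1(x)$. Then the likelihood ratio is
$L = \prod_{i=1}^{N} \frac{p_1(x_i)}{p_0(x_i)}$
KLD: $D(q \| p) = \int q(x) \ln \frac{q(x)}{p(x)}\, dx$
Replacing the sample average in $\frac{1}{N} \ln L = \frac{1}{N} \sum_{i=1}^{N} \ln \frac{p_1(x_i)}{p_0(x_i)}$ by the expectation under $q$ gives $D(q \| p_0) - D(q \| p_1)$, so the test becomes the
Minimum distance detector: $D(q \| p_0) - D(q \| p_1) \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \frac{1}{N} \ln \gamma$, where $\gamma$ is the threshold on $L$
Error exponent (Stein's lemma): $K = \lim_{N \to \infty} -\frac{1}{N} \log P_M = D(p_0 \| p_1)$, so that $P_M \approx 2^{-NK}$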
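The equivalence can be checked numerically. For a discrete alphabet, taking $q$ to be the empirical distribution of the samples makes the normalized log-likelihood ratio exactly equal to $D(q \| p_0) - D(q \| p_1)$. The sketch below assumes a three-symbol alphabet with made-up probabilities; all names and values are illustrative.

```python
import numpy as np

def kl_divergence(q, p):
    """D(q || p) = sum_x q(x) ln(q(x) / p(x)), with 0 ln 0 taken as 0."""
    q, p = np.asarray(q, dtype=float), np.asarray(p, dtype=float)
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def normalized_log_lr(samples, p0, p1):
    """(1/N) ln L = (1/N) sum_i ln(p1(x_i) / p0(x_i))."""
    samples = np.asarray(samples)
    p0, p1 = np.asarray(p0, dtype=float), np.asarray(p1, dtype=float)
    return float(np.mean(np.log(p1[samples] / p0[samples])))

p0 = np.array([0.5, 0.3, 0.2])  # hypothetical model under H0
p1 = np.array([0.2, 0.3, 0.5])  # hypothetical model under H1
rng = np.random.default_rng(0)
samples = rng.choice(3, size=1000, p=p1)
q_hat = np.bincount(samples, minlength=3) / samples.size  # empirical distribution

lhs = normalized_log_lr(samples, p0, p1)
rhs = kl_divergence(q_hat, p0) - kl_divergence(q_hat, p1)
```

Here `lhs` and `rhs` agree up to floating-point error, so thresholding the likelihood ratio is the same as comparing the two KL "distances" from the empirical distribution.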

The problem of hypothesis testing can be regarded as a discrimination problem in which the decision is made by comparing the distances, in the sense of the KL divergence, from the estimated signal distribution to the two hypotheses, i.e., by selecting the model that is closer to the estimated signal distribution.
Minimum distance detector: $D(q \| p_0) - D(q \| p_1) \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \frac{1}{N} \ln \gamma$
[Figure: an observation $x$ in the sample space $X$ is mapped to an estimated distribution $p(x \mid \theta)$ on the manifold $S$, at distances $d_0$ and $d_1$ from $p(x \mid \theta_0)$ and $p(x \mid \theta_1)$]

3. Matrix CFAR Detection
1) Constant false alarm rate detector
Classical CFAR detector: the decision is made by comparing the content of the cell under test with an adaptive threshold, given by the arithmetic mean of the reference cells, so as to achieve the desired constant probability of false alarm.
[Diagram: reference cells $x_1, \dots, x_{N/2}$ and $x_{N/2+1}, \dots, x_N$ on either side of the detection cell $x_D$; their arithmetic mean $\bar{x}$ sets the threshold for the decision. 0: target absent, 1: target present]
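The classical detector described here can be sketched as a cell-averaging CFAR over a one-dimensional array of power samples. The guard cells, the multiplicative scale factor and all parameter names below are assumptions added for illustration; the slide only specifies that the threshold is driven by the arithmetic mean of the reference cells.

```python
import numpy as np

def ca_cfar(power, n_ref=16, n_guard=2, scale=5.0):
    """Cell-averaging CFAR.

    Returns a boolean array that is True where the cell under test exceeds
    scale * (arithmetic mean of the surrounding reference cells).
    Cells too close to the edges are left as False.
    """
    power = np.asarray(power, dtype=float)
    half = n_ref // 2
    detections = np.zeros(power.size, dtype=bool)
    for d in range(half + n_guard, power.size - half - n_guard):
        lead = power[d - n_guard - half : d - n_guard]          # cells before
        lag = power[d + n_guard + 1 : d + n_guard + 1 + half]   # cells after
        noise_level = np.mean(np.concatenate([lead, lag]))
        detections[d] = power[d] > scale * noise_level
    return detections

# Flat clutter with one strong return in cell 32.
power = np.ones(64)
power[32] = 100.0
hits = ca_cfar(power)
```

Setting `n_guard=0` gives the plain layout on the slide, where the reference cells are directly adjacent to the detection cell.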

2) Matrix CFAR detector
In 2008, F. Barbaresco proposed a generalized CFAR technique based on the manifold of symmetric positive definite (SPD) matrices. It has been shown that the Riemannian distance-based detector has better detection performance than the classical CFAR detector.
[Diagram: reference covariance matrices $R_1, \dots, R_i, R_{i+1}, \dots, R_N$ on either side of the detection cell $R_D$, together with their center $\bar{R}$, shown both as cells and as points on the manifold]

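Riemannian distance between two SPD matrices:
$d^2(R_1, R_2) = \left\| \ln\left( R_1^{-1/2} R_2 R_1^{-1/2} \right) \right\|^2 = \sum_{k=1}^{n} \ln^2 \lambda_k$, where $\lambda_k$ are the eigenvalues of $R_1^{-1/2} R_2 R_1^{-1/2}$
Riemannian center of N SPD matrices:
$\bar{R} = \arg\min_{R} \sum_{i=1}^{N} w_i\, d^p(R, R_i)$, where for $p = 1$, $\bar{R}$ denotes the median; for $p = 2$, $\bar{R}$ is the mean.
The matrix CFAR detector:
$d(R_D, \bar{R}) \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \gamma$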
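The distance, the $p = 2$ center and the resulting decision rule can be sketched as follows. The fixed-point (Karcher) iteration with equal weights, the stopping rule and the function names are assumptions chosen for illustration; the slides do not say how the center is computed.

```python
import numpy as np

def _herm_fun(S, fun):
    """Apply a scalar function to a symmetric/Hermitian matrix via its eigenvalues."""
    w, V = np.linalg.eigh(S)
    return (V * fun(w)) @ V.conj().T

def riemannian_distance(R1, R2):
    """d(R1, R2) = sqrt(sum_k ln^2 lambda_k), lambda_k = eig(R1^{-1/2} R2 R1^{-1/2})."""
    R1_isqrt = _herm_fun(R1, lambda w: 1.0 / np.sqrt(w))
    lam = np.linalg.eigvalsh(R1_isqrt @ R2 @ R1_isqrt)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

def riemannian_mean(mats, n_iter=50, tol=1e-10):
    """Fixed-point iteration for the p = 2 center with equal weights."""
    R = np.mean(mats, axis=0)  # start from the arithmetic mean
    for _ in range(n_iter):
        R_sqrt = _herm_fun(R, np.sqrt)
        R_isqrt = _herm_fun(R, lambda w: 1.0 / np.sqrt(w))
        # Average of the log-maps of all matrices at the current estimate.
        T = np.mean([_herm_fun(R_isqrt @ M @ R_isqrt, np.log) for M in mats], axis=0)
        R = R_sqrt @ _herm_fun(T, np.exp) @ R_sqrt
        if np.linalg.norm(T) < tol:
            break
    return R

def matrix_cfar_decide(R_D, reference_mats, threshold):
    """True (target present) if d(R_D, mean of reference matrices) > threshold."""
    return riemannian_distance(R_D, riemannian_mean(reference_mats)) > threshold
```

Each iteration needs several eigendecompositions per reference matrix, which gives a feel for why this center is costly to compute.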

[Figure: initial spectra of measurements and mean spectra of measurements; intensity maps comparing the classical detector with the geometric detector]

3) Robust matrix CFAR detector
Two shortcomings of the Riemannian distance based matrix CFAR detector:
a) Computing the Riemannian distance and its average is expensive because of the exponential operations involved;
b) The Riemannian mean and median are not robust to outliers.

Symmetrized Kullback-Leibler (skl) divergence based matrix CFAR detector
Total Kullback-Leibler (tkl) divergence based matrix CFAR detector
[Diagram: sample data -> covariance matrices $R_1, \dots, R_i, R_{CUT}, R_{i+1}, \dots, R_N$ -> center of the reference matrices (skl mean, skl median, or tkl t center) -> divergence computation -> comparison with threshold]

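skl divergence between two SPD matrices:
$\mathrm{skl}(R_1, R_2) = \mathrm{tr}\left( R_1^{-1} R_2 + R_2^{-1} R_1 - 2I \right)$
skl mean of N SPD matrices: $\bar{R} = \arg\min_{R} \sum_{i=1}^{N} \mathrm{skl}(R, R_i)$, which has a closed form in terms of $\frac{1}{N} \sum_{i=1}^{N} R_i$ and $\frac{1}{N} \sum_{k=1}^{N} R_k^{-1}$
skl median of N SPD matrices: computed by a fixed-point iteration $R_k \to R_{k+1}$ in which the terms $R_i$ and $R_j^{-1}$ are weighted through $\mathrm{skl}(R_k, R_i)$ and $\mathrm{skl}(R_k, R_j)$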
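A sketch of the skl divergence and of one way to obtain the skl mean in closed form. Setting the gradient of $\sum_i \mathrm{skl}(R, R_i)$ to zero gives $R B R = A$, with $A$ the arithmetic mean of the $R_i$ and $B$ the arithmetic mean of the $R_i^{-1}$, whose SPD solution is $B^{-1/2} (B^{1/2} A B^{1/2})^{1/2} B^{-1/2}$. That this matches the exact expression on the slide is an assumption, as are the function names and the restriction to real symmetric matrices.

```python
import numpy as np

def skl_divergence(R1, R2):
    """skl(R1, R2) = tr(R1^{-1} R2 + R2^{-1} R1 - 2I); symmetric in its arguments."""
    n = R1.shape[0]
    return float(np.trace(np.linalg.solve(R1, R2) + np.linalg.solve(R2, R1)) - 2 * n)

def _spd_power(S, p):
    """S^p for a real symmetric positive-definite matrix, via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w**p) @ V.T

def skl_mean(mats):
    """Minimizer of sum_i skl(R, R_i): the SPD solution of R B R = A (assumed form)."""
    A = np.mean(mats, axis=0)                              # mean of R_i
    B = np.mean([np.linalg.inv(M) for M in mats], axis=0)  # mean of R_i^{-1}
    B_sqrt, B_isqrt = _spd_power(B, 0.5), _spd_power(B, -0.5)
    return B_isqrt @ _spd_power(B_sqrt @ A @ B_sqrt, 0.5) @ B_isqrt

R1 = np.array([[2.0, 0.5], [0.5, 1.0]])
R2 = np.array([[1.0, -0.2], [-0.2, 3.0]])
R_bar = skl_mean([R1, R2])
```

No iteration is required, which is consistent with the much lower running time reported for the skl mean detector.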

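The tkl divergence is a special case of the total Bregman divergence (tbd), which is invariant to linear transformation.
Bregman divergence: $\mathrm{BD}(x, y) = f(x) - f(y) - \langle x - y, \nabla f(y) \rangle$
Total Bregman divergence: $\mathrm{tbd}(x, y) = \frac{f(x) - f(y) - \langle x - y, \nabla f(y) \rangle}{\sqrt{1 + \| \nabla f(y) \|^2}}$
[Figure: $\mathrm{BD}(x, y)$ and $\mathrm{tbd}(x, y)$ drawn on the graph of $f$; tbd is more robust]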
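The relation between the two divergences can be illustrated with a small sketch for vectors. The generator $f(x) = \|x\|^2$ used in the example and the function names are assumptions chosen for simplicity; the tkl divergence corresponds to a different generator.

```python
import numpy as np

def bregman_divergence(f, grad_f, x, y):
    """BD(x, y) = f(x) - f(y) - <x - y, grad f(y)>."""
    return float(f(x) - f(y) - np.dot(x - y, grad_f(y)))

def total_bregman_divergence(f, grad_f, x, y):
    """tbd(x, y) = BD(x, y) / sqrt(1 + ||grad f(y)||^2)."""
    g = grad_f(y)
    return bregman_divergence(f, grad_f, x, y) / float(np.sqrt(1.0 + np.dot(g, g)))

# Illustrative generator: squared Euclidean norm, for which BD(x, y) = ||x - y||^2.
f = lambda v: float(np.dot(v, v))
grad_f = lambda v: 2.0 * v

x = np.array([1.0, 1.0])
y = np.array([0.0, 1.0])
bd = bregman_divergence(f, grad_f, x, y)
tbd = total_bregman_divergence(f, grad_f, x, y)
```

The normalization by $\sqrt{1 + \|\nabla f(y)\|^2}$ shrinks the divergence where the generator is steep.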

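tkl divergence between two SPD matrices:
$\mathrm{tkl}(R_1, R_2) = \frac{\log\det\left( R_1^{-1} R_2 \right) + \mathrm{tr}\left( R_2^{-1} R_1 \right) - n}{2 \sqrt{c + \frac{(\log\det R_2)^2}{4} - \frac{n (1 + \log 2\pi)}{2} \log\det R_2}}$
tkl t center of N SPD matrices:
$\bar{R} = \left( \sum_{i} w_i R_i^{-1} \right)^{-1}$, $w_i = \frac{\mu_i}{\sum_{j} \mu_j}$, where $\mu_i = \left( 2 \sqrt{c + \frac{(\log\det R_i)^2}{4} - \frac{n (1 + \log 2\pi)}{2} \log\det R_i} \right)^{-1}$
The weight $w_i$ is inversely proportional to the value of the divergence gradient, which makes the t center robust to outliers.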
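A sketch of the tkl divergence and the t center as read from this slide. Two things are assumptions: the term extracted as "log2" is taken to be $\log 2\pi$, and the constant $c$, whose value the slide does not give, is passed in by the caller and must be large enough to keep the quantity under the square root positive. Function names are illustrative and matrices are assumed real symmetric.

```python
import numpy as np

def _radicand(R, c):
    """c + (log det R)^2 / 4 - n (1 + log 2 pi) / 2 * log det R (assumed form)."""
    n = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    return c + logdet**2 / 4.0 - n * (1.0 + np.log(2.0 * np.pi)) / 2.0 * logdet

def tkl_divergence(R1, R2, c):
    """tkl(R1, R2): KL-type numerator normalized by a term that depends on R2."""
    n = R1.shape[0]
    _, logdet1 = np.linalg.slogdet(R1)
    _, logdet2 = np.linalg.slogdet(R2)
    num = (logdet2 - logdet1) + np.trace(np.linalg.solve(R2, R1)) - n
    return float(num / (2.0 * np.sqrt(_radicand(R2, c))))

def tkl_t_center(mats, c):
    """Weighted harmonic mean (sum_i w_i R_i^{-1})^{-1}; returns (center, weights)."""
    mu = np.array([1.0 / (2.0 * np.sqrt(_radicand(M, c))) for M in mats])
    w = mu / mu.sum()
    inv_sum = sum(wi * np.linalg.inv(M) for wi, M in zip(w, mats))
    return np.linalg.inv(inv_sum), w

c = 100.0  # hypothetical value, chosen only so that the radicand stays positive
R = np.array([[2.0, 0.5], [0.5, 1.0]])
center, weights = tkl_t_center([R, np.eye(2)], c)
```

The center is available in closed form, so no iteration over the manifold is needed.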

Comparison of the dissimilarity measures: Riemannian distance, skl divergence and tkl divergence
The signal-to-clutter ratio (SCR) is significantly improved by the mapping of the tkl divergence.

Comparison of detection performance: Riemannian distance, skl divergence and tkl divergence
Of the three, the tkl divergence based matrix CFAR detector has the best detection performance.

Table I. Time taken by the different algorithms

Algorithm                      Time (s)
Riemannian mean detector       29.74
Riemannian median detector     41.66
skl mean detector              0.09
skl median detector            2.81
tkl t center detector          0.15

4. Geometry of Matrix CFAR Detection
Classical CFAR detector: Euclidean space, Euclidean distance measure.
Matrix CFAR detector: matrix manifold; Riemannian distance measure, KL divergence, etc.
A good detector should:
Properly characterize the intrinsic structure of the measurement space
Maximize the divergence between two hypotheses (clusters)
[Diagram: the reference matrices $R_1, R_2, \dots, R_i, R_{i+1}, \dots, R_N$, the detection cell $R_D$ and the center $\bar{R}$, drawn once in Euclidean space and once on the matrix manifold]

Future work
Other divergences that perform better at measuring the dissimilarity between distributions
Better approaches for clustering the distributions
Detectors for heavy clutter
Detectors for nonstationary clutter
Detectors for few samples

Thank you for your attention!