Sparsity-based Cholesky Factorization and Its Application to Hyperspectral Anomaly Detection


Sparsity-based Cholesky Factorization and Its Application to Hyperspectral Anomaly Detection. Ahmad W. Bitar, Jean-Philippe Ovarlez, Loong-Fah Cheong. IEEE CAMSAP 2017, Dec 2017, Curaçao, Dutch Antilles, Netherlands. <hal-656954>

HAL Id: hal-656954, https://hal-mines-paristech.archives-ouvertes.fr/hal-656954. Submitted on 6 Dec 2017. HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

Sparsity-based Cholesky Factorization and Its Application to Hyperspectral Anomaly Detection

Ahmad W. Bitar, Jean-Philippe Ovarlez and Loong-Fah Cheong
SONDRA/CentraleSupélec, Plateau du Moulon, 3 rue Joliot-Curie, F-99 Gif-sur-Yvette, France
ONERA, DEMR/TSI, Chemin de la Hunière, 9 Palaiseau, France
National University of Singapore (NUS), Singapore, Singapore

Abstract: Estimating large covariance matrices has been a longstanding important problem in many applications and has attracted increased attention over several decades. This paper deals with two methods based on pre-existing works to impose sparsity on the covariance matrix via its unit lower triangular matrix (aka Cholesky factor) T. The first method serves to estimate the entries of T using the Ordinary Least Squares (OLS), then imposes sparsity by exploiting some generalized thresholding techniques such as Soft and Smoothly Clipped Absolute Deviation (SCAD). The second method directly estimates a sparse version of T by penalizing the negative normal log-likelihood with L1 and SCAD penalty functions. The resulting covariance estimators are always guaranteed to be positive definite. Some Monte-Carlo simulations as well as experimental data demonstrate the effectiveness of our estimators for hyperspectral anomaly detection using the Kelly anomaly detector.

Keywords: Hyperspectral anomaly detection, covariance matrix, sparsity, Cholesky factor.

I. INTRODUCTION

An airborne hyperspectral imaging sensor is capable of simultaneously acquiring the same spatial scene in multiple contiguous and narrow spectral wavelength (color) bands [1], [2], [3]. When all the spectral bands are stacked together, the resulting hyperspectral image (HSI) is a three-dimensional data cube; each pixel in the HSI is a p-dimensional vector, x = [x_1, ..., x_p]^T ∈ R^p, where p designates the total number of spectral bands.
With the rich information afforded by the high spectral dimensionality, hyperspectral imagery has found many applications in various fields such as agriculture [4], [5], mineralogy [6], military [7], [8], [9], and in particular, target detection [10], [11], [12], [13], [7]. In many situations of practical interest, we do not have sufficient a priori information to specify the statistics of the target class. More precisely, the target's spectrum is not provided to the user. This unknown target is referred to as an "anomaly" [14], having a spectrum very different from the background (e.g., a ship at sea). Different Gaussian-based anomaly detectors have been proposed in the literature [15], [16], [17], [18], [19]. The detection performance of these detectors mainly depends on the true unknown covariance matrix (of the background surrounding the test pixel), whose entries have to be carefully estimated, especially in large dimensions. Because in hyperspectral imagery the number of covariance matrix parameters to estimate grows with the square of the spectral dimension, it becomes impractical to use traditional covariance estimators, and the target detection performance can deteriorate significantly. Researchers often assume that this large-dimensionality problem can be alleviated by leveraging the assumption that the true unknown covariance matrix is sparse, namely, that many of its entries are zero. This paper outlines two simple methods based on pre-existing works in order to impose sparsity on the covariance matrix via its Cholesky factor T. The first method imposes sparsity by exploiting thresholding operators such as Soft and SCAD on the OLS estimate of T. The second method directly estimates a sparse version of T by penalizing the negative normal log-likelihood with L1 and SCAD penalty functions. Summary of Main Notations: Throughout this paper, we depict vectors in lowercase boldface letters and matrices in uppercase boldface letters. The notation (·)
^T stands for the transpose, while |·|, (·)^{-1}, (·)', det(·), and 1(·) are the absolute value, the inverse, the derivative, the determinant, and the indicator function, respectively. For any z ∈ R, we define sign(z) = 1 if z > 0, sign(z) = 0 if z = 0, and sign(z) = −1 if z < 0.

II. BACKGROUND AND SYSTEM OVERVIEW

Suppose that we observe a sample of n independent and identically distributed p-random vectors, {x_i}_{i ∈ [1, n]}, each following a multivariate Gaussian distribution with zero mean and unknown covariance matrix Σ = [σ_{g,l}]_{p×p}. The first traditional estimator we consider in this paper is the Sample Covariance Matrix (SCM), defined as Σ̂_SCM = [σ̂_{g,l}]_{p×p} = (1/n) Σ_{i=1}^n x_i x_i^T. In order to address the positive definiteness constraint problem of Σ̂_SCM, Pourahmadi [20] has modeled the covariance matrices via linear regressions. This is done by letting x̂ = [x̂_1, ..., x̂_p]^T ∈ R^p, and considering each element x̂_t, t ∈ [1, p], as the linear least squares predictor of x_t based on its t − 1 predecessors {x_j}_{j ∈ [1, t−1]}. In particular, for t ∈ [2, p], let

x̂_t = Σ_{j=1}^{t−1} C_{t,j} x_j,   T Σ T^T = D,   (1)

where T is a unit lower triangular matrix with −C_{t,j} in the (t, j)th position for t ∈ [2, p] and j ∈ [1, t−1], and D is a diagonal matrix with θ_t² = var(ε_t) as its diagonal entries, where ε_t = x_t − x̂_t is the prediction error for t ∈ [1, p]. Note that for t = 1, we let x̂_1 = E(x_1) = 0, and hence var(ε_1) = θ_1² = E[x_1²]. Given a sample {x_i}_{i ∈ [1, n]}, with n > p, a natural estimate of T and D, denoted as T̂_OLS and D̂_OLS in this paper, is obtained simply by plugging in the OLS estimates of the regression coefficients and residual variances

in (1), respectively. In this paper, we shall denote the second traditional estimator by Σ̂_OLS = T̂_OLS^{-1} D̂_OLS T̂_OLS^{-T}. Obviously, when the spectral dimension p is large compared to the number of observed data n, both Σ̂_SCM and Σ̂_OLS face difficulties in estimating Σ without an extreme amount of errors. Realizing the challenges brought by the high dimensionality, researchers have circumvented them by proposing various regularization techniques to consistently estimate Σ based on the assumption that the covariance matrix is sparse. Recently, Bickel et al. [21] proposed a banded version of Σ̂_SCM, denoted as B_m(Σ̂_SCM) in this paper, with B_m(Σ̂_SCM) = [σ̂_{g,l} 1(|g − l| ≤ m)], where m < p is the banding parameter. However, this kind of regularization does not always guarantee positive definiteness. In [22], a class of generalized thresholding operators applied to the off-diagonal entries of Σ̂_SCM has been discussed. These operators combine shrinkage with thresholding and have the advantage of estimating the true zeros as zeros with high probability. These operators (e.g., Soft and SCAD), though simple, do not always guarantee positive definiteness of the thresholded version of Σ̂_SCM. In [23], the covariance matrix is constrained to have an eigendecomposition that can be represented as a sparse matrix transform (SMT), which decomposes the eigendecomposition into a product of very sparse transformations. The resulting estimator, denoted as Σ̂_SMT in this paper, is always guaranteed to be positive definite. In addition to the above review, some other works have attempted to enforce sparsity on the covariance matrix via its Cholesky factor T, hence yielding sparse covariance estimators that are always guaranteed to be positive definite. For example, in [24], Wu and Pourahmadi proposed to smooth the first few subdiagonals of T̂_OLS and set the remaining subdiagonals to zero. In [25], Huang et al.
proposed to directly estimate a sparse version of T by penalizing the negative normal log-likelihood with an L1-norm penalty function, hence allowing the zeros to be irregularly placed in the Cholesky factor. This seems to be an advantage over the work in [24]. We put forth two simple methods for imposing sparsity on the covariance matrix via its Cholesky factor T. The first method is based on the work in [22], but attempts to render Σ̂_OLS sparse by thresholding its Cholesky factor T̂_OLS using operators such as Soft and SCAD. The second method aims to generalize the work in [25] so that it can be used with various penalty functions. Both methods allow the zeros to be irregularly placed in the Cholesky factor. Clearly, in real-world hyperspectral imagery, the true covariance model is not known, and hence there is no prior information on its degree of sparsity. Enforcing sparsity on the covariance matrix may seem a strong assumption, but it can be critically important if the true covariance model for a given HSI is indeed sparse. That is, taking advantage of the possible sparsity in the estimation can potentially improve the target detection performance, as can be seen from the experimental results later. On the other hand, when the true covariance model is not sparse (or not highly sparse), there should be no worse detection results than those of the traditional estimators Σ̂_SCM and Σ̂_OLS. We evaluate our estimators for hyperspectral anomaly detection using the Kelly anomaly detector [26]. More precisely, we first perform a thorough evaluation of our estimators on some Monte-Carlo simulations for three true covariance models of different sparsity levels. From our experiments in Subsection IV-A, the detection results show that for truly sparse models, our estimators improve the detection significantly with respect to the traditional ones, and have competitive results with state-of-the-art estimators [21], [22], [23].
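To make the baseline concrete, the OLS plug-in construction of Section II can be sketched in a few lines of NumPy. This is our own illustration, not the authors' code (the function name is ours): it runs the successive regressions of each component on its predecessors and rebuilds a covariance estimate that is positive definite by construction.

```python
import numpy as np

def modified_cholesky_ols(X):
    """OLS plug-in estimate of the modified Cholesky factorization
    T Sigma T^T = D for zero-mean data X of shape (n, p).
    Returns T (unit lower triangular, holding -C_hat[t, j] below the
    diagonal) and D (diagonal matrix of residual variances theta_t^2)."""
    n, p = X.shape
    T = np.eye(p)
    d = np.empty(p)
    d[0] = np.mean(X[:, 0] ** 2)                      # theta_1^2 = E[x_1^2]
    for t in range(1, p):
        A = X[:, :t]                                  # predecessors x_1, ..., x_{t-1}
        y = X[:, t]
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # OLS regression coefficients
        T[t, :t] = -coef                              # T carries the negated coefficients
        d[t] = np.mean((y - A @ coef) ** 2)           # residual variance
    return T, np.diag(d)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
T, D = modified_cholesky_ols(X)
Ti = np.linalg.inv(T)
Sigma_ols = Ti @ D @ Ti.T   # positive definite by construction
```

Since T is unit lower triangular and D has strictly positive diagonal entries, the product above is positive definite whatever the data, which is the key advantage of this parameterization over directly thresholding a covariance estimate.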
When the true model is not sparse, we find empirically that our estimators still have no worse detection results than those of Σ̂_SCM and Σ̂_OLS. Next, in Subsection IV-B, our estimators are evaluated on experimental data, where good target detection performances are obtained compared to the traditional estimators and state-of-the-art. In all the experiments later, we observe that Σ̂_OLS achieves higher target detection results than Σ̂_SCM.

III. MAIN CONTRIBUTIONS

Before describing the two methods, we recall the definition of Σ̂_OLS. Given a sample {x_i}_{i ∈ [1, n]}, we have:

x_{i,t} = Σ_{j=1}^{t−1} C_{t,j} x_{i,j} + ε_{i,t},   t ∈ [2, p], i ∈ [1, n].   (2)

By writing (2) in vector-matrix form for any t ∈ [2, p], one obtains the simple linear regression model:

y_t = A_{n,t} β_t + e_t,   (3)

where y_t = [x_{1,t}, ..., x_{n,t}]^T ∈ R^n, A_{n,t} = [x_{i,j}] ∈ R^{n×(t−1)}, β_t = [C_{t,1}, ..., C_{t,t−1}]^T ∈ R^{t−1}, and e_t = [ε_{1,t}, ..., ε_{n,t}]^T ∈ R^n. When n > p, the OLS estimate of β_t and the corresponding residual variance are plugged into T and D for each t ∈ [2, p], respectively. At the end, one obtains the estimator Σ̂_OLS = T̂_OLS^{-1} D̂_OLS T̂_OLS^{-T}. Note that T̂_OLS has −Ĉ_{t,j}^{OLS} in the (t, j)th position for t ∈ [2, p] and j ∈ [1, t−1].

A. Generalized thresholding based Cholesky Factor

For any λ ≥ 0, we define a matrix thresholding operator Th(·) and denote by Th(T̂_OLS) = [Th(Ĉ_{t,j}^{OLS})]_{p×p} the matrix resulting from applying a specific thresholding operator Th(·) ∈ {Soft, SCAD} to each element of the matrix T̂_OLS for t ∈ [2, p] and j ∈ [1, t−1]. We consider the following minimization problem:

Th(T̂_OLS) = argmin_T  Σ_{t=2}^p Σ_{j=1}^{t−1} { (1/2)(Ĉ_{t,j}^{OLS} − C_{t,j})² + p_λ(|C_{t,j}|) },   (4)

where p_λ ∈ {p_λ^{L1}, p_λ^{SCAD,a>2}}. We have p_λ^{L1}(|C_{t,j}|) = λ|C_{t,j}|, and

p_λ^{SCAD,a>2}(|C_{t,j}|) =
  λ|C_{t,j}|                                        if |C_{t,j}| ≤ λ,
  −(C_{t,j}² − 2aλ|C_{t,j}| + λ²) / (2(a − 1))      if λ < |C_{t,j}| ≤ aλ,
  (a + 1)λ²/2                                       if |C_{t,j}| > aλ.

Solving (4) with p_λ^{L1} and p_λ^{SCAD,a>2} yields closed-form Soft and SCAD thresholding rules, respectively [22], [27]. The value a = 3.7 was recommended by Fan and Li [27].
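For illustration, the two thresholding rules obtained by solving (4) can be sketched in a few lines of NumPy. This is our own hedged sketch (function names are ours), using a = 3.7 as above; the SCAD rule below is the standard Fan-Li closed form.

```python
import numpy as np

def soft(z, lam):
    """Soft thresholding: closed-form minimizer of (4) with the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def scad(z, lam, a=3.7):
    """SCAD thresholding rule (Fan and Li): soft-thresholds small entries,
    shrinks moderate entries less, and leaves entries beyond a*lam untouched."""
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) <= 2 * lam,
                    soft(z, lam),
                    np.where(np.abs(z) <= a * lam,
                             ((a - 1) * z - np.sign(z) * a * lam) / (a - 2),
                             z))
```

Applying either rule elementwise to the strictly lower triangular part of T̂_OLS, while keeping the unit diagonal, yields a thresholded Cholesky factor.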
Although the application here is different from that in [27], for simplicity we use the same value throughout the paper. We shall designate by T̂_Soft and T̂_SCAD the two thresholded matrices that result from applying Soft and SCAD to T̂_OLS, respectively. We denote our first two estimators as:

Σ̂_OLS^Soft = T̂_Soft^{-1} D̂_OLS T̂_Soft^{-T},   Σ̂_OLS^SCAD = T̂_SCAD^{-1} D̂_OLS T̂_SCAD^{-T}.

Note that in [22], the authors have demonstrated that for a non-sparse true covariance model, the covariance matrix does not suffer any degradation when thresholding is applied to the off-diagonal entries of Σ̂_SCM. However, this is not the case for the target detection problem, where the inverse covariance is used; we found that, in contrast to our estimators, the scheme in [22] has a deleterious effect on the detection performance when compared to those of Σ̂_SCM and Σ̂_OLS.

B. A generalization of the estimator in [25]

We present the same concept as in [25], but modify the procedure by which the entries of T are estimated. Note that det(T) = 1 and Σ^{-1} = T^T D^{-1} T. It follows that det(Σ) = det(D) = Π_{t=1}^p θ_t². Hence, the negative normal log-likelihood of X = [x_1, ..., x_n] ∈ R^{p×n}, ignoring an irrelevant constant, satisfies:

Λ = −2 log(L(Σ; x_1, ..., x_n)) = n log(det(D)) + tr(X^T (T^T D^{-1} T) X)
  = n log(det(D)) + tr((T X)^T D^{-1} (T X))
  = n Σ_{t=1}^p log θ_t² + Σ_{t=1}^p Σ_{i=1}^n ε_{i,t}²/θ_t².

By adding a penalty function Σ_{t=2}^p Σ_{j=1}^{t−1} p_α(|C_{t,j}|) to Λ, where p_α ∈ {p_α^{L1}, p_α^{SCAD,a>2}} (see Subsection III-A) with α ∈ [0, ∞), we have:

( n log θ_1² + Σ_{i=1}^n ε_{i,1}²/θ_1² ) + Σ_{t=2}^p ( n log θ_t² + Σ_{i=1}^n ε_{i,t}²/θ_t² + Σ_{j=1}^{t−1} p_α(|C_{t,j}|) ).   (5)

Obviously, minimizing (5) with respect to θ_1² and θ_t² gives the solutions θ̂_1² = (1/n) Σ_{i=1}^n ε_{i,1}² = (1/n) Σ_{i=1}^n x_{i,1}² and θ̂_t² = (1/n) Σ_{i=1}^n ε_{i,t}² = (1/n) Σ_{i=1}^n (x_{i,t} − Σ_{j=1}^{t−1} C_{t,j} x_{i,j})², respectively. It remains to estimate the entries of T by minimizing (5) with respect to β_t.
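The two identities used in this derivation, det(Σ) = det(D) and x^T Σ^{-1} x = Σ_t ε_t²/θ_t², can be checked numerically. The sketch below is our own illustration (not from the paper): it builds a random unit lower triangular T and positive diagonal D and compares both sides of each identity.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4
# A unit lower triangular T and positive diagonal D define Sigma = T^{-1} D T^{-T}.
T = np.eye(p) + np.tril(rng.standard_normal((p, p)), -1)
D = np.diag(rng.uniform(0.5, 2.0, size=p))
Sigma = np.linalg.inv(T) @ D @ np.linalg.inv(T).T

x = rng.standard_normal(p)
quad_direct = x @ np.linalg.solve(Sigma, x)   # x^T Sigma^{-1} x
eps = T @ x                                   # the prediction errors eps_t
quad_chol = np.sum(eps ** 2 / np.diag(D))     # sum_t eps_t^2 / theta_t^2

logdet_direct = np.linalg.slogdet(Sigma)[1]   # log det(Sigma)
logdet_chol = np.sum(np.log(np.diag(D)))      # log det(D), valid since det(T) = 1
```

Both pairs agree to machine precision, which is why the likelihood separates into the p independent terms exploited in (5).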
From equations (2) and (3), the minimization problem to solve for each t ∈ [2, p] is:

β̂_t = argmin_{β_t} { (1/θ_t²) Σ_{i=1}^n ε_{i,t}² + Σ_{j=1}^{t−1} p_α(|C_{t,j}|) }
    = argmin_{β_t} { (1/θ_t²) Σ_{i=1}^n (x_{i,t} − Σ_{j=1}^{t−1} C_{t,j} x_{i,j})² + Σ_{j=1}^{t−1} p_α(|C_{t,j}|) }
    = argmin_{β_t} { (1/θ_t²) ||y_t − A_{n,t} β_t||_F² + Σ_{j=1}^{t−1} p_α(|C_{t,j}|) }.   (6)

By denoting l(β_t) = (1/θ_t²) ||y_t − A_{n,t} β_t||_F² and r(β_t) = Σ_{j=1}^{t−1} p_α(|C_{t,j}|) = Σ_{j=1}^{t−1} r_j(C_{t,j}), we solve (6) iteratively using the General Iterative Shrinkage and Thresholding (GIST) algorithm [28]:

β_t^{(k+1)} = argmin_{β_t} l(β_t^{(k)}) + (∇l(β_t^{(k)}))^T (β_t − β_t^{(k)}) + (w^{(k)}/2) ||β_t − β_t^{(k)}||² + r(β_t)
            = argmin_{β_t} (1/2) ||β_t − u_t^{(k)}||² + (1/w^{(k)}) r(β_t),   (7)

where u_t^{(k)} = β_t^{(k)} − ∇l(β_t^{(k)})/w^{(k)}, and w^{(k)} is the step size initialized using the Barzilai-Borwein rule [29]. By decomposing (7) into (t − 1) independent univariate optimization problems, we have for j = 1, ..., t − 1:

C_{t,j}^{(k+1)} = argmin_{C_{t,j}} (1/2)(C_{t,j} − u_{t,j}^{(k)})² + (1/w^{(k)}) r_j(C_{t,j}),   (8)

where u_t^{(k)} = [u_{t,1}^{(k)}, ..., u_{t,t−1}^{(k)}]^T ∈ R^{t−1}. By solving (8) with the L1-norm penalty p_α^{L1}, we have the following closed-form solution:

C_{t,j,(L1)}^{(k+1)} = sign(u_{t,j}^{(k)}) max(0, |u_{t,j}^{(k)}| − α/w^{(k)}).   (9)

For the SCAD penalty function p_α^{SCAD,a>2}, we can observe that it consists of three parts for three different conditions (see Subsection III-A). In this case, by recasting problem (8) into three minimization sub-problems, one for each condition, and solving them, one obtains the following three sub-solutions h_{t,j}^1, h_{t,j}^2, and h_{t,j}^3, where:

h_{t,j}^1 = sign(u_{t,j}^{(k)}) min(α, max(0, |u_{t,j}^{(k)}| − α/w^{(k)})),
h_{t,j}^2 = sign(u_{t,j}^{(k)}) min(aα, max(α, (w^{(k)}(a − 1)|u_{t,j}^{(k)}| − aα) / (w^{(k)}(a − 1) − 1))),
h_{t,j}^3 = sign(u_{t,j}^{(k)}) max(aα, |u_{t,j}^{(k)}|).

Hence, we have the following closed-form solution:

C_{t,j,(SCAD)}^{(k+1)} = argmin_{q_{t,j}} (1/2)(q_{t,j} − u_{t,j}^{(k)})² + (1/w^{(k)}) r_j(q_{t,j})   s.t.   q_{t,j} ∈ {h_{t,j}^1, h_{t,j}^2, h_{t,j}^3}.   (10)

At the end, we denote our last two estimators as:

Σ̂_L1 = T̂_L1^{-1} D̂ T̂_L1^{-T},   Σ̂_SCAD = T̂_SCAD^{-1} D̂ T̂_SCAD^{-T},   (11)

where T̂_L1 and T̂_SCAD have −Ĉ_{t,j,(L1)} and −Ĉ_{t,j,(SCAD)} in the (t, j)th position, respectively, for t ∈ [2, p] and j ∈ [1, t−1], whereas D̂ has the entries (θ̂_1², θ̂_t²) on its diagonal.
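As an illustrative sketch of this row-wise procedure (ours, not the authors' implementation), the L1 case of the GIST iteration reduces to a proximal-gradient loop with a soft-threshold update. For simplicity, the sketch uses a fixed step size w equal to the gradient's Lipschitz constant instead of the Barzilai-Borwein initialization, and the SCAD case would only change the elementwise proximal step.

```python
import numpy as np

def soft(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def penalized_row(y, A, alpha, theta2=1.0, iters=500):
    """Solve min_beta (1/(2*theta2)) ||y - A beta||^2 + alpha * ||beta||_1
    for one row of the Cholesky factor, by proximal gradient descent."""
    beta = np.zeros(A.shape[1])
    w = np.linalg.norm(A, 2) ** 2 / theta2   # Lipschitz constant of the gradient
    for _ in range(iters):
        grad = A.T @ (A @ beta - y) / theta2  # gradient of the smooth part
        u = beta - grad / w                   # the intermediate point u
        beta = soft(u, alpha / w)             # elementwise soft-threshold prox
    return beta

# Toy usage: recover a sparse coefficient row from noiseless data.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 5))
y = A @ np.array([1.0, 0.0, 0.0, 0.0, 2.0])
beta_hat = penalized_row(y, A, alpha=0.1)
```

With a small penalty the loop converges to (nearly) the OLS solution, while a large penalty drives the whole row to zero, which is exactly the sparsity mechanism exploited in the estimators above.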
Note that in [25], the authors used the local quadratic approximation (LQA) of the L1-norm in order to get a closed-form solution of (6). Our algorithm is more general since, after exploiting the GIST algorithm to solve (6), it can be easily extended to some other penalties such as SCAD [27], the Capped-L1 penalty [30], [31], [32], the Log-Sum Penalty [33], the Minimax Concave Penalty [34], etc., which all have closed-form solutions [28]. In this paper, we are only interested in the L1 and SCAD penalty functions.

IV. HYPERSPECTRAL ANOMALY DETECTION

We first describe the Kelly anomaly detector [26] used for the detection evaluation. Next, we present two subsections that gauge the detection performances of our estimators {Σ̂_OLS^Soft, Σ̂_OLS^SCAD, Σ̂_L1, Σ̂_SCAD} when compared to the traditional ones {Σ̂_SCM, Σ̂_OLS} and state-of-the-art: Σ̂_SMT [23], B_m(Σ̂_SCM) [21], and the two estimators that apply Soft and SCAD thresholding on the off-diagonal entries of Σ̂_SCM in [22], which will be denoted in the following experiments as Σ̂_SCM^Soft and Σ̂_SCM^SCAD, respectively. Note that the tuning

parameters λ (in Subsection III-A) and α (in Subsection III-B) are chosen automatically using a 5-fold cross-validated log-likelihood procedure (see [25] for details).

Table 1. A list of Area Under Curve (AUC) values of our estimators when compared to some others, for the three simulated models and for MUSE.

Fig. 1. ROC curves for the three Models. (a) Model 1. (b) Model 2. (c) Model 3.

Suppose the following signal model:

H_0: x = n,        x_i = n_i,  i = 1, ..., n,
H_1: x = γ d + n,  x_i = n_i,  i = 1, ..., n,   (12)

where n_1, ..., n_n are n i.i.d. p-vectors, each following a multivariate Normal distribution N(0, Σ), and d is an unknown steering vector which denotes the presence of an anomalous signal with unknown amplitude γ > 0. After some calculation (refer to [26] and to Subsection II-B and the remark in Section II of [35] for details), the Kelly anomaly detector is described as follows:

D_KellyAD(x) = x^T Σ̂^{-1} x  ≷_{H_0}^{H_1}  δ,   (13)

where δ is a prescribed threshold value. In the following two subsections, the detection performances of the estimators, when plugged into D_KellyAD, are evaluated by the Receiver Operating Characteristic (ROC) curves and their corresponding Area Under Curve (AUC) values.

Fig. 2. (a) MUSE HSI (average). (b) ROC curves for MUSE. (c) Legend.

A. Monte-Carlo simulations

The experiments are conducted on three covariance models:
Model 1: Σ = I, the identity matrix;
Model 2: the first-order autoregressive model AR(1), Σ = [σ_{g,l}]_{p×p}, where σ_{g,l} = c^{|g−l|} for a constant 0 < c < 1;
Model 3: Σ = [σ_{g,l}]_{p×p}, where σ_{g,l} = (1 − |g−l|/r)_+ for r = p/2: the triangular matrix.

Model 1 is very sparse and Model 2 is approximately sparse. Model 3 with r = p/2 is considered the least sparse [21] among the three models we consider.
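The Kelly detection statistic used throughout these experiments can be sketched as follows (our own minimal illustration; the function name is ours):

```python
import numpy as np

def kelly_ad(x, Sigma_hat):
    """Kelly anomaly detection statistic D(x) = x^T Sigma_hat^{-1} x;
    x is declared anomalous when the statistic exceeds a threshold delta."""
    x = np.asarray(x, dtype=float)
    return float(x @ np.linalg.solve(Sigma_hat, x))

# With Sigma_hat = I the statistic is just ||x||^2.
stat = kelly_ad(np.ones(4), np.eye(4))   # -> 4.0
```

In practice, Σ̂ would be one of the covariance estimators above, computed from the n secondary data surrounding the test pixel, and δ fixed to reach a desired false-alarm rate.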
The computations have been made through a number of Monte-Carlo trials, and the ROC curves are drawn for a signal-to-noise ratio equal to 5 dB. We choose n = 80 for covariance estimation under the Gaussian assumption, and set p = 60. The artificial anomaly we consider is a vector containing normally distributed pseudorandom numbers (to have fair results, the same vector is used for the three models). The ROC curves for Models 1, 2 and 3 are shown in Fig. 1(a), (b) and (c), respectively, and their corresponding AUC values are presented in Table 1. For a clear presentation of the figures, we only exhibit the ROC curves for a subset of the estimators. The highest AUC values are shown in bold in Table 1. For both Model 1 and Model 2, our estimators significantly improve the detection performances compared to those of the traditional estimators (Σ̂_SCM, Σ̂_OLS), and have competitive detection results with state-of-the-art. An important finding is that even for a non-sparse covariance model (that is, Model 3), our estimators do no harm to the detection when compared to Σ̂_SCM and Σ̂_OLS. Although some of our estimators have slightly lower AUC values than the traditional ones for Model 3, this is a negligible degradation of the detection, so considering that they have no worse detection results than the traditional estimators is still acceptable.

B. Application on experimental data

Our estimators are now evaluated for galaxy detection on the Multi Unit Spectroscopic Explorer (MUSE) data cube (see [36]). It consists of 3600 bands at wavelengths ranging from 465 to 930 nm; we used one band out of each 60, so that 60 bands are retained in total. Figure 2(a) exhibits the mean power in dB over the 60 bands. The covariance matrix is estimated using a sliding window of size 9 × 9, giving n = 80 secondary data (after excluding only the test pixel). The mean has been removed from the given HSI. Figure 2(b) exhibits the ROC curves [37] of our estimators when compared to some others, and their AUC values are shown in Table 1. Note that the curves for Σ̂_SMT, Σ̂_SCM^Soft and Σ̂_SCM^SCAD are not drawn, but their AUC values are shown in Table 1.
The estimators Σ̂_OLS^Soft and Σ̂_OLS^SCAD achieve higher detection results than all the others, whereas both Σ̂_L1 and Σ̂_SCAD achieve only slightly lower AUC values than B_m(Σ̂_SCM).

V. CONCLUSION

Two methods are outlined to impose sparsity on the covariance matrix via its Cholesky factor T. Some Monte-Carlo simulations as well as experimental data demonstrate the effectiveness (in terms of anomaly detection) of the two proposed methods using the Kelly anomaly detector.

REFERENCES

[1] G. Shaw and D. Manolakis, "Signal processing for hyperspectral image exploitation," IEEE Signal Processing Magazine, vol. 19, no. 1, pp. 12–16, Jan. 2002.
[2] D. Manolakis, D. Marden, and G. Shaw, "Hyperspectral image processing for automatic target detection applications," Lincoln Laboratory Journal, vol. 14, no. 1, pp. 79–116, 2003.
[3] D. G. Manolakis, R. B. Lockwood, and T. W. Cooley, Hyperspectral Imaging Remote Sensing: Physics, Sensors, and Algorithms. Cambridge University Press, 2016.
[4] N. K. Patel, C. Patnaik, S. Dutta, A. M. Shekh, and A. J. Dave, "Study of crop growth parameters using airborne imaging spectrometer data," International Journal of Remote Sensing, vol. 22, no. 12, pp. 2401–2411, 2001.
[5] B. Datt, T. R. McVicar, T. G. van Niel, D. L. B. Jupp, and J. S. Pearlman, "Preprocessing EO-1 Hyperion hyperspectral data to support the application of agricultural indexes," IEEE Transactions on Geoscience and Remote Sensing, vol. 41, pp. 1246–1259, Jun. 2003.
[6] B. Hörig, F. Kühn, F. Oschütz, and F. Lehmann, "HyMap hyperspectral remote sensing to detect hydrocarbons," International Journal of Remote Sensing, vol. 22, pp. 1413–1422, May 2001.
[7] D. Manolakis and G. Shaw, "Detection algorithms for hyperspectral imaging applications," IEEE Signal Processing Magazine, vol. 19, no. 1, pp. 29–43, 2002.
[8] D. W. J. Stein, S. G. Beaven, L. E. Hoff, E. M. Winter, A. P. Schaum, and A. D. Stocker, "Anomaly detection from hyperspectral imagery," IEEE Signal Processing Magazine, vol. 19, no. 1, pp. 58–69, Jan. 2002.
[9] M. T. Eismann, A. D. Stocker, and N. M. Nasrabadi, "Automated hyperspectral cueing for civilian search and rescue," Proceedings of the IEEE, vol. 97, no. 6, pp. 1031–1055, June 2009.
[10] D. Manolakis, E. Truslow, M. Pieper, T. Cooley, and M. Brueggeman, "Detection algorithms in hyperspectral imaging systems: An overview of practical algorithms," IEEE Signal Processing Magazine, vol. 31, no. 1, pp. 24–33, Jan. 2014.
[11] D. Manolakis, R. Lockwood, T. Cooley, and J. Jacobson, "Is there a best hyperspectral detection algorithm?" Proc. SPIE 7334, p. 733402, 2009.
[12] J. Frontera-Pons, F. Pascal, and J. P. Ovarlez, "False-alarm regulation for target detection in hyperspectral imaging," in 2013 IEEE 5th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Dec. 2013, pp. 161–164.
[13] J. Frontera-Pons, M. A. Veganzones, S. Velasco-Forero, F. Pascal, J. P. Ovarlez, and J. Chanussot, "Robust anomaly detection in hyperspectral imaging," in 2014 IEEE Geoscience and Remote Sensing Symposium, July 2014, pp. 4604–4607.
[14] S. Matteoli, M. Diani, and G. Corsini, "A tutorial overview of anomaly detection in hyperspectral images," IEEE Aerospace and Electronic Systems Magazine, vol. 25, no. 7, pp. 5–28, 2010.
[15] I. S. Reed and X. Yu, "Adaptive multiple-band CFAR detection of an optical pattern with unknown spectral distribution," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, pp. 1760–1770, Oct. 1990.
[16] E. J. Kelly, "An adaptive detection algorithm," IEEE Transactions on Aerospace and Electronic Systems, vol. AES-22, no. 1, pp. 115–127, March 1986.
[17] C.-I. Chang and S.-S. Chiang, "Anomaly detection and classification for hyperspectral imagery," IEEE Transactions on Geoscience and Remote Sensing, vol. 40, no. 6, pp. 1314–1325, Jun. 2002.
[18] J. C. Harsanyi, "Detection and classification of subpixel spectral signatures in hyperspectral image sequences," Ph.D. dissertation, Dept. Electr. Eng., Univ. Maryland, Baltimore, MD, USA, 1993.
[19] J. Frontera-Pons, F. Pascal, and J. P. Ovarlez, "Adaptive nonzero-mean Gaussian detection," IEEE Transactions on Geoscience and Remote Sensing, vol. 55, no. 2, pp. 1117–1124, Feb. 2017.
[20] M. Pourahmadi, "Joint mean-covariance models with applications to longitudinal data: Unconstrained parameterisation," Biometrika, vol. 86, no. 3, pp. 677–690, 1999.
[21] P. J. Bickel and E. Levina, "Regularized estimation of large covariance matrices," The Annals of Statistics, vol. 36, no. 1, pp. 199–227, 2008.
[22] A. J. Rothman, E. Levina, and J. Zhu, "Generalized thresholding of large covariance matrices," Journal of the American Statistical Association, vol. 104, no. 485, pp. 177–186, 2009.
[23] G. Cao and C. Bouman, "Covariance estimation for high dimensional data vectors using the sparse matrix transform," in Advances in Neural Information Processing Systems. Curran Associates, Inc., 2009, pp. 225–232.
[24] W. B. Wu and M. Pourahmadi, "Nonparametric estimation of large covariance matrices of longitudinal data," Biometrika, vol. 90, no. 4, pp. 831–844, 2003.
[25] J. Z. Huang, N. Liu, M. Pourahmadi, and L. Liu, "Covariance matrix selection and estimation via penalised normal likelihood," Biometrika, vol. 93, no. 1, pp. 85–98, 2006.
[26] E. J. Kelly, "An adaptive detection algorithm," IEEE Transactions on Aerospace and Electronic Systems, vol. AES-22, no. 1, pp. 115–127, November 1986.
[27] J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001.
[28] P. Gong, C. Zhang, Z. Lu, J. Huang, and J. Ye, "GIST: General iterative shrinkage and thresholding for non-convex sparse learning," 2013.
[29] J. Barzilai and J. M. Borwein, "Two-point step size gradient methods," IMA Journal of Numerical Analysis, vol. 8, no. 1, pp. 141–148, 1988.
[30] T. Zhang, "Analysis of multi-stage convex relaxation for sparse regularization," Journal of Machine Learning Research, vol. 11, pp. 1081–1107, Mar. 2010.
[31] T. Zhang, "Multi-stage convex relaxation for feature selection," Bernoulli, vol. 19, no. 5B, pp. 2277–2293, 2013.
[32] P. Gong, J. Ye, and C. Zhang, "Multi-stage multi-task feature learning," in Advances in Neural Information Processing Systems 25. Curran Associates, Inc., 2012, pp. 1988–1996.
[33] E. Candès, M. B. Wakin, and S. P. Boyd, "Enhancing sparsity by reweighted l1 minimization," Journal of Fourier Analysis and Applications, vol. 14, no. 5, pp. 877–905, 2008.
[34] C.-H. Zhang, "Nearly unbiased variable selection under minimax concave penalty," The Annals of Statistics, vol. 38, no. 2, pp. 894–942, 2010.
[35] J. Frontera-Pons, M. A. Veganzones, F. Pascal, and J. P. Ovarlez, "Hyperspectral anomaly detectors using robust estimators," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 9, no. 2, pp. 720–731, Feb. 2016.
[36] Official website of the MUSE Project, http://muse.univ-lyon.fr/.
[37] Y. Zhang, B. Du, and L. Zhang, "A sparse representation-based binary hypothesis model for target detection in hyperspectral images," IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 3, pp. 1346–1354, March 2015.