Asymptotic Achievability of the Cramér–Rao Bound for Noisy Compressive Sampling


The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters.

Citation: Babadi, Behtash, Nicholas Kalouptsidis, and Vahid Tarokh. 2009. Asymptotic achievability of the Cramér–Rao bound for noisy compressive sampling. IEEE Transactions on Signal Processing 57(3): 1233-1236.
Published Version: http://dx.doi.org/10.1109/tsp.2008.2010379
Citable link: http://nrs.harvard.edu/urn-3:hul.instrepos:2757546
Terms of Use: This article was downloaded from Harvard University's DASH repository, and is made available under the terms and conditions applicable to Open Access Policy Articles, as set forth at http://nrs.harvard.edu/urn-3:hul.instrepos:dash.current.terms-of-use#OAP

Asymptotic Achievability of the Cramér–Rao Bound for Noisy Compressive Sampling

Behtash Babadi, Nicholas Kalouptsidis, and Vahid Tarokh, Fellow, IEEE

Abstract: We consider a model of the form $y = Ax + n$, where $x \in \mathbb{R}^M$ is sparse with at most $L$ nonzero coefficients in unknown locations, $y \in \mathbb{R}^N$ is the observation vector, $A$ is the $N \times M$ measurement matrix and $n$ is the Gaussian noise. We develop a Cramér–Rao bound on the mean squared estimation error of the nonzero elements of $x$, corresponding to the genie-aided estimator (GAE), which is provided with the locations of the nonzero elements of $x$. Intuitively, the mean squared estimation error of any estimator without the knowledge of the locations of the nonzero elements of $x$ is no less than that of the GAE. Assuming that $L$ is fixed, we establish the existence of an estimator that asymptotically achieves the Cramér–Rao bound without any knowledge of the locations of the nonzero elements of $x$, for a random Gaussian matrix $A$ whose elements are drawn i.i.d. according to $\mathcal{N}(0, 1)$.

Index Terms: Compressive sampling, information theory, parameter estimation.

I. INTRODUCTION

We consider the problem of estimating a sparse vector based on noisy observations. Suppose that we have a compressive sampling (please see [2] and [5]) model of the form

$$y = Ax + n \qquad (1)$$

where $x \in \mathbb{R}^M$ is the unknown sparse vector to be estimated, $y \in \mathbb{R}^N$ is the observation vector, $n \sim \mathcal{N}(0, \sigma^2 I_N)$ is the Gaussian noise and $A$ is the $N \times M$ measurement matrix. Suppose that $x$ is sparse, i.e., $\|x\|_0 \le L \ll M$. Let $L$ and $\sigma^2$ be fixed numbers. The estimator must both estimate the locations and the values of the nonzero elements of $x$. If a Genie provides us with the set $I := \{i : x_i \neq 0\}$, the problem reduces to estimating the values of the nonzero elements of $x$ alone.

Manuscript received May 19, 2008; revised October 18, 2008. First published December 02, 2008; current version published February 13, 2009. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Athanasios P. Liavas. B. Babadi and V. Tarokh are with the School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138 USA (e-mail: behtash@seas.harvard.edu; vahid@seas.harvard.edu). N. Kalouptsidis is with the Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens 157 73, Greece (e-mail: kalou@di.uoa.gr). Digital Object Identifier 10.1109/TSP.2008.2010379
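To make the setup concrete, the following Python sketch (an illustration only; the parameter values and function names are ours, not the paper's) draws one instance of the model (1): a Gaussian measurement matrix, a vector $x$ with $L$ nonzero entries in unknown locations, and the noisy observation $y$.

```python
# A minimal sketch of the sampling model (1), y = A x + n.
# Hypothetical parameter values; not taken from the correspondence.
import numpy as np

rng = np.random.default_rng(0)

def sample_model(N=64, M=256, L=4, sigma2=0.1):
    """Draw one instance (A, x, y, I) of the noisy compressive sampling model."""
    A = rng.standard_normal((N, M))                    # entries i.i.d. N(0, 1)
    I = np.sort(rng.choice(M, size=L, replace=False))  # unknown support
    x = np.zeros(M)
    x[I] = rng.standard_normal(L)                      # the L nonzero values
    n = np.sqrt(sigma2) * rng.standard_normal(N)
    y = A @ x + n
    return A, x, y, I
```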

We call the estimator for this reduced problem the genie-aided estimator (GAE). Clearly, the mean-square estimation error (MSE) of any estimator is no less than that of the GAE (see [3]), since the GAE does not need to estimate the locations of the nonzero elements of $x$ (roughly $M H(L/M)$ bits, where $H(\cdot)$ is the binary entropy function).

Recently, Haupt and Nowak [6] and Candès and Tao [3] have proposed estimators which achieve the estimation error of the GAE up to a factor of $O(\log M)$. In [6], a measurement matrix based on Rademacher projections is constructed and an iterative bound-optimization recovery procedure is proposed. Each step of the procedure requires on the order of $MN$ operations and the iterations are repeated until convergence is achieved. It has been shown that the estimator achieves the estimation error of the GAE up to a factor of $O(\log M)$. Candès and Tao [3] have proposed an estimator based on linear programming, namely the Dantzig Selector, which achieves the estimation error of the GAE up to a factor of $O(\log M)$ for Gaussian measurement matrices. The Dantzig Selector can be recast as a linear program and can be efficiently solved by the well-known primal-dual interior point methods, as suggested in [3]. Each iteration requires solving an $M \times M$ system of linear equations and the iterations are repeated until convergence is attained.

In this correspondence, we construct an estimator based on Shannon theory and the notion of typicality [4] that asymptotically achieves the Cramér–Rao bound on the estimation error of the GAE without the knowledge of the locations of the nonzero elements of $x$, for Gaussian measurement matrices. Although the estimator presented in this correspondence has higher complexity (exponential) compared to the estimators in [3] and [6], to the best of our knowledge it is the first result establishing the achievability of the Cramér–Rao bound for noisy compressive sampling. The problem of finding efficient low-complexity estimators that achieve the Cramér–Rao bound for noisy compressive sampling remains open.

The outline of this correspondence follows next. In Section II, we state the main result of this correspondence, and we present its proof in Section III. We then discuss the implications of our results in Section IV.

II. MAIN RESULT

The main result of this correspondence is the following:

Main Theorem: In the compressive sampling model of (1), let $A$ be an $N \times M$ measurement matrix whose elements are i.i.d. distributed according to $\mathcal{N}(0, 1)$. Let $C_N := \sigma^2\, \mathbb{E}_A\{\operatorname{tr}[(A_I' A_I)^{-1}]\}$ be the Cramér–Rao bound on the mean squared error of the GAE (averaged over $A$), and let $L$ be a fixed number. If $M$ grows polynomially in $N$, then, assuming that the locations of the nonzero elements of $x$ are not known, there exists an estimator $\hat{x}$ (namely, the joint typicality estimator, given explicitly in this correspondence) for the nonzero elements of $x$ with mean-square error $\mathbb{E}\{\|\hat{x} - x\|_2^2\}$, such that

$$\lim_{N \to \infty} \frac{\mathbb{E}\{\|\hat{x} - x\|_2^2\}}{C_N} = 1. \qquad (2)$$

III. PROOF OF THE MAIN THEOREM

In order to establish the Main Result, we need to define the joint typicality estimator. We first state the following lemma for random matrices:

Lemma 3.1: Let $A$ be an $N \times M$ measurement matrix whose elements are i.i.d. distributed according to $\mathcal{N}(0, 1)$, and let $A_J$ be the $N \times |J|$ submatrix of $A$ with columns corresponding to the index set $J$. Then $\frac{1}{N} A_J' A_J \to I_{|J|}$ as $N \to \infty$, with probability 1.

Proof: Let $a_i$ and $a_j$ be any two columns of $A_J$. The law of large numbers implies $\frac{1}{N}\, a_i' a_j \to \delta_{ij}$ with probability 1 as $N \to \infty$. Thus, the columns of $A_J$ are asymptotically mutually orthogonal with probability 1, which proves the statement of the lemma.

We can now define the notion of joint typicality. We adopt the definition from [1]:

Definition 3.2: We say that an $N \times 1$ noisy observation vector $y = Ax + n$ and a set of indices $J \subset \{1, \ldots, M\}$ with $|J| = L$ are $\delta$-jointly typical if $\operatorname{rank}(A_J) = L$ and

$$\left| \frac{\|\Pi_J^{\perp} y\|_2^2}{N} - \frac{N - L}{N}\, \sigma^2 \right| < \delta \qquad (3)$$

where $\Pi_J^{\perp} := I_N - A_J (A_J' A_J)^{-1} A_J'$ is the projection onto the orthogonal complement of the span of the columns of $A_J$, and $A_J$ is the submatrix of $A$ with columns corresponding to the index set $J$. We denote the event that $y$ and $J$ are $\delta$-jointly typical by $E_J$.

Note that we can make the assumption $\operatorname{rank}(A_J) = L$ without loss of generality, based on Lemma 3.1.

Definition 3.3 (Joint Typicality Estimator): The joint typicality estimator finds a set of indices $\hat{I}$, with $|\hat{I}| = L$, which is $\delta$-jointly typical with $y$, by projecting $y$ onto all the possible $L$-dimensional subspaces spanned by $L$ columns of $A$ and choosing one satisfying (3). It then produces the estimate $\hat{x}$ by projecting $y$ onto the subspace spanned by the columns of $A_{\hat{I}}$:

$$\hat{x}_{\hat{I}} = (A_{\hat{I}}' A_{\hat{I}})^{-1} A_{\hat{I}}'\, y \qquad (4)$$

with $\hat{x}_i = 0$ for $i \notin \hat{I}$. If the estimator does not find any set $\delta$-jointly typical with $y$, it outputs the zero vector as the estimate. We denote this event by $E_0$.
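Definition 3.3 translates directly into code. The brute-force sketch below (an illustration under the notation above, not an implementation from the correspondence) scans every index set $J$ of size $L$, tests the $\delta$-typicality condition (3) on the residual of the orthogonal projection of $y$ onto the span of $A_J$, and returns the least-squares estimate (4) on a typical set; scanning all $\binom{M}{L}$ subsets is what makes the estimator's complexity exponential, so it is feasible only for small $M$.

```python
# A brute-force sketch of the joint typicality estimator (Definition 3.3).
# Names and defaults are ours; feasible only for small M.
from itertools import combinations

import numpy as np

def joint_typicality_estimate(A, y, L, sigma2, delta):
    N, M = A.shape
    for J in combinations(range(M), L):
        A_J = A[:, J]
        coef, *_ = np.linalg.lstsq(A_J, y, rcond=None)  # project y onto span(A_J)
        resid = y - A_J @ coef                          # (I - P_J) y
        # delta-joint typicality check, condition (3)
        if abs(resid @ resid / N - (N - L) * sigma2 / N) < delta:
            x_hat = np.zeros(M)
            x_hat[list(J)] = coef                       # estimate (4)
            return x_hat
    return np.zeros(M)  # event E_0: no delta-typical index set was found
```

Paired with the sampling sketch above, `joint_typicality_estimate(A, y, L=4, sigma2=0.1, delta=0.02)` returns $\hat{x}$; a smaller $\delta$ makes the test stricter, trading the probability of rejecting the true support against that of accepting a wrong one (Lemma 3.6 below).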

We show that the joint typicality estimator has the property stated in the Main Theorem. In order to prove this claim, we first establish the following lemmas.

Lemma 3.4: For any unbiased estimator $\hat{x}_I$ of $x_I$,

$$\mathbb{E}\{\|\hat{x}_I - x_I\|_2^2\} \ge \sigma^2 \operatorname{tr}[(A_I' A_I)^{-1}] \qquad (5)$$

where $A_I$ is the submatrix of $A$ with columns corresponding to the index set $I$.

Proof: Assuming that a Genie provides us with $I$, we have

$$y = A_I x_I + n \qquad (6)$$

where $x_I$ is the subvector of $x$ with elements corresponding to the index set $I$. The Fisher information matrix is then given by

$$\mathcal{J}(x_I) = \frac{1}{\sigma^2}\, A_I' A_I. \qquad (7)$$

Therefore, for any unbiased estimator $\hat{x}_I$ [7],

$$\mathbb{E}\{\|\hat{x}_I - x_I\|_2^2\} \ge \operatorname{tr}[\mathcal{J}^{-1}(x_I)] = \sigma^2 \operatorname{tr}[(A_I' A_I)^{-1}] \qquad (8)$$

by the Cramér–Rao bound.

We note that Lemma 3.4 is also presented in [3].

Lemma 3.5: Let $A$ be an $N \times L$ matrix whose elements are i.i.d. distributed according to $\mathcal{N}(0, 1)$. Then

$$\lim_{N \to \infty} \frac{N}{L}\, \mathbb{E}_A\{\operatorname{tr}[(A' A)^{-1}]\} = 1 \qquad (9)$$

with $L$ fixed.

Proof: Let $G := A'A$, with entries

$$G_{ij} = \sum_{k=1}^{N} a_{ki}\, a_{kj}. \qquad (10)$$

Since the elements of $A$ are i.i.d. normally distributed, we can invoke the law of large numbers as follows:

$$\frac{1}{N}\, G_{ij} \to 0 \text{ with probability 1, as } N \to \infty, \text{ for } i \neq j \qquad (11)$$

and

$$\frac{1}{N}\, G_{ii} \to 1 \text{ with probability 1, as } N \to \infty. \qquad (12)$$

We note that $\frac{1}{N} G$ converges to $I_L$. Therefore

$$\lim_{N \to \infty} \frac{1}{N}\, G = I_L \text{ with probability 1} \qquad (13)$$

and $\frac{1}{N} G$ is non-singular with probability 1, since it converges to $I_L$ element-wise with probability 1. Thus, the limit of the inverse exists with probability 1. Hence

$$\lim_{N \to \infty} N\, (A'A)^{-1} = I_L \text{ with probability 1.} \qquad (14)$$

Taking the trace of both sides of the above equation, along with the linearity of the trace operator, proves the statement of the lemma.

Lemma 3.6 (Lemma 3.3 of [1]): For any $\delta > 0$,

$$P\{E_I^c\} \le \exp\{-N f_\delta(L)\} \qquad (15)$$

and

$$P\{E_J\} \le \exp\{-N f_\delta(i)\} \qquad (16)$$

where $J$ is an index set such that $|J| = L$ and $|I \cap J| = i < L$, and where, for each $i$, $f_\delta(i) > 0$ is a rate depending only on $\delta$, $\sigma^2$ and the energy $\|x_{I \setminus J}\|_2^2$ of the components of $x$ missed by $J$ (the explicit rates are given in [1]; only their positivity is needed in what follows).

Proof: Please refer to [1] for the proof.
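Before turning to the proof, Lemmas 3.4 and 3.5 can be checked numerically. The Monte Carlo sketch below (hypothetical parameter values; not from the correspondence) draws Gaussian matrices $A_I$, forms the genie-aided least-squares estimate, and compares its empirical MSE with $\sigma^2 \operatorname{tr}[(A_I' A_I)^{-1}]$ averaged over $A$ and with the asymptotic value $\sigma^2 L / N$ from Lemma 3.5.

```python
# A Monte Carlo sketch of Lemmas 3.4-3.5: the genie-aided least-squares
# estimate attains MSE = sigma^2 tr[(A_I' A_I)^{-1}] for each A, and the
# average over Gaussian A approaches sigma^2 L / N for large N with L fixed.
import numpy as np

rng = np.random.default_rng(1)
N, L, sigma2, trials = 512, 4, 0.1, 2000
x_I = rng.standard_normal(L)                  # the L nonzero values of x
crb_avg = mse_avg = 0.0
for _ in range(trials):
    A_I = rng.standard_normal((N, L))         # genie knows the support I
    y = A_I @ x_I + np.sqrt(sigma2) * rng.standard_normal(N)
    G_inv = np.linalg.inv(A_I.T @ A_I)
    x_hat = G_inv @ A_I.T @ y                 # genie-aided least squares
    crb_avg += sigma2 * np.trace(G_inv) / trials
    mse_avg += np.sum((x_hat - x_I) ** 2) / trials
print(f"empirical GAE MSE     : {mse_avg:.6f}")
print(f"CRB averaged over A   : {crb_avg:.6f}")
print(f"asymptote sigma2*L/N  : {sigma2 * L / N:.6f}")
```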

Proof of the Main Theorem: We can upper-bound the MSE of the joint typicality estimator, averaged over all possible Gaussian measurement matrices, as follows:

$$\mathbb{E}\{\|\hat{x} - x\|_2^2\} \le \int \Big[ \|x\|_2^2\, P\{E_0\} + \sum_{J : |J| = L} \mathbb{E}\{\|\hat{x} - x\|_2^2 \mid E_J\}\, P\{E_J\} \Big]\, dP(A) \qquad (17)$$

where $P\{\cdot\}$ denotes the event probability defined over the noise density, $dP(A)$ denotes the Gaussian probability measure on $A$, and the inequality follows from the union bound. Taking the term corresponding to $J = I$ out of the summation, we can rewrite (17) as

$$\mathbb{E}\{\|\hat{x} - x\|_2^2\} \le \int \|x\|_2^2\, P\{E_0\}\, dP(A) + \int \mathbb{E}\{\|\hat{x} - x\|_2^2 \mid E_I\}\, P\{E_I\}\, dP(A) + \int \sum_{J \neq I} \mathbb{E}\{\|\hat{x} - x\|_2^2 \mid E_J\}\, P\{E_J\}\, dP(A). \qquad (18)$$

The first term on the right-hand side of (18) can be upper-bounded as

$$\int \|x\|_2^2\, P\{E_0\}\, dP(A) \le \|x\|_2^2\, \exp\{-N f_\delta(L)\} \qquad (19)$$

where the inequality follows from Lemma 3.6, since $E_0 \subseteq E_I^c$. This term clearly tends to zero as $N \to \infty$. Using Lemma 3.5, the second term on the right-hand side of (18) can be expressed as follows:

$$\int \mathbb{E}\{\|\hat{x} - x\|_2^2 \mid E_I\}\, P\{E_I\}\, dP(A) \le \sigma^2\, \mathbb{E}_A\{\operatorname{tr}[(A_I' A_I)^{-1}]\} = \frac{\sigma^2 L}{N}\, (1 + o(1)). \qquad (20)$$

Finally, the third term on the right-hand side of (18) can be upper-bounded as

$$\int \sum_{J \neq I} \mathbb{E}\{\|\hat{x} - x\|_2^2 \mid E_J\}\, P\{E_J\}\, dP(A) \le \sum_{i=0}^{L-1} \sum_{J : |I \cap J| = i} \Big( \sum_j x_j^2 + \sigma^2\, \mathbb{E}_A\{\operatorname{tr}[(A_J' A_J)^{-1}]\} \Big) \exp\{-N f_\delta(i)\} \qquad (21)$$

where $x_j$ denotes the $j$th component of $x$. The first inequality follows from a trivial upper bound on the conditional error variance, namely the full energy $\sum_j x_j^2$ of $x$ plus the variance of the projected noise, and the second inequality follows from Lemma 3.6. The number of index sets $J$ such that $|I \cap J| = i$ is upper-bounded by $\binom{L}{i} \binom{M - L}{L - i} \le 2^L M^{L - i}$. Also, by Lemma 3.5, $\sigma^2\, \mathbb{E}_A\{\operatorname{tr}[(A_J' A_J)^{-1}]\} = \frac{\sigma^2 L}{N}(1 + o(1))$. Therefore, we can rewrite the right-hand side of (21) as

$$\sum_{i=0}^{L-1} 2^L M^{L-i} \left( \|x\|_2^2 + \frac{\sigma^2 L}{N}(1 + o(1)) \right) \exp\{-N f_\delta(i)\}. \qquad (22)$$

We use the identity $M^{L-i} = \exp\{(L-i)\log M\}$ and the inequality $1 + u \le e^u$ in order to upper-bound the $i$th term of the summation in (22) by

$$c_0\, 2^L \exp\{-g(i)\} \qquad (23)$$

where

$$c_0 := \|x\|_2^2 + \sigma^2 L. \qquad (24)$$

We define

$$g(i) := N f_\delta(i) - (L - i) \log M. \qquad (25)$$

By Lemmas 3.4, 3.5 and 3.6 of [1], it can be shown that the summand asymptotically attains its maximum at either $i = 0$ or $i = L - 1$, if $\delta$ is chosen small enough. Thus, we can upper-bound the right-hand side of (21) by

$$L\, c_0\, 2^L \max\{\exp[-g(0)],\ \exp[-g(L-1)]\}. \qquad (26)$$

It is straightforward to obtain

$$g(0) = N f_\delta(0) - L \log M \qquad (27)$$

and

$$g(L-1) = N f_\delta(L-1) - \log M. \qquad (28)$$

Since $f_\delta(0)$ and $f_\delta(L-1)$ are positive constants, due to the assumption that $L$ and $\sigma^2$ are fixed, both $N f_\delta(0)$ and $N f_\delta(L-1)$ will grow to infinity linearly in $N$. Hence, the exponents in (26) will grow to infinity, and the third term vanishes, as long as $M$ grows polynomially in $N$. Thus, the MSE of the joint typicality estimator is asymptotically upper-bounded by

$$\frac{\sigma^2 L}{N}. \qquad (29)$$

By Lemma 3.5, we know that this is the asymptotic Cramér–Rao bound on the mean squared estimation error of the GAE, which proves the statement of the Main Theorem.

IV. DISCUSSION OF THE MAIN RESULT

We consider the problem of estimating a sparse vector $x \in \mathbb{R}^M$, with $\|x\|_0 \le L$, using noisy observations $y = Ax + n$. Let $I := \{i : x_i \neq 0\}$. The estimator must both estimate the locations and the values of the nonzero elements of $x$. Clearly, the mean squared estimation error of any estimator is no less than that of the genie-aided estimator, for which the locations of the nonzero elements of $x$ are known. We have constructed a joint typicality estimator that asymptotically achieves the Cramér–Rao bound on the mean squared estimation error of the GAE without any knowledge of the locations of the nonzero elements of $x$, for a fixed number $L$ of nonzero elements.

REFERENCES

[1] M. Akçakaya and V. Tarokh, "Shannon theoretic limits on noisy compressive sampling," IEEE Trans. Inf. Theory, [Online]. Available: www.arxiv.org, submitted for publication.
[2] E. Candès and T. Tao, "Decoding by linear programming," IEEE Trans. Inf. Theory, vol. 51, no. 12, pp. 4203-4215, Dec. 2005.
[3] E. Candès and T. Tao, "The Dantzig selector: Statistical estimation when p is much larger than n," Ann. Statist., vol. 35, no. 6, pp. 2313-2351, Dec. 2007.
[4] T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd ed. New York: Wiley, 2006.
[5] D. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289-1306, Apr. 2006.
[6] J. Haupt and R. Nowak, "Signal reconstruction from noisy random projections," IEEE Trans. Inf. Theory, vol. 52, no. 9, Sep. 2006.
[7] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I, 1st ed. New York: Wiley, 2001.