4284 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007

Size: px

Start display at page:

Download "4284 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007"

Angelina Davis
5 years ago
Views:

1 4284 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007 Distributed Estimation Using Reduced- Dimensionality Sensor Observations Ioannis D. Schizas, Student Member, IEEE, Georgios B. Giannakis, Fellow, IEEE, Zhi-Quan Luo, Fellow, IEEE Abstract We derive linear estimators of stationary rom signals based on reduced-dimensionality observations collected at distributed sensors communicated to a fusion center over wireless links. Dimensionality reduction compresses sensor data to meet low-power bwidth constraints, while linearity in compression estimation are well motivated by the limited computing capabilities wireless sensor networks are envisioned to operate with, by the desire to estimate rom signals from observations with unknown probability density functions. In the absence of fading fusion center noise (ideal links), we cast this intertwined compression-estimation problem in a canonical correlation analysis framework derive closed-m mean-square error (MSE) optimal estimators along with coordinate descent suboptimal alternatives that guarantee convergence at least to a stationary point. Likewise, we develop estimators based on reduced-dimensionality sensor observations in the presence of fading additive noise at the fusion center (nonideal links). Permance analysis corroborating simulations demonstrate the merits of the novel distributed estimators relative to existing alternatives. Index Terms Canonical correlation analysis (CCA), distributed compression, distributed estimation, nonlinear optimization, wireless sensor networks (WSNs). I. INTRODUCTION WITH the popularity of battery-powered wireless sensor networks (WSNs), distributed estimation relying on sensor data processed at a fusion center (FC) has attracted increasing interest recently. Constrained by limited power bwidth resources, existing approaches either take advantage of spatial correlations across sensor data to reduce transmission requirements [2], [5], [11], [15], [16], or, rely on severely quantized (possibly down to one bit) digital WSN data to m distributed estimators of deterministic parameters, see, e.g., Manuscript received November 6, 2005; revised August 10, The associate editor coordinating the review of this manuscript approving it publication was Prof. Javier Garcia-Frias. Prepared through collaborative participation in the Communications Networks Consortium sponsored by the U. S. Army Research Laboratory under the Collaborative Technology Alliance Program, Cooperative Agreement DAAD The work of Z.-Q. Luo is supported by the U.S. DoD Army, grant number W911NF The U.S. Government is authorized to reproduce distribute reprints Government purposes notwithsting any copyright notation thereon. Parts of the paper were presented at the Thirty-Ninth Asilomar Conference, Pacific Grove, CA, Oct. 30 November 2, 2005, at the 2006 International Conference on Acoustics, Speech Signal Processing (ICASSP), Toulouse, France, May 14 19, The authors are with the Department of Electrical Computer Engineering, University of Minnesota, Minneapolis, MN USA ( schizas@ece. umn.edu; georgios@ece.umn.edu; luozq@ece.umn.edu). Color versions of one or more of the figures in this paper are available online at Digital Object Identifier /TSP [8], [10], [12], references therein. Distributed estimation of rom signals has also been considered by [5], [9], [14] [16], but results are restricted by one or more of the following assumptions: i) Gaussian signals /or sensor data; ii) linear sensor observation models; iii) ideal links; i.e., absence of fading in the sensor-fc channels /or additive noise at the FC. Overcoming limitations i) ii), our goal in this paper is to m estimates at the FC of a rom stationary vector based on analog-amplitude multisensor observations. To enable estimation under the stringent power computing limitations of WSNs develop methods that do not require knowledge of the sensor data probability density function (pdf), which, in a number of cases, may not be available, we seek linear dimensionality reducing operators (data compressing matrices) per sensor along with linear operators at the FC in order to minimize the mean-square error (MSE) in estimation. We treat first the ideal channel case, where we mulate this intertwined compression-estimation task as a canonical correlation analysis (CCA) problem. CCA is a well-documented tool data model reduction problems encountered in various applications such as statistical data analysis, control, signal processing, to name a few [4, Ch. 10]. But our contribution here is to demonstrate that CCA provides a natural framework estimating rom signals based on reduced-dimensionality WSN observations. The resultant estimators apply to possibly nonlinear non-gaussian setups can be generalized to incorporate channel fading as well as FC noise effects, which necessitate tackling distributed CCA problems under a prescribed power budget per sensor. Specifically, we establish that with either decoupled or coupled multisensor observations communicated to the FC through ideal links, the problem mulation (Section II) lends itself naturally to CCA. In the decoupled case, we prove that the optimal solution amounts to compressing, via principal components analysis (PCA), the linear minimum mean-square error (LMMSE) signal estimate med at each sensor (Section III). We further compare the MSE of this estimate-first compress-afterwards approach with suboptimal compress-first estimate-afterwards alternatives, including the scheme in [16]. With coupled (i.e., correlated) sensor data, optimal distributed estimation has been shown to be NP-hard when reduced-dimensionality sensor data are concatenated at the FC [9]. Interestingly, we establish that when the same data are superimposed at the FC, the CCA-based approach can provide closed-m solutions with low-order data reduction (Section IV-A). But since lower MSE estimates result when concatenating (rather than superimposing) compressed WSN X/$ IEEE

2 SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4285 Fig. 1. Distributed setup estimating a rom vector signal s. data, we also develop a coordinate descent iterative estimator which always converges to a stationary point (Section IV-B). This distributed estimator subsumes a recent distributed reconstruction algorithm derived Gaussian sources in [5], shows as a by-product that [5] applies also to non-gaussian signal reconstruction. Through numerical examples, we test compare these schemes in Section IV-C. In nonideal channel links, we also derive closed-m MSE optimal estimators decoupled sensor data (Section V-A) establish that fading channels additive noise at the FC do not affect optimality of the estimate-first compress-afterwards approach. For correlated sensor observations, we further develop a coordinate descent distributed estimation algorithm with guaranteed convergence at least to a stationary point (Section V-B). Our findings in Sections V-A V-B are corroborated by numerical examples (Section V-C). We conclude this paper in Section VI. II. PROBLEM STATEMENT Consider the WSN depicted in Fig. 1, comprising sensors linked with an FC. Each sensor, say the th one, observes an vector that is correlated with a rom signal of interest. Similar to [5] [14] [16], we assume the following: a1) no inmation is exchanged among sensors their channel links with the FC are ideal; a2) data the signal are zero-mean with full rank auto- cross-covariance matrices,, all of which are available at the FC. For a1) to hold, one needs sufficiently powerful error control codes based on which, without loss of generality (w.l.o.g.), Shannon s separation principle allows one asymptotically (in the code length) to isolate source from channel coding. This means that errors arising due to nonideal links will be mitigated via error control coding. With finite-length codes or without relying on error control coding altogether, we will pursue a separate treatment nonideal channels in Section V. The zeromean assumption in a2) does not sacrifice generality either but is made simplicity in exposition. A priori knowledge of the covariances in a2) can come either from specific data models, or, after sample estimation during a training phase. Notice that unlike [5], [9], [16], we neither confine ourselves to a linear signal-plus-noise model, nor do we invoke any assumption on the distribution (e.g., Gaussianity) of. Through a fat matrix, each sensor transmits a compressed vector, based on which the FC ms (through a matrix ) a linear estimate of. The entries of each vector are transmitted using, e.g., multicarrier modulation with one entry riding per subcarrier. Low-power bwidth constraints at the sensors encourage transmissions with, while linearity in compression estimation are well motivated by low-complexity requirements. Notice that optimal nonlinear estimation compression require knowledge of the sensor data pdf which is typically not available when non-gaussianity /or nonlinearity is present in the data. However, linear compression estimation schemes rely only on second-order statistics (auto- cross-covariance matrices) that can be readily estimated through sample averaging. As far as multiple access of compressed vectors of different sensors, we consider two scenarios: S1) In the first scenario, sensors m reduced dimensionality vectors of identical size, which are superimposed coherently at the FC as S2) In the second scenario, sensors transmit over orthogonal channels so that the FC separates concatenates compressed vectors of individual sensors to m the vector where, denotes a block diagonal matrix, is the vector of concatenated sensor observations ( sts transposition). For both scenarios, our problem is to obtain under a1) a2) MSE optimal matrices ; i.e., we seek where is given by either (1) or (2), depending on the operational scenario. While solving (3) is our first goal with ideal sensor-fc links, later in Section V we will solve (3) when is affected by fading additive noise at the FC. With either ideal or nonideal links, the FC will compute communicate to the sensors them to m. This communication takes place during the start-up phase or whenever the data (cross-)correlations change. The optimal matrix is used by the FC to m the estimate. III. DECOUPLED MULTISENSOR ESTIMATION Let us examine first the case,, which shows up, e.g., when matrices in the linear model are mutually uncorrelated (or orthogonal) also uncorrelated with the noise vectors. Using transmission scenario S2), our multisensor optimization task in (3) reduces to a set of (1) (2) (3)

3 4286 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007 decoupled problems. Specifically, we show in Section A of the Appendix that the cost function in (3) becomes where represents the columns of matrix with indexes through,. Notice, that each non-negative summ in (4), say the th one, is a function of only can be viewed as the MSE estimating based on single-sensor data, compressed via, communicated to the FC linear estimation using. It follows that solving (3) in this decoupled case, it suffices to solve times. As we will see, (5) can be solved in closed m. The optimal solution will be instrumental not only this section s decoupled setup but also the coupled one, where. Focusing on the minimization problem of (5), let us drop subscripts brevity, write,, with sizes,,, respectively. For future use, we will denote with the orthonormal matrix med by the eigenvectors corresponding to the largest eigenvalues of ; likewise corresponding to those of. In such a setup, one might be tempted to carry out (5) in the following two steps: s1) compress first at the sensor using principal component analysis (PCA) implemented with the Karhunen Loève transm (KLT) as ; reconstruct it at the FC as ; s2) find at the FC the LMMSE estimator of based on, or, equivalently on as. The aggregate MSE of this two-step approach, which we henceth term compress-estimate (CE), is given by where is the well-known LMMSE without compression, which corresponds to [7, p. 389]; is a diagonal matrix containing the smallest eigenvalues of, contains the corresponding eigenvectors. Interestingly, even though each of the CE steps s1) s2) is MSE-optimal, is not always the minimum possible. In fact, the optimal solution per sensor becomes available if we view (5) as a CCA problem which we obtain readily from [4, p. 368] Theorem 1. (4) (5) (6) Theorem 1: For, the optimal matrices at the sensor at the FC that minimize, under a1) a2), are (7) where is any invertible matrix. The minimum MSE (MMSE) is given by where are the smallest eigenvalues of the matrix. The solution in (7) is clearly not unique. To render it unique, we can set equal to the identity matrix. In this case, it follows by inspection that the last two matrix factors of in (7) comprise the LMMSE operator estimating based on ; while the first one,, implements KLT compression. Along with, the matrix applies PCA on the LMMSE estimate med at the sensor. In other words, CCA offers the optimal solution of (5), which can be viewed also as an estimate-first compress-afterwards process at the sensor, followed by PCA reconstruction at the FC. Naturally abbreviating this optimal two-step process as estimate-compress 1 (EC), contrasting it with the CE one, we deduce the following. Corollary 1: For, a class of optimal solutions of (5) under a1) a2) is provided by (7). With, a unique optimum is attained by an EC scheme with corresponding MMSE. Clearly, there is no dimensionality reduction in the EC scheme when, the CCA returns the LMMSE estimate at the FC with MMSE given by. The fact that CE perms worse than EC is intuitively reasonable since CE compresses, taking into account only the covariance matrix without suppressing sensor noise, which compromises MSE permance of the estimation step at the FC. On the contrary, in EC, the LMMSE first extracts from all the inmation pertinent to estimating then perms compression. In that way, EC suppresses a significant part of the noise present in the sensor data. Notice that when, CCA boils down to PCA, theree, CCA-based estimation subsumes as a special case PCA-based reconstruction, which was dealt with in [5]. Also, if, then. This suggests that we should never use, since the MMSE is, can never decrease below, which is achieved with uncompressed data. Remark 1 (Comparison With [16]): Assuming a tiori an LMMSE estimate at the FC relying on a different cost function during the compression step, [16] derived a compress-estimate scheme that we will henceth denote as. Instead of used by CE, the optimum compression matrix 1 The term compress throughout this paper refers to dimensionality reduction not quantization which outputs bits digital transmission. We prefer estimate-compress over e.g., estimate-reduce since dimensionality reduction constitutes the first module of practical quantizers anyway, even though alone reduction lends itself to analog transmission. Quantization with a finite bit rate is not considered here, but pertinent MSE distortion-rate analysis distributed estimation can be found in [13]. (8)

4 SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4287 Fig. 2. (a) MMSE versus k comparison of EC, CE, C E estimators s colored (L = 1); (b) MMSE versus k comparison of Algorithm 1 against alternative schemes distributed estimation (L =2). in, is med by the eigenvectors corresponding to the smallest eigenvalues of. Estimation in is the same as in CE. Since we have already established the joint MSE-optimality of EC in Corollary 1, albeit MSE-optimal in the estimation step, this scheme is overall suboptimal. How compares with CE EC will be tested next. Test Case 1 (Linear Model EC/CE Comparison With [16]): To corroborate the suboptimality of CE relative to EC compare both with, we considered the linear model, where denotes an deterministic matrix, is white noise (i.e., ) uncorrelated with, we select,,, matrix, where the entries of are drawn romly from a stardized normal. Furthermore, the signal-to-noise ratio (SNR) associated with the linear model is set to SNR. Fig. 2(a) shows that,wehave, while the CE reaches the lowest bound attainable by uncompressed observations. Fig. 2(a) demonstrates also that the scheme is suboptimal with,. Finally, we observe that when, the MMSE of exceeds that of the CE approach; i.e.,. Even though the MMSE gap between CE EC can be larger non-gaussian nonlinear models, it is interesting to note that, under special conditions, CE coincides with the optimal EC, in certain cases, it can even reach the lowest bound. To this end, we prove the following in Sections B C of the Appendix. Corollary 2: If a1) a2) hold true, with white uncorrelated with, then the following is found: 2A) also white, CE is MSE optimal, i.e.,, ; 2B) even with nonwhite, CE is also MSE optimal, i.e.,,. Corollary 2A) provides a setup where CE coincides with the optimal EC. The implication of Corollary 2B) is that one should not pursue dimensionality reduction of order, since can be attained by projecting onto a subspace with lower dimension equal to [see also Fig. 2(a)]. Remark 2 (Comparison With [14] [15]): Two MSE-optimal distributed estimators related to our CCA-based solution in (7) have been reported, originally in [15] recently in [14]. Using the LMMSE matrix estimator, both express matrix at the FC in terms of minimize the resultant MSE,, to determine the corresponding optimum compression matrices. The nonuniqueness of MSE-optimal compression matrices in the single-sensor case (mentioned also in [15, Lemma 3.2]) is manifested in the matrix of Theorem 1 explains why the solutions in [14], [15] are different from in (7). Besides this nonuniqueness, the novel application of CCA to distributed estimation in Theorem 1 reveals clearly what is unique in this context, namely the product. The CCA-based framework provides also valuable interpretations of the optimal solution at the local level (EC per sensor) central unit (reconstruction at the FC). Furthermore, having the EC scheme as a reference it is possible to appreciate the suboptimality of CE schemes such as the one in [16]. Finally, as we will see in Section IV-A, Theorem 1 suggests readily a closed-m solution even in the nondecoupled distributed estimation problem where data correlated across sensors are coherently superimposed at the FC a setup not considered in [14] [15]. IV. JOINT MULTISENSOR DISTRIBUTED ESTIMATION Here, we suppose that at least one pair with derive two distributed estimators [corresponding to scenarios S1) S2)] with complementary strengths. The estimator S1) will be simpler to compute, as it will turn out in closed m; but the one S2) will afd a lower MSE since identical reduced-dimensionality order per sensor, it will lead to an estimator based on data from a compressed subspace of higher dimensionality.

5 4288 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007 A. Scenario 1: Superimposing Compressed Observations Recall that under S1), the FC receives the coherent combination of all sensors compressed vectors of common dimension,. Upon substituting (1) into (3), we have the estimate of at the FC whose auto- cross-covariances are given by (12) (13) (14) But the last equality shows that (3) under S1) can be optimized using the CCA approach of Theorem 1. Hence, we have established Theorem 2. Theorem 2: Under S1), if a1) a2) are satisfied, the optimal (in the sense of (3)) matrix at the FC is given by (7), while the optimal compressing matrix the th sensor is med by the columns with indexes through of the matrix in (7). The corresponding MMSE is also given by (8). Theorem 2 asserts that the closed-m solution of Theorem 1 with uncorrelated data across sensors carries over to the correlated case provided that compressed sensor data are superimposed as in S1) can be interpreted as a joint EC scheme. Furthermore, properties MSE permance of CE alternatives we discussed in Section III apply to this setup as well. For example, when ( thus ), the resultant LMMSE estimator boils down to a joint PCA-based reconstruction approach. B. Scenario 2: Concatenating Compressed Observations The motivation behind scenario S2) is threefold: i) it allows the compression order to be sensor dependent; ii) it relaxes the synchronization requirements of S1;, more important, iii) through concatenation, it provides the FC with a subspace of compressed data having the maximum possible dimension, based on which the signal estimator can attain a lower MSE than that obtained under S1), where. For the -sensor setup under S2), we can explicitly write the MSE cost as [c.f. (2) (3)] where is the submatrix of at the FC, is the compressing matrix at sensor. Since optimizing (9) over these matrices does not lead to a closed-m solution (in general, it incurs complexity that grows exponentially with [9]), we will explore means of reducing the number of optimization variables. This reduction will lead us to a coordinate descent algorithm minimizing (9). To this end, let us define,,, ; the auxiliary vectors (9) (10) (11) Using these definitions supposing that in Section D of the Appendix that is given, we prove (15) There are two points worth stressing about in (15): i) it is in the CCA m solvable by Theorem 1; ii) vectors as well as their covariance matrices depend only on. These observations establish the following result: Theorem 3: Under S2), if a1) a2) are satisfied,, then a given the optimal matrices minimizing are (16) (17) (18) where are the eigenvectors corresponding to the largest eigenvalues of the matrix. The MMSE is also given by (19) If the given is the optimal one, then Theorem 3 will return the globally optimum solution of (9) under scenario S2). Otherwise, we can determine appropriate matrices, which achieve a stationary point of (9), via the following alternating algorithm that stems directly from Theorem 3. Algorithm 1: Initialize romly the matrices end Given determine via (16). Given, determine via (17) (18). If MSE MSE a prescribed tolerance, then stop. end,

6 SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4289 Notice, that MSE is a nonincreasing function of, since Algorithm 1 belongs to the family of coordinate descent iterative schemes, where at every step during the th iteration it yields the global minimum of (9), given. The latter ensures that Algorithm 1 will converge at least to a stationary point of the cost in (9) [1, p. 273]. It is important to emphasize that Theorem 3 as well as the resultant iterative algorithm determining the matrices required at the sensors the FC apply to generally nonlinear data models regardless of the underlying distribution of. This is to be contrasted with [16] [9], which adopt a linear data model. Notice, that Remark 2 also applies to the relationship of Algorithm 1 with its counterpart in [15]. Furthermore, since PCA is a special case of CCA, the novelty of our result relative to the distributed reconstruction algorithm of [5] is twofold: i) it applies to general distributed estimation problems ii) even reconstruction, there is no need to assume that are jointly Gaussian. As we will also confirm by numerical examples, the estimator using Algorithm 1 in S2) leads to a lower MMSE than the one in S1). The reason is that the estimation scheme in S1) relies on superimposed compressed data which span a -dimensional subspace, whereas the one in S2) uses concatenated compressed data lying in a -dimensional subspace. C. Numerical Results Comparisons We now present numerical results to compare (in terms of MMSE) the estimators of this section against existing alternatives. Test Case 2 (Comparing S1, S2, [16]): Here, we use the linear data model with full rank matrix,, chosen uncorrelated white. Fig. 2(b) depicts the MMSE of the estimators in S1), S2), [16] with parameters,, SNR sensors. For S1), both the optimal EC as well as CE are tested with, whereas the matrices needed the distributed estimator in S2) are obtained using Algorithm 1 with tolerance choosing those that achieve the smallest MSE, among all the possible pairs which satisfy the equality. It is evident that the estimator based on Algorithm 1 yields smaller MMSE than the EC CE schemes in S1), both of which outperm the estimator of [16]. We also test the S1) S2) estimators with sensors parameters,,. Fig. 3(a) confirms that S2) outperms S1) only when, the S1) estimator catches up with Algorithm 1 in S2). However, the price paid is an increase in the amount of transmitted inmation at least by a factor of 3.5. Test Case 3 [Benchmarking S2) S1)]: Using the setup of test case 2, we check how close Algorithm 1 perms relative to fundamental MSE limits. We plot in Fig. 3(b) its MMSE along with the nonachievable lower bound on MSE obtained when the entire vector is available at a single sensor. We observe that Algorithm 1 comes surprisingly close to this centralized setup, where we have applied the single-sensor EC CE schemes. The same figure also illustrates that the S2) estimator outperms the decoupled one in Section III, where we ignore sensor correlations determine the MSE optimal matrices Fig. 3. MMSE versus k with L =5 sensors: (a) comparison of Algorithm 1 S2 against EC S1; (b) comparison of Algorithm 1 S2 with upper lower MMSE bounds distributed estimation. as in (5). by solving independent minimization problems V. CHANNEL AWARE DISTRIBUTED ESTIMATION Similar to [5] [14] [16], we have so far dealt with ideal sensor FC links. In this section, we consider nonideal channels which include multiplicative fading additive noise effects. Motivation is provided by applications where error control coding is not a choice, or, its length is insufficient a1) to be satisfied. For such cases, we replace a1) with the following: a1 ) Each sensor-fc link comprises a full rank fading multiplicative channel matrix along with zero-mean additive FC noise, which is uncorrelated with,, across channels; i.e., noise covariance matrices satisfy. Matrices are available at the FC. In multicarrier links we alluded to in Section II, full rank of the channel matrices is ensured if sensors do not

7 4290 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007 transmit over subcarriers with zero channel gain. In either multicarrier or single-carrier links, the (not necessarily diagonal) matrices can be acquired via training,, likewise the noise covariances, they can be estimated via sample averaging as usual. With multicarrier ( generally any orthogonal) sensor access, noise uncorrelatedness across channels is also well justified. Because S2) can reach a lower MMSE than S1), we will henceth focus on S2), where the FC concatenates to m the vector [c.f. (2)] (20) A distinct challenging feature with nonideal links is the need a power constraint per sensor. Indeed, any finite channel norm noise variance, if transmit power were not constrained, we could always ensure that the ideal link assumption a1) is satisfied by scaling the optimal with an arbitrarily large factor correspondingly multiplying by,as per Theorem 1. In other words, when a1 ) is in effect, we deal with the following constrained optimization problem [c.f. (3)]: s.t. (21) where denotes the maximum power that sensor can afd. A. Decoupled Observations Similar to Section IV, we consider first estimation with decoupled (uncorrelated) sensor data, where,. As bee, this renders our joint -sensor optimization equivalent to single-sensor problems. Indeed, under scenario S2), the cost in (21) can be written as (22) where is the submatrix of. As the th non-negative summ again depends only on, the MSE optimal matrices are s.t. (23) Since the cost in (23) corresponds to a single-sensor setup, we will drop the subscript notational brevity write,,,,. With this simplification, the Lagrangian the minimization in (23) can be written as ; or, after exping,as where is the corresponding Lagrange multiplier. Next, we derive a simplified m of (24), the minimization of which will provide closed-m solutions the MSE optimal matrices. Aiming at this simplification, consider the SVD, the eigendecompositions, where. Notice, that captures the SNR of the th entry in the received signal vector at the FC. Further, define with, with corresponding eigendecomposition, where. Moreover, let denote the invertible matrix which simultaneously diagonalizes the matrices thus satisfies (25) Because (,,,,,,, ) are all invertible matrices, every matrix (or ), we can clearly find a unique matrix (correspondingly ) that satisfies (26) where have sizes, respectively. Using (26), we show in Section E of the Appendix that the Lagrangian in (24) simplifies to (27) Differentiating with respect to, setting the result to zero, we obtain, respectively (28) (29) Based on (28) (29), we prove in Section F of the Appendix the following property. Property 1: If contains distinct diagonal entries, then are diagonal. Substituting from (28) into (27) using Property 1, the Lagrangian becomes a function of as follows: (24) (30)

8 SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4291 Applying the well known Karush Kuhn Tucker (KKT) conditions (see, e.g., [3, Ch. 5]) that must be satisfied at the minimum of (30), we establish Property 2 in Section G of the Appendix. Property 2: The optimal matrix minimizing (30) contains at most one nonzero entry in each row column. Furthermore, the nonzero entries of are located within its first columns. A by-product in the proof of Property 2 is that when, the MMSE remains invariant; thus, it suffices to consider. Restricting now w.l.o.g. the search the optimal within the class of matrices satisfying Property 2 rewriting as, where are permutation matrices is a diagonal matrix, we can remulate (30) as (31) where is the index of the entry equal to unity in the th column of. Similarly, is the index of the unity entry in the th row of. Upon using the Lagrangian in (31), we end up with a simpler minimization problem, as follows: are:, subject to, (35) where the th diagonal entry is provided by (33), the corresponding Lagrange multiplier is specified by (34). The MMSE is given by [c.f. (31)] (36) According to Theorem 4, the optimal weight matrix in distributes the given power across the entries of the prewhitened vector at the sensor in a waterfilling-like manner so as to balance channel strength additive noise variance at the fusion center with the degree of dimensionality reduction that can be afded. (As, it is possible that less than the prescribed entries are transmitted to minimize the MSE cost.) It is also worth mentioning that (33) dictates a minimum power per sensor. Indeed, in order to ensure that, we must have, which implies that the power must satisfy s.t. (32) Using once more the KKT conditions in (32), we show in Section H of the Appendix that without affecting optimality of we can have ; hence, a diagonal is optimal with diagonal entries (33) where is the maximum integer in which are strictly positive, or, ; is found after plugging (33) into the power constraint to obtain (34) Summarizing, we have established the following main result nonideal links. Theorem 4: Under a1 ), a2),, the matrices minimizing (37) Besides distributed estimation with reduced-dimensionality decoupled observations, Theorem 4 is valuable all cases CCA ( thus PCA) applies in the presence of multiplicative fading additive noise. Although we arrived at (33) (36) based on Property 1, which requires to be distinct, Theorem 4 holds even with repeated eigenvalues since the diagonal entries of are continuous functions of, thus the optimal solution has to be the same any matrix. Another interesting issue is whether the optimal matrices in (35) can be viewed as implementing an stimate-compress scheme which, by analogy to the ideal setup, could enjoy the optimality asserted by Theorem 4 in the presence of noise fading. It will turn out that such a scheme, which we will abbreviate as EC-n since it pertains to the nonideal case, is possible as with the EC following from Theorem 1, it operates in two steps: s1) m at the sensor the LMMSE estimate of given, as ; s2) determine the optimal compressing estimation matrices via Theorem 4 after replacing with. Formally, we show in the Appendix the following. Corollary 3: For, the matrix in (35) can be written as, where is the optimal matrix obtained by Theorem 4 when. Thus, the EC scheme in the presence of noise fading is MSE optimal in the sense of minimizing (21),. It is worth stressing at this point that besides CCA, Theorem 4 with solves the PCA-based distributed reconstruction

9 4292 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007 problem in the presence of multiplicative additive noise. Beyond signal processing communications where such noise effects are introduced by the channel, this is very important in a number of other areas (e.g., pattern recognition, controls statistics) whenever the available data whose dimensionality is to be reduced are imperfect. Remark 4: In contrast to Section III, it is not clear whether in the presence of fading noise the MMSE in (36) decreases as increases, given a limited power budget per sensor. To assess how the MSE behaves with increasing under a fixed power, let, be the optimal compression matrices found as in Theorem 4.In order to have a fair comparison between the matrices should be determined the same system setup, which implies that the corresponding MSE costs must have common,. Under this condition, it turns out that in the presence of noise fading, the MMSE in (36) is still nonincreasing with. Specifically, we prove in Section J of the Appendix the following. Corollary 4: If are the optimal matrices determined by Theorem 4 with, under the same channel parameters, common power, the MMSE in (36) is a nonincreasing function of, i.e.,. B. Coupled Observations In this section, we still consider sensors transmitting as in the S2) scenario over nonideal links, but we allow their observations to be correlated. Because is no longer block diagonal, decoupling of the multisensor optimization problem cannot be effected in this case. The pertinent MSE cost can be written as [c.f. (21)] (38) As with ideal links, a closed-m solution minimizing (38) subject to a power constraint per sensor does not seem possible; see also [9]. For this reason, we will resort to iterative alternatives that converge to at least a stationary point of the cost in (38). To this end, let us suppose temporarily that matrices are fixed satisfy the power constraints,. Upon defining the vector, the cost in (38) becomes (39) which being a function of only, falls under the realm of Theorem 4. This means that when are given, the matrices minimizing (39) under the power constraint can be directly obtained from (35), after setting, in Theorem 4. The corresponding auto- cross-covariance matrices needed are (40) (41) We have thus established the following result coupled sensor observations. Theorem 5: Under S2), if a1 ) a2) are satisfied,, then given matrices satisfying, the optimal matrices minimizing are provided by Theorem 4, after setting, applying the covariance modifications in (40) (41). As with ideal links, Theorem 5 suggests the following iterative estimator distributed estimation in the presence of fading FC noise. Algorithm 2: Initialize romly the matrices, such that. Given end, determine using Theorem 5. end If MSE MSE a prescribed tolerance, then stop. Algorithm 2 is a block coordinate descent algorithm, which, thanks to Theorem 4, ensures that the MSE cost per iteration is nonincreasing thus convergence is guaranteed at least to a stationary point of (38). Beyond its applicability to possibly non-gaussian nonlinear model settings, it is the only available algorithm hling fading generally colored noise effects in distributed estimation ( general CCA) problems with reduced-dimensionality observations. C. Numerical Results Comparisons Here, we first evaluate numerically the EC-n scheme in the decoupled case compare it with the Algorithm 2 we derived correlated sensor data. As bee, we will test MMSE permance versus, but also versus the ratio of transmit power

10 SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4293 Fig. 4. MMSE versus k comparison setting P= = 7 db among: EC-n, CE-n, C E-d EC-d L =1sensor with uncorrelated data generated from a linear data model. per sensor over noise power at the FC. This ratio, which we define as SNR,, captures the sensor FC channel noise effect each power-limited sensor should not be confused with the SNR we used in conjunction with the linear data model when the sensor FC channels were assumed ideal. The benchmark both correlated decoupled cases will be, which nonideal links is achievable only when SNR is very large, since is the MMSE attained with uncompressed sensor data transmitted over ideal links. To assess the difference in hling noise effects, we will also compare the EC-n Algorithm 2 estimators against those we derived in Sections III IV as well as those in [16], after properly modifying the latter to account the multiplicative channels. With denoting the matrices of schemes derived under a1), the needed modifications amount to replacing with with, where the scale ensures fair comparison with identical power constraints between estimators developed under (a1) this section s schemes derived under. We will denote these modified schemes which account the channel as EC-d, C E-d Algorithm 1-d. Our comparisons will further include an option that we will term CE-n, which relies on Theorem 4 to render the CE scheme we examined under (a1) applicable to the nonideal setup with fading FC noise present. CE-n compresses first observation data at the sensor reconstructs them at the FC, using respectively, which are found as in (35) after replacing with. This PCA step over nonideal links is followed by LMMSE estimation of based on the reconstructed vector. Test Case 4 (EC-n With Uncorrelated Sensor Data): We consider first the decoupled case of Section V-A whose MMSE permance is characterized by the single sensor setup we simulate here. Fig. 4 depicts MMSE versus the benchmark, the EC-n, CE-n, EC-d, C E-d estimators applied to sensor data generated according to the linear model of test case 1 (,, SNR ); i.e.,, with white uncorrelated with. The FC noise covariance is set to. The noise variance the power are selected Fig. 5. MMSE versus SNR comparison k =4among: EC-n, CE-n, C E-d EC-d L =1sensor with uncorrelated data generated from a linear model (a); a nonlinear model (b). such that SNR 7 db, where SNR SNR. The quantity SNR indicates the ratio between. Notice that as grows, SNR decreases but at the same time the dimensionality of the compressed data increases. As expected, benchmarks all curves, while the worst permance is exhibited by C E-d. Albeit suboptimal, CE-n comes close to the optimal EC-n. The monotonic decrease of MMSE with EC-n corroborates Corollary 4. Contrasting it with the increase EC-d exhibits in MMSE beyond a certain,we can appreciate the importance of coping with noise effects. This increase is justifiable since each entry of the compressed data in EC-d is allocated a smaller portion of the given power as grows. In EC-n, however, the quality of channel links the available power determine the number of the compressed components (which might lie in a vector space of dimensionality ), allocate power optimally among them. Fig. 5(a) compares MMSE of the estimators in Fig. 4 as a function of SNR at fixed with colored noise at

11 4294 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007 Fig. 6. MMSE versus k comparison of Algorithm 2 against distributed alternatives common P = = 13 db, i = 1,2,3, across L =3sensors with correlated data generated from a linear model. the FC. The decaying MMSE behavior is reasonable since more power is allocated to compressed entries as SNR increases. The EC-n optimality is evident at low SNR values, but as SNR increases a1) becomes increasingly valid EC-d catches up with EC-n. Fig. 5(b) is the counterpart of Fig. 5(a) the same estimators but now applied to a nonlinear data model, which matrices are generated as in the linear model, but entries of are drawn from a stard normal distribution while making sure that remains positive semidefinite. A noticeable difference in this case is CE-n which, similar to C E-d, exhibits MMSE that does not decay with SNR, due to insufficient power. Test Case 5 (Algorithm 2 With Correlated Sensor Data): Based on a linear model as in test case 4, we generated correlated data vectors of sizes,, sensors. Noise is white with variance. The power are chosen such that SNR 13 db 1, 2, 3, with. Along with the benchmark, Fig. 6 depicts the MMSE as a function of the total number of compressed entries across sensors i) a centralized EC-n setup which a single (virtual) sensor has available the data vectors of all three sensors; ii) the estimator returned by Algorithm 2; iii) the decoupled EC-n estimator which ignores sensor correlations; iv) the Algorithm 1-d; v) the C E-d. [Recall that both iv) v) ignore noise but account fading effects.] Interestingly, our decentralized Algorithm 2 comes very close to the hypothetical single-sensor bound of the centralized EC-n estimator, while outperming the decoupled EC-n one. Also worth noting is that Algorithm 1-d perms close to Algorithm 2 small values of, but as increases it behaves as bad as C E-d. Fig. 7(a) (b) depicts the MMSE behavior as the SNR, common all sensors, is increased in the linear nonlinear cases, respectively. The FC noise is colored. For the linear case in Fig. 7(a), we set. Clearly, Algorithm 2 achieves an MMSE which is close to the lower bound of the centralized estimator as SNR increases, the gap shrinks. Again Algorithm Fig. 7. MMSE versus SNR comparison of Algorithm 2 against distributed alternatives applied to L =3 sensors with correlated data generated from: (a) a linear model with k =6; or, (b) a nonlinear model with k =8. 2 outperms the decoupled scheme, Algorithm 1-d C E-d. Notice, that high C E-d, Algorithm 1-d Algorithm 2 exhibit similar MMSE since the noise becomes negligible. For the nonlinear model, we set. Observe that, in contrast to the decoupled scheme [1], the MMSE corresponding to Algorithm 2 reduces as the SNR grows approaching the lower bound offered by the centralized EC-n scheme. VI. CONCLUSION We derived algorithms estimating stationary rom signals based on reduced-dimensionality observations collected by power-limited wireless sensors linked with a fusion center. We dealt with estimation both when the links are ideal, which requires use of sufficiently powerful channel codes, also when channel noise effects are directly accounted, which necessitates constraining the power per sensor. When data across sensors are uncorrelated, we established global mean-square error optimal schemes in closed-m proved

12 SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4295 that they implement estimation followed by compression per sensor. For correlated sensor observations we developed algorithms that rely on block coordinate descent iterations which are guaranteed to converge at least to a local stationary point of the associate mean-square error cost. The optimal estimators when the links are nonideal allocate properly the prescribed power following a waterfilling-like principle to balance judiciously channel effects additive noise at the fusion center with the degree of dimensionality reduction that can be afded. Besides distributed estimation with reduced dimensionality sensor observations, such closed-m solutions are valuable when principal component canonical correlation analyses are applied to data observed in multiplicative additive noise. For correlated sensor data transmitted over ideal links, we also came up with a closed-m estimator identified its complexity-permance tradeoffs relative to the iterative one. We further delineated pros cons of two possible transmission scenarios the sensors compared corresponding optimal suboptimal estimators with existing (sub)optimal alternatives. Mean-square error permance of our novel estimators was evaluated both analytically with numerical examples. 2 APPENDIX are the smallest eigenvalues of. Because (42) implies that,, it follows that (43) The MMSE the CE scheme is [c.f. (6)]:, where. Since the linear model we have, it follows that (44) thus. Then,,wehave. For, it follows from (43) that. In the CE scheme, can be written as A. Proof of Equation (4) Since with, recalling that,wehave Clearly, we have. Thus,,. Q.E.D. C. Proof of Corollary 2B) Since, it suffices to show that. Consider the SVD, where, is a diagonal matrix, with. We can write B. Proof of Corollary 2A) Let denote the singular value decomposition (SVD) of, where is an matrix with only nonzero diagonal entries. Since, wehave under the linear model Consider now the eigendecomposition (45) where is med by the first columns of. Continuing, we have We consider first the case, where EC scheme is [c.f. (8)]: (42). The MMSE the, where 2 The views conclusions contained in this document are those of the authors should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government. where. From (45) (46), we have. Then, we have, where comprises the last (46)

13 4296 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007 columns of. Then, can be written as D. Proof of Equation (15) Q.E.D. Let us define, rewrite the cost function in (9) as. Supposing temporarily that,, are given, we can express the minimizing as the LMMSE operator estimating based on. Recalling the definitions of detailing their auto- cross-covariances, we find.if contains distinct diagonal elements, then the last equation holds true only if is diagonal, which implies that the matrix must also be diagonal. Substituting the term next to the inverse in the right-h side of (29) using the expression in (28), rearranging terms we arrive at (52) Equation (52) implies that the matrix must also have diagonal structure, since are diagonal. Q.E.D. G. Proof of Property 2 The partial derivative of (30) with regard to,is, (47) Because of (47), there is no need to separately optimize (9) with respect to since the MSE optimal matrix can be obtained once the optimal matrices,, have been found. Our next step is to substitute (47) into (9), re-group terms so that terms containing the product are isolated. These steps lead us to (15). Q.E.D. E. Proof of Equation (27) Using (26), we can write On the other h, takes the m (53), this partial derivative (54) (48) (49) If denotes the associated Lagrange multiplier, the KKT conditions at the global minimum of (30) dictate Substituting (48) (49) into (24), we arrive at (27) after simple manipulations. Q.E.D. F. Proof of Property 1 Multiplying in (28) from the left with from the right with, we obtain (50) Likewise, multiplying in (29) from the left with from the right with, we can write (51) Equating the left-h sides of (50) (51), we obtain. The symmetry of implies that (55) (56) Let us assume that has at least one distinct diagonal entry, say the one. 3 Considering the th row of, if both are nonzero, then (56) to hold 3 If all eigenvalues are identical, then we can perturb 3 by 3 construct the matrix ~ 3 =3 + 3 which has distinct eigenvalues. Because the optimal entries [c.f. Theorem 4] are continuous functions of 3,we have that lim ( ~ 3 )= (3 ).

14 SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4297, we must set the quantity inside parentheses in the r.h.s. of (53) equal to zero. But this implies that Simple inspection reveals that (60) is satisfied when either (57) which is clearly impossible. Thus, the submatrix of constructed by its first columns can have only one nonzero entry per row. Excluding the trivial case with all-zero entries, since must have at least one nonzero entry, say in the throw, we deduce from (56) (54) that in order to ensure,we must have. Thus, the th row of contains only one nonzero entry, say in the th column, where. Focusing now w.l.o.g. on the class of matrices with only one nonzero entry in the th row, we can write the partial derivative in (53) with respect to as (58) (61) The second equation in (61) provides a feasible (non-negative) solution only when. Moreover, there exists a such that, with. Then, (60) is satisfied by both choices in (61). But, (60) is satisfied when. So far, we have identified what m the optimal matrix must have but we have not shown yet if such a matrix minimizes (31) or equivalently (32). For this purpose, we need to check under what conditions the Hessian is positive semidefinite. The Hessian of (31) is a diagonal matrix with diagonal entries, as follows: The fact that implies a couple of things: i) the power constraint has to be active, meaning that ii) the entries must be zero, so that the partial derivative in (54) is zero thus (56) is satisfied. In order to achieve the smallest possible MSE, the nonzero entry in each row must be contained within the first columns of, so that the largest eigenvalues are subtracted in (30). We have established so far that every row of can have at most one nonzero entry. From this property we infer that each column can have at most one nonzero element, otherwise Property 1 is violated. Notice that if, then the rows of with indexes exceeding can only contain nonzero entries within the columns due to the fact that. Thus, there must be at least rows with all their entries equal to zero. These rows should be the last ; otherwise, we would cancel out terms involving the resulting in a higher MSE. Since, we conclude that the MMSE does not decrease, which, in turn, suggests that it is of interest to consider only. Q.E.D. Upon substituting in (62), we obtain which is nonpositive, since. Likewise, upon substituting we obtain (62) H. Proof of Equation (33) The KKT conditions yield, which is strictly positive since the entries are distinct. For, we set the corresponding entry of the Hessian is (59) (60) which is also strictly positive since,. Thus, having verified the positive semidefiniteness of the Hessian, we are assured that the diagonal matrix, with diagonal entries [see (63), shown at the top of the next page] minimizes (31). Recall now from the proof of Property 2 that the power constraint has to be active, i.e.,.

15 4298 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 8, AUGUST 2007 (63) To this end, is chosen to satisfy. Continuing, we wish to specify the that minimize (31). In order to minimize (31), we have to maximize the third term, which is possible by incorporating in it terms involving ; hence,. Since values do affect the minimization of the MSE, we can set w.l.o.g.. Using (63), the third term in (31) can be written as, with given by (33),. Thus, we have, where is an diagonal matrix. We can now verify that (67) where (64) which completes the proof since [c.f. (66)]. Q.E.D. J. Proof of Corollary 4 If satisfies (37), the maximum rank that the optimal compression matrix can have is. Since, we only need to consider. In this case, the matrices have full rank, the resultant MMSE is [c.f. (36)] The first term in (64) is maximized by selecting, w.l.o.g. we can set. The second term in (64) is minimized after setting because, it holds that (65) any permutation of the indexes. Now w.l.o.g., one can take without affecting the MSE. Theree, the MSE is minimized by setting, setting as in (63). [Recall also that.] I. Proof of Corollary 3 We have from Theorem 4 that. Now, the LMMSE estimation matrix can be written as, where. Using the SVD, we can write (66) where is the covariance matrix of, whose eigenvalues are. With denoting the th diagonal entry of, we infer from (66) that,. It follows then easily that, where is a diagonal matrix whose th diagonal entry is To prove that (68)] (68), it suffices to show that [c.f. (69) The validity of (69) follows readily from the Cauchy Schwartz inequality (see, e.g., [6, p. 53]). REFERENCES [1] D. P. Bertsekas, Nonlinear Programming, 2nd ed. Belmont, MA: Athena Scientific, [2] D. Blatt A. Hero, Distributed maximum likelihood estimation sensor networks, in Proc. Int. Conf. Acoustics., Speech, Signal Processing (ICASSP), Montreal, QC, Canada, May 2004, vol. 3, pp [3] S. Boyd L. Venberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, [4] D. R. Brillinger, Time Series: Data Analysis Theory, 2nd ed. San Francisco, CA: Holden-Day, [5] M. Gastpar, P. L. Draggoti, M. Vetterli, The distributed Karhunen Loève transm, IEEE Trans. Inf. Theory, vol. 52, no. 12, pp , Dec [6] G. H. Golub C. F. Van Loan, Matrix Computations, 3rd ed. Baltimore, MD: The John Hopkins Univ. Press, [7] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Englewood Cliffs, NJ: Prentice-Hall, 1993.

SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4299 [8] Z.-Q. Luo, Universal decentralized estimation in a bwidth constrained sensor network, IEEE Trans. Inf.

Inmation Theory, Adelaide, Australia, Sep. 4 9, 2005, pp. 1441 1445. [10] V. Megalooikonomou Y.

Kusuma, K. Ramchran, Distributed compression in a dense microsensor network, IEEE Signal Process. Mag., vol. 19, pp. 51 60, May 2002. [12] A. Ribeiro G. B.

16 SCHIZAS et al.: DISTRIBUTED ESTIMATION USING REDUCED-DIMENSIONALITY SENSOR OBSERVATIONS 4299 [8] Z.-Q. Luo, Universal decentralized estimation in a bwidth constrained sensor network, IEEE Trans. Inf. Theory, vol. 51, pp , Jun [9] Z.-Q. Luo, G. B. Giannakis, S. Zhang, Optimal linear decentralized estimation in a bwidth constrained sensor network, in Proc. Int. Symp. Inmation Theory, Adelaide, Australia, Sep. 4 9, 2005, pp [10] V. Megalooikonomou Y. Yesha, Quantizer design distributed estimation with communication constraints unknown observation statistics, IEEE Trans. Commun., vol. 48, no. 2, pp , Feb [11] S. S. Pradhan, J. Kusuma, K. Ramchran, Distributed compression in a dense microsensor network, IEEE Signal Process. Mag., vol. 19, pp , May [12] A. Ribeiro G. B. Giannakis, Bwidth-constrained distributed estimation wireless sensor networks, Part II: Unkown PDF, IEEE Trans. Signal Process., vol. 54, no. 7, pp , Jul [13] I. D. Schizas, G. B. Giannakis, N. Jindal, Distortion-rate analysis distributed estimation with wireless sensor networks, presented at the Proc. 43th Allerton Conf. Communication, Control, Computing, Monticello, IL, Sep. Oct [14] E. Song, Y. Zhu, J. Zhou, Sensors optimal dimensionality compression matrix in estimation fusion, Automatica, vol. 41, pp , Nov [15] K. Zhang, X. R. Li, P. Zhang, H. Li, Optimal linear estimation fusion Part VI: Sensor data compression, in Proc. Int. Conf. Inmation Fusion, Queensl, Australia, 2003, vol. 23, pp [16] Y. Zhu, E. Song, J. Zhou, Z. You, Optimal dimensionality reduction of sensor data in multisensor estimation fusion, IEEE Trans. Signal Process., vol. 53, no. 5, pp , May Ioannis D. Schizas (S 07) received the Diploma (with Honors) degree in computer engineering inmatics from the University of Patras, Patras, Greece, in Since August 2004, he has been working towards the Ph.D. degree at the Department of Electrical Computer Engineering, University of Minnesota, Minneapolis. His research interests lie in the areas of communication theory, signal processing, networking. His current research focuses on distributed signal processing with wireless sensor networks distributed compression source coding. Georgios B. Giannakis (F 97) received the Diploma degree in electrical engineering from the National Technical University of Athens, Greece, 1981 the M.Sc. degree in electrical engineering, the M.Sc. degree in mathematics, the Ph.D. degree in electrical engineering from the University of Southern Calinia (USC), Los Angeles, in 1983, 1986, 1986, respectively. Since 1999, he has been a Professor with the Electrical Computer Engineering Department of the University of Minnesota, Minneapolis, where he now holds an ADC Chair in Wireless Telecommunications. His general interests span the areas of communications, networking, statistical signal processing subjects on which he has published more than 250 journal papers, 450 conference papers, two edited books, two upcoming research monographs on Space Time Coding Broadb Wireless Communications (Wiley, 2006) Ultra-Wideb Wireless Communications (Cambridge Press, 2007). Current research focuses on diversity techniques, complex-field space time coding, multicarrier, cooperative wireless communications, cognitive radios, cross-layer designs, mobile ad hoc networks, wireless sensor networks. Dr. Giannakis is the (co)recipient of six paper awards from the IEEE Signal Processing (SP) Communications societies, including the G. Marconi Prize Paper Award in Wireless Communications. He also received Technical Achievement Awards from the SP Society (2000), from EURASIP (2005), a Young Faculty Teaching Award, the G. W. Taylor Award Distinguished Research from the University of Minnesota. He has served the IEEE in a number of posts. Zhi-Quan Luo (F 08) received the B.Sc. degree in mathematics from Peking University, China, in 1984 the Ph.D. degree in operations research from the Department of Electrical Engineering Computer Science, Massachusetts Institute of Technology, Cambridge, in During the academic year , he was with Nankai Institute of Mathematics, Tianjin, China. In 1989, he joined the Department of Electrical Computer Engineering, McMaster University, Hamilton, ON, Canada, where he became a Professor in 1998, the department head in 2001, held the Canada Research Chair in Inmation Processing. Since April 2003, he has been a Professor with the Department of Electrical Computer Engineering holds an endowed ADC research Chair in Wireless Telecommunications with the Digital Technology Center at the University of Minnesota, Minneapolis. His research interests lie in the union of large-scale optimization, data communications, signal processing. Prof. Luo serves on the IEEE Signal Processing Society Technical Committees on Signal Processing Theory Methods (SPTM) on the Signal Processing Communications (SPCOM). He is a corecipient of the 2004 IEEE Signal Processing Society s Best Paper Award, has held editorial positions several international journals, including the SIAM Journal on Optimization, Mathematics of Computation, Mathematics of Operations Research, the IEEE TRANSACTIONS ON SIGNAL PROCESSING.

5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER /$ IEEE

5682 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 12, DECEMBER 2009 Hyperplane-Based Vector Quantization for Distributed Estimation in Wireless Sensor Networks Jun Fang, Member, IEEE, and Hongbin