Information Bounds and Quickest Change Detection in Decentralized Decision Systems


Yajun Mei

Abstract — The quickest change detection problem is studied in decentralized decision systems, where a set of sensors receive independent observations and send summary messages to the fusion center, which makes a final decision. In the system where the sensors do not have access to their past observations, the previously conjectured asymptotic optimality of a procedure with a Monotone Likelihood Ratio Quantizer (MLRQ) is proved. In the case of additive Gaussian sensor noise, if the signal-to-noise ratios (SNR) at some sensors are sufficiently high, this procedure can perform as well as the optimal centralized procedure that has access to all the sensor observations. Even if all SNRs are low, its detection delay will be at most $\pi/2 - 1 \approx 57\%$ larger than that of the optimal centralized procedure. Next, in the system where the sensors have full access to their past observations, the first asymptotically optimal procedure in the literature is developed. Surprisingly, the procedure has the same asymptotic performance as the optimal centralized procedure, although it may perform poorly in some practical situations because of slow asymptotic convergence. Finally, it is shown that neither past message information nor feedback from the fusion center improves the asymptotic performance in the simplest model.

Index Terms — Asymptotic optimality, CUSUM, multi-sensor, quantization, sensor networks, sequential detection.

I. INTRODUCTION

The problem of quickest change detection has a variety of applications, including industrial quality control, reliability, fault detection, and signal detection. The classical, or centralized, version of this problem, where all observations are available at a single central location, is a well-developed area; see, e.g., [1], [7], [17].
Recently this problem has been applied in decentralized, or distributed, decision systems, which have many important applications, including multisensor data fusion, mobile and wireless communication, surveillance systems, and distributed detection. Fig. 1 illustrates the general setting of decentralized decision systems. In such a system, at time n, each of a set of L sensors $S_l$ receives an observation $X_{l,n}$ and then sends a sensor message $U_{l,n}$ to a central processor, called the fusion center, which makes a final decision when observations are stopped. In order to reduce cost and increase reliability, it is required that the sensor messages belong to a finite alphabet (perhaps binary). This limitation is dictated in practice by the need for data compression and by limitations of communication bandwidth.

In [23] and [25], the authors considered two different scenarios of decentralized decision systems, depending on how local information is used at the sensors. One scenario is the system with limited local memory, where the sensors do not have access to their past observations. This scenario has the following three possible cases, which correspond to Cases A, C, and E in [23] and [25].

Case (i) System with Neither Feedback from the Fusion Center nor Local Memory:

$$U_{l,n} = \phi_{l,n}(X_{l,n}). \quad (1)$$

Manuscript received November 21, 2002; revised November 10, [...]. This work was supported in part by NIH Grant R01 AI [...]. The material in this correspondence was presented in part at the IEEE International Symposium on Information Theory, Chicago, IL, USA, [...]. The author was with the Department of Mathematics, California Institute of Technology, Pasadena, CA, USA. He is now with the Department of Biostatistics, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA (e-mail: ymei@fhcrc.org).
Fig. 1. General setting for decentralized decision systems: each sensor $S_l$ sends messages $U_{l,n}$ to the fusion center (with possible feedback), which makes the final decision.

Case (ii) System with no Feedback and Local Memory Restricted to Past Sensor Messages:

$$U_{l,n} = \phi_{l,n}\big(X_{l,n};\, U_{l,[1,n-1]}\big), \quad (2)$$

where $U_{l,[1,n-1]} = (U_{l,1}, U_{l,2}, \ldots, U_{l,n-1})$.

Case (iii) System with Full Feedback, but Local Memory Restricted to Past Sensor Messages:

$$U_{l,n} = \phi_{l,n}\big(X_{l,n};\, U_{1,[1,n-1]}, U_{2,[1,n-1]}, \ldots, U_{L,[1,n-1]}\big). \quad (3)$$

The other scenario is the system with full local memory, where the sensors have full access to their past observations. There are two possible cases, which correspond to Cases B and D in [23] and [25].

Case (iv) System with no Feedback and Full Local Memory:

$$U_{l,n} = \phi_{l,n}\big(X_{l,[1,n]}\big), \quad (4)$$

where $X_{l,[1,n]} = (X_{l,1}, X_{l,2}, \ldots, X_{l,n})$.

Case (v) System with Full Feedback and Full Local Memory:

$$U_{l,n} = \phi_{l,n}\big(X_{l,[1,n]};\, U_{1,[1,n-1]}, U_{2,[1,n-1]}, \ldots, U_{L,[1,n-1]}\big). \quad (5)$$

In decentralized quickest change detection problems, it is assumed that at some unknown time ν the distributions of the sensor observations $X_{l,n}$ change abruptly and simultaneously at all sensors. The goal is to detect the change as soon as possible, over all possible protocols for generating sensor messages and all possible decision rules at the fusion center, under a restriction on the frequency of false alarms. As in the classical (centralized) quickest change detection problem, there are two standard mathematical formulations. The first is a Bayesian formulation, due to Shiryayev [19], in which the change-point ν is assumed to have a known prior distribution. It is well known [24], [25] that Bayesian formulations prove to be intractable, and dynamic programming arguments cannot be used except in the special case specified in (5), where the Bayesian solution [24] is too complex to implement. The second is a minimax formulation, proposed by Lorden [11], in which the change-point ν is assumed to be unknown but non-random.
Papers [2] and [21] used this approach to study the simplest case, specified in (1), but both have restrictions on the class of sensor message protocols. In this correspondence, we use the second of these formulations to develop an asymptotic theory of decentralized quickest change

detection problems, giving in both scenarios procedures that are asymptotically optimal and easy to implement. It is worth highlighting that our asymptotically optimal procedures do not use past message information, and hence past message information and feedback from the fusion center do not improve the asymptotic performance.

Throughout this correspondence, we make the following standard assumptions:

(A1) The sensor observations are independent over time as well as from sensor to sensor.

(A2) The densities of the sensor observations are either $f_1, \ldots, f_L$ or $g_1, \ldots, g_L$, where the f's and g's are given. For each $1 \le l \le L$, the Kullback-Leibler information number (or relative entropy)

$$I(g_l, f_l) = \int \log\frac{g_l(x)}{f_l(x)}\, g_l(x)\, dx \quad (6)$$

is finite and positive, and

$$\int \Big(\log\frac{g_l(x)}{f_l(x)}\Big)^2 g_l(x)\, dx < \infty. \quad (7)$$

In Section II, we provide a formal mathematical formulation of the problem and introduce some notation. In Section III, under a condition on second moments, we prove that a procedure with a monotone likelihood ratio quantizer (MLRQ) is asymptotically optimal in the system with limited local memory; we also establish sufficient conditions for our theorems to be applied. Section IV develops the asymptotic theory in the system with full local memory and offers asymptotically optimal procedures which are easy to implement. In Section V, we compare these asymptotically optimal decentralized procedures with the optimal centralized procedure that has access to all the sensor observations. Section VI gives simulation results for several illustrative examples. The proofs of all theorems are given in the Appendix.

II. PROBLEM FORMULATION AND NOTATION

Suppose there are L sensors in a system. At time n, an observation $X_{l,n}$ is made at each sensor $S_l$. Assume that at some unknown time ν, the density function of the sensor observations $\{X_{l,n}\}$ changes simultaneously for all $1 \le l \le L$ from $f_l$ to $g_l$. That is, for each $1 \le l \le L$, the observations at sensor $S_l$, namely $X_{l,1}, X_{l,2}, \ldots,$
are independent random variables such that $X_{l,1}, X_{l,2}, \ldots, X_{l,\nu-1}$ are i.i.d. with density $f_l$ and $X_{l,\nu}, X_{l,\nu+1}, \ldots$ are i.i.d. with density $g_l$. Furthermore, it is assumed that the observations are independent from sensor to sensor. Denote by $P_\nu$ and $E_\nu$ the probability measure and expectation when the change occurs at time ν, and by $P_\infty$ and $E_\infty$ the same when there is no change.

Based on the information available at $S_l$ at time n, a message $U_{l,n}$, specified in (1)-(5), is chosen from a finite alphabet and is sent to the fusion center. Without loss of generality, we assume that $U_{l,n}$ takes a value in $\{0, 1, \ldots, D_l - 1\}$. The fusion center uses the stream of messages from the sensors as inputs to decide whether or not a change has occurred. Mathematically, the fusion center decision rule is defined as a stopping time τ with respect to $\{(U_{1,n}, U_{2,n}, \ldots, U_{L,n})\}_{n \ge 1}$. The interpretation of τ is that, when $\tau = n$, we stop taking observations at time n and declare that a change has occurred somewhere in the first n observations.

For each choice of sensor message functions and fusion center decision rule, a reasonable measure of the quickness of detection is the worst-case detection delay defined in Lorden [11],

$$\bar E_1 \tau = \sup_{\nu \ge 1}\, \operatorname{ess\,sup}\, E_\nu\big[(\tau - \nu + 1)^+ \,\big|\, X_{1,[1,\nu-1]}, \ldots, X_{L,[1,\nu-1]}\big].$$

The desire to have small $\bar E_1\tau$ must, of course, be balanced against the need to control the frequency of false alarms. In other words, when no change occurs, τ should be large, hopefully infinite. However, Lorden [11] showed that if $\bar E_1\tau$ is finite, then $E_\infty\tau$ is finite, which implies $P_\infty(\tau < \infty) = 1$. Thus we will have a false alarm with probability 1 when there is no change. An appropriate measure of false alarms, therefore, is $E_\infty\tau$, the mean time until a false alarm. Imagining repeated application of such procedures, practitioners refer to $1/E_\infty\tau$ as the frequency of false alarms and to $E_\infty\tau$ as the mean time between false alarms.
Our problem can then be stated as follows: design the sensor message functions $\phi_{l,n}$ and seek a stopping time τ at the fusion center that minimizes $\bar E_1\tau$ subject to

$$E_\infty \tau \ge \gamma, \quad (8)$$

where γ is a given, fixed lower bound. The worst-case detection delay $\bar E_1\tau$ can be replaced by the average detection delay, proposed by Shiryayev [20] and Pollak [15],

$$\sup_{\nu \ge 1} E_\nu\big(\tau - \nu \mid \tau \ge \nu\big).$$

Although the worst-case detection delay is always greater than the average detection delay, they are asymptotically equivalent, and either one can be used in our theorems.

It is well known [13] that the exactly optimal solutions for this problem in the centralized version are Page's CUSUM procedures, defined by the stopping times

$$T(a) = \inf\{n : W_n \ge a\}, \quad (9)$$

where the CUSUM statistic

$$W_n = \max_{1 \le k \le n} \sum_{i=k}^{n} \sum_{l=1}^{L} \log\frac{g_l(X_{l,i})}{f_l(X_{l,i})}$$

can be calculated recursively as

$$W_n = \max(W_{n-1}, 0) + \sum_{l=1}^{L} \log\frac{g_l(X_{l,n})}{f_l(X_{l,n})} \quad (10)$$

for $n \ge 1$, with $W_0 = 0$. In the literature, $T(a)$ is also often defined as the first n for which $\max(W_n, 0) \ge a$. The two definitions are equivalent if the threshold $a > 0$, but they differ if $a \le 0$; see also [13]. Unfortunately, in decentralized decision systems, it is nearly impossible to find exactly optimal solutions except for some special cases (see [24]), and only asymptotic optimality results seem attainable. In the asymptotic optimality approach, we typically first construct an asymptotic lower bound on $\bar E_1\tau$ as γ goes to $\infty$, and then show that a given class of procedures attains the lower bound asymptotically. We will establish asymptotic optimality theorems for both scenarios of decentralized decision systems: limited local memory, specified in (1)-(3), and full local memory, specified in (4) and (5).

We now introduce some notation. Let D be a positive integer. Consider a random variable Y whose density function is either f or g with respect to some σ-finite measure, and assume that the Kullback-Leibler information number $I(g, f)$ is finite.
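The recursion (10) is what makes $T(a)$ so easy to implement. As an illustrative sketch (the Gaussian example and all names below are ours, not from the correspondence), the centralized CUSUM can be coded in a few lines:

```python
import random

def page_cusum(streams, llr, a):
    """Page's CUSUM: W_n = max(W_{n-1}, 0) + sum_l log g_l(X_{l,n})/f_l(X_{l,n}),
    as in (10); returns T(a) = inf{n : W_n >= a}, as in (9)."""
    w = 0.0
    for n, x_vec in enumerate(streams, start=1):
        w = max(w, 0.0) + sum(llr(x) for x in x_vec)
        if w >= a:
            return n
    return None  # threshold not crossed on this finite sample path

# Example: L = 2 Gaussian sensors with f_l = N(0,1) and g_l = N(1,1), for which
# the per-observation log-likelihood ratio is x*mu - mu^2/2 with mu = 1.
mu = 1.0
llr = lambda x: mu * x - mu * mu / 2.0
random.seed(0)
post_change = [[random.gauss(mu, 1.0) for _ in range(2)] for _ in range(10000)]
delay = page_cusum(post_change, llr, a=5.0)
```

Because the recursion resets at 0, only the current value of $W_n$ needs to be stored, regardless of how long the procedure runs.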
For a deterministic or random measurable function φ from the range of Y to a finite alphabet of size D, say $\{0, 1, \ldots, D-1\}$, denote by $f_\phi$ and $g_\phi$, respectively, the probability mass functions of $\phi(Y)$ when the density of Y is f and g. Let

$$Z_\phi = \log\frac{g_\phi(\phi(Y))}{f_\phi(\phi(Y))},$$

and define

$$I_D(g, f) = \sup_\phi E_g Z_\phi \quad (11)$$

and

$$V_D(g, f) = \sup_\phi E_g Z_\phi^2. \quad (12)$$

It is well known [22] that $I_D(g, f) \le I(g, f)$, i.e., reduction of the data from Y to $\phi(Y)$ cannot increase the information. A more detailed analysis of the relation between $I_D(g, f)$ and $I(g, f)$ is provided in Section V. Tsitsiklis [22] showed that the supremum $I_D(g, f)$ is achieved by a monotone likelihood ratio quantizer (MLRQ) ϕ of the form

$$\varphi(y) = d \quad \text{if and only if} \quad \lambda_d \le \frac{g(y)}{f(y)} < \lambda_{d+1},$$

where $0 = \lambda_0 \le \lambda_1 \le \cdots \le \lambda_{D-1} \le \lambda_D = \infty$ are constants. These optimal MLRQs are not easily calculated, but we follow the standard practice in the literature of developing procedures that assume the sensor messages are constructed optimally at the sensors. Some of our theorems assume that $V_D(g, f) < \infty$; a sufficient condition for finiteness of $V_2(g, f)$ is given in Section III.

Using these notations, define the information numbers

$$I_{\mathbf D} = \sum_{l=1}^{L} I_{D_l}(g_l, f_l), \quad (13)$$

where $\mathbf D = (D_1, D_2, \ldots, D_L)$, and

$$I_{tot} = \sum_{l=1}^{L} I(g_l, f_l). \quad (14)$$

These two information numbers are key to our theorems.

III. LIMITED LOCAL MEMORY

A. Page's CUSUM Procedure with the MLRQ

For the decentralized decision system with limited local memory, specified in (1)-(3), the following procedure $N(a)$ has been studied in the literature. Each sensor $S_l$ uses the optimal MLRQ $\varphi_l$; namely,

$$U_{l,n} = \varphi_l(X_{l,n}) = d \quad \text{if and only if} \quad \lambda_{l,d} \le \frac{g_l(X_{l,n})}{f_l(X_{l,n})} < \lambda_{l,d+1},$$

where $0 = \lambda_{l,0} \le \lambda_{l,1} \le \cdots \le \lambda_{l,D_l} = \infty$ are optimally chosen in the sense that the Kullback-Leibler information number $I(g_{\varphi,l}, f_{\varphi,l})$ achieves the supremum $I_{D_l}(g_l, f_l)$. Here $f_{\varphi,l}$ and $g_{\varphi,l}$ are the probability mass functions induced on $U_{l,n}$ when the observations $X_{l,n}$ are distributed as $f_l$ and $g_l$, respectively. Based on the i.i.d. vector observations $U_n = (U_{1,n}, \ldots, U_{L,n})$, the fusion center then uses Page's CUSUM procedure with log-likelihood ratio boundary a to detect whether or not a change has occurred; i.e., the stopping time $N(a)$ is given by

$$N(a) = \inf\{n : \hat W_n \ge a\}, \quad (15)$$

where $\hat W_0 = 0$ and, for $n = 1, 2, \ldots,$

$$\hat W_n = \max(\hat W_{n-1}, 0) + \sum_{l=1}^{L} \log\frac{g_{\varphi,l}(U_{l,n})}{f_{\varphi,l}(U_{l,n})}.$$
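For a binary alphabet ($D_l = 2$) and a Gaussian mean shift, the MLRQ reduces to a single threshold on the observation, and the optimal threshold can be found by a one-dimensional search that maximizes the Kullback-Leibler number of the induced Bernoulli messages (this anticipates the function $r(\lambda)$ of (30) in Section V). The grid search below is our own hedged sketch, not the construction in [22]:

```python
import math

def Phi(x):
    """Standard Gaussian distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bernoulli_kl(q, p):
    """h(q, p) = q log(q/p) + (1 - q) log((1 - q)/(1 - p))."""
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))

def best_binary_mlrq(mu, step=0.001):
    """Grid-search the threshold lambda of the binary MLRQ U = 1{X >= lambda}
    for pre-change N(0,1) and post-change N(mu,1): maximize the KL number of
    the quantized message, h(1 - Phi(lam - mu), 1 - Phi(lam))."""
    grid = [i * step for i in range(-int(2 / step), int(2 / step) + 1)]
    kl = lambda lam: bernoulli_kl(1.0 - Phi(lam - mu), 1.0 - Phi(lam))
    lam = max(grid, key=kl)
    return lam, kl(lam)

lam, i2 = best_binary_mlrq(1.0)  # approximates I_2(g, f) for a unit mean shift
ratio = i2 / (1.0 / 2.0)         # I(g, f) = mu^2 / 2 = 1/2 here
```

The ratio `i2 / I(g, f)` computed this way is close to $2/\pi$, in line with Proposition 2 below.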
It was shown in [2] that $N(a)$ is optimal in the sense that, at each sensor, the MLRQ ϕ is optimized, i.e., maximizes the Kullback-Leibler information number $I(g_\varphi, f_\varphi)$. Later, [21] proved the asymptotic optimality of $N(a)$ in the simplest case, specified in (1), under the restriction that the sensor message functions $\{\phi_1, \ldots, \phi_L\}$ satisfy the following stationarity condition: for all $\nu = 1, 2, \ldots,$ as n goes to $\infty$,

$$n^{-1} \sum_{i=\nu}^{\nu+n} \sum_{l=1}^{L} Z_{l,i}$$

converges in probability under $P_\nu$ to some positive constant, where $Z_{l,i} = \log\big(g_{\phi,l}(U_{l,i})/f_{\phi,l}(U_{l,i})\big)$. Paper [24] conjectured that $N(a)$ is asymptotically optimal in the special case specified in (5), because numerical simulations illustrate that it has performance similar to the Bayesian solutions. In the next subsection we will show that, under a condition on second moments, $N(a)$ is asymptotically optimal without any restriction on the sensor message functions or the fusion center decision rule in the system with limited local memory.

B. Asymptotic Optimality of N(a)

We begin our analysis by studying the performance of the procedure $N(a)$. Observe that $N(a)$, defined in (15), is Page's CUSUM procedure, so that by applying the standard bounds [17] we have

Lemma 1:

$$E_\infty N(a) \ge e^{a},$$

and, as $a \to \infty$,

$$\bar E_1 N(a) \le \frac{a}{I_{\mathbf D}} + O(1).$$

The following theorem is of fundamental importance for proving the asymptotic optimality of $N(a)$. It establishes asymptotic lower bounds for the detection delays of all procedures in the system with limited local memory.

Theorem 1: Assume $V_{D_l}(g_l, f_l)$, defined in (12), is finite for all $1 \le l \le L$. If $\{\tau(\gamma)\}$ is a family of procedures in the system with limited local memory satisfying (8), then

$$\bar E_1 \tau(\gamma) \ge \big(1 + o(1)\big) \frac{\log\gamma}{I_{\mathbf D}} \quad (16)$$

as $\gamma \to \infty$, where $I_{\mathbf D}$ is defined in (13).

Now we can summarize our results on the asymptotic optimality of the procedure $N(a)$ as follows.
Corollary 1: For $\gamma > 1$ let $a = \log\gamma$; then $N(a)$ satisfies (8) and

$$\bar E_1 N(a) \le \frac{\log\gamma}{I_{\mathbf D}} + O(1),$$

so that, under the assumption of finiteness of $V_{D_l}(g_l, f_l)$ for all $1 \le l \le L$, the procedure $N(a)$ asymptotically minimizes the detection delay as $\gamma \to \infty$ in the system with limited local memory.

Note that paper [21] established a result similar to (16) in the simplest case, specified in (1), under a restriction on the sensor message functions. Theorem 1 provides different sufficient conditions under which the asymptotic lower bound (16) can be established. Our sufficient conditions are new and perhaps the most useful, since they do not impose any restrictions on the sensor message functions or the fusion center decision rules. Moreover, they also allow us to obtain the asymptotic optimality of $N(a)$ in all three cases of the system with limited local memory.

C. Sufficient Conditions

In Theorem 1, we assume $V_D(g, f) < \infty$, which is usually not easy to verify. The following theorem and its corollary give sufficient conditions for it when $D = 2$.

Theorem 2: Suppose $f(y)$ and $g(y)$ are two densities such that

$$E_g\Big(\log\frac{g(Y)}{f(Y)}\Big)^2 = \int \Big(\log\frac{g(y)}{f(y)}\Big)^2 g(y)\, dy < \infty.$$

Define

$$A(t) = P_f\Big(\frac{g(Y)}{f(Y)} > t\Big), \qquad B(t) = P_g\Big(\frac{g(Y)}{f(Y)} > t\Big).$$

Assume $A(t)$ and $B(t)$ are continuous functions of t on $(0, \infty)$ and take the values 0 and 1 for the same t. Moreover, assume that

$$\limsup_{t \to \infty} \big|B(t) \log A(t)\big| < \infty \quad (17)$$

and

$$\limsup_{t \to 0} \big|(1 - A(t)) \log(1 - B(t))\big| < \infty, \quad (18)$$

where $0 \cdot \log 0$ is interpreted as 0. Then $V_2(g, f) < \infty$.

Corollary 2: Suppose the distribution of the random variable Y belongs to a one-parameter exponential family having the continuous densities

$$f_\theta(y) = \exp\{\theta y - b(\theta)\}, \quad -\infty < y < \infty, \ \theta \in \Omega,$$

with respect to some σ-finite measure, where Ω is an open interval on the real line and $b(\theta)$ is twice differentiable with respect to θ. Let $F_\theta(y)$ denote the distribution function of Y. Consider $\theta_0 < \theta_1$ in Ω, and let $f_i = f_{\theta_i}$ and $F_i = F_{\theta_i}$ for $i = 0, 1$. Define $y_0 = \sup\{y : F_0(y) = 0\}$ and $y_1 = \inf\{y : F_1(y) = 1\}$. If

$$\lim_{y \downarrow y_0} \frac{F_0(y)^{3/2}}{f_0(y)} < \infty \quad \text{and} \quad \lim_{y \uparrow y_1} \frac{\big(1 - F_1(y)\big)^{3/2}}{f_1(y)} < \infty,$$

then both $V_2(f_0, f_1)$ and $V_2(f_1, f_0)$ are finite.

Proof: Since $f_1(y)/f_0(y)$ is a monotonically increasing function of y, for $V_2(f_1, f_0) < \infty$ it suffices to show that (17) and (18) hold with $A(t) = 1 - F_0(y_t)$ and $B(t) = 1 - F_1(y_t)$, where $y_t$ solves $f_1(y_t)/f_0(y_t) = t$ (so $y_t$ is linear in $\log t$); this is straightforward using L'Hôpital's rule. The proof for $V_2(f_0, f_1)$ is identical.

It is easy to check that two Gaussian distributions with the same variance satisfy the conditions of Corollary 2, and so do two exponential distributions. Therefore, if the sensors are restricted to send binary messages to the fusion center, and the pre-change and post-change distributions at each sensor are two Gaussian distributions with the same variance, or two exponential distributions, then the procedure $N(a)$ is asymptotically optimal over all possible sensor messages and all possible fusion center decision rules in the system with limited local memory.

IV. FULL LOCAL MEMORY

It has been an open problem to find asymptotically optimal procedures, including both the sensor and fusion center decision rules, in the decentralized decision system with full local memory, specified in (4) and (5); indeed, it is well known [25] that Bayesian formulations become intractable.
We will address this problem in this section. Establishing lower bounds for the detection delay in the system with full local memory is not difficult. By the optimality of Page's CUSUM procedures in the centralized version [11], [13], [17], we have

Lemma 2: If $\{\tau(\gamma)\}$ is a family of procedures in the system with full local memory such that (8) holds, then as $\gamma \to \infty$,

$$\bar E_1 \tau(\gamma) \ge \frac{\log\gamma}{I_{tot}} + O(1). \quad (19)$$

In the centralized version the lower bound (19) is sharp and is achieved by Page's CUSUM procedure $T(a)$, defined in (9). Theorem 1 shows that this lower bound is too crude in the system with limited local memory. However, it is not clear whether it is sharp in the system with full local memory. In other words, can we find procedures in the system with full local memory for which this bound is achieved asymptotically? Since we expect to sacrifice some performance by quantizing the data locally instead of utilizing the complete data set at the fusion center, it is perhaps surprising that we give an affirmative answer by constructing such procedures.

A. The Structure of the Procedures

For the system with full local memory, our proposed procedure $M(a)$ is as follows. Each sensor $S_l$ monitors whether or not the CUSUM statistic

$$W_{l,n} = \max_{1 \le k \le n} \sum_{i=k}^{n} \log\frac{g_l(X_{l,i})}{f_l(X_{l,i})} \quad (20)$$

exceeds the constant boundary $\pi_l a$, where

$$\pi_l = \frac{I(g_l, f_l)}{\sum_{l=1}^{L} I(g_l, f_l)} = \frac{I(g_l, f_l)}{I_{tot}}. \quad (21)$$

That is, for each $l = 1, \ldots, L$ and $n = 1, 2, \ldots,$ define the sensor messages

$$U_{l,n} = \begin{cases} 1, & \text{if } W_{l,n} \ge \pi_l a; \\ 0, & \text{otherwise.} \end{cases}$$

The fusion center then combines all these sensor decisions $U_{l,n}$ by an AND rule; i.e., it stops and declares that a change has occurred as soon as $U_{l,n} = 1$ for all $l = 1, 2, \ldots, L$. This stopping time $M(a)$ can be written as

$$M(a) = \inf\{n \ge 1 : W_{l,n} \ge \pi_l a \text{ for all } l = 1, 2, \ldots, L\}. \quad (22)$$

It is easy to see that in single-sensor systems our procedure $M(a)$ coincides with the optimal centralized procedure $T(a)$, defined in (9).
As with $T(a)$, it is very convenient to implement $M(a)$ because the CUSUM statistic $W_{l,n}$ obeys the recursive relation

$$W_{l,n} = \max\{W_{l,n-1}, 0\} + \log\frac{g_l(X_{l,n})}{f_l(X_{l,n})},$$

where $W_{l,0} = 0$. However, unlike $T(a)$, our procedure $M(a)$ requires that each sensor continue sending its local messages to the fusion center even after its CUSUM statistic exceeds the local threshold. This essential feature can be seen from the following heuristic argument, which provides the motivation for $M(a)$.

Consider the optimal centralized procedure $T(a)$, defined in (9). If ν is the true change-point and $n - \nu$ is sufficiently large, then

$$W_n = \max_{1 \le k \le n} \sum_{i=k}^{n} \sum_{l=1}^{L} \log\frac{g_l(X_{l,i})}{f_l(X_{l,i})} \approx \sum_{i=\nu}^{n} \sum_{l=1}^{L} \log\frac{g_l(X_{l,i})}{f_l(X_{l,i})}$$

and

$$W_{l,n} = \max_{1 \le k \le n} \sum_{i=k}^{n} \log\frac{g_l(X_{l,i})}{f_l(X_{l,i})} \approx \sum_{i=\nu}^{n} \log\frac{g_l(X_{l,i})}{f_l(X_{l,i})}.$$

Thus $W_n \approx \sum_{l=1}^{L} W_{l,n}$, and so under $P_\nu$ the stopping rule of the optimal centralized procedure $T(a)$ is roughly equivalent to

$$\Big\{\sum_{l=1}^{L} W_{l,n} \ge a\Big\} \quad (23)$$

for sufficiently large a. Now the strong law of large numbers implies that $(n - \nu)^{-1} W_{l,n} \to I(g_l, f_l)$ with probability 1, so the weight of $W_{l,n}$ in the sum is roughly $I(g_l, f_l) \big/ \sum_{l=1}^{L} I(g_l, f_l) = \pi_l$. Thus (23) can be approximated by $\{W_{l,n} \ge \pi_l a \text{ for all } 1 \le l \le L\}$, which is exactly the stopping rule of our procedure $M(a)$.
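The AND rule (22), with the weights $\pi_l$ of (21), can be sketched as follows for Gaussian sensors ($f_l = N(0,1)$, $g_l = N(\mu_l,1)$, so that $I(g_l, f_l) = \mu_l^2/2$); the function names and the Gaussian specialization are our own illustration:

```python
def m_stopping_time(streams, mus, a):
    """Sketch of M(a) in (22): each sensor runs its own CUSUM recursion
    W_{l,n} = max(W_{l,n-1}, 0) + log g_l(X_{l,n})/f_l(X_{l,n}) and the fusion
    center stops as soon as W_{l,n} >= pi_l * a simultaneously for every l."""
    info = [m * m / 2.0 for m in mus]            # I(g_l, f_l) for a Gaussian shift
    i_tot = sum(info)
    thresholds = [a * i / i_tot for i in info]   # pi_l * a, with pi_l as in (21)
    w = [0.0] * len(mus)
    for n, x_vec in enumerate(streams, start=1):
        for l, (x, m) in enumerate(zip(x_vec, mus)):
            w[l] = max(w[l], 0.0) + (m * x - m * m / 2.0)
        if all(w[l] >= thresholds[l] for l in range(len(mus))):
            return n
    return None  # no alarm on this finite sample path
```

Note that, exactly as stressed above, every sensor keeps updating its CUSUM even after crossing its own threshold; the rule fires only when all L statistics are simultaneously above their thresholds at the same time n.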

B. Asymptotic Optimality

The following theorem, whose proof is substantially more complicated, establishes the asymptotic properties of our procedure $M(a)$ for large values of a.

Theorem 3: As $a \to \infty$,

$$\bar E_1 M(a) \le \frac{a}{I_{tot}} + C\sqrt{a} + o(\sqrt{a}), \quad (24)$$

where $I_{tot}$ is defined in (14) and $C > 0$ is a constant depending on L and the densities $f_l$ and $g_l$. Furthermore, if we assume

$$\int g_l(x)\,\Big|\log\frac{g_l(x)}{f_l(x)}\Big|^3 dx < \infty \quad (25)$$

for each $1 \le l \le L$, then

$$E_\infty M(a) \ge \big(1 + o(1)\big)\, e^{a} \quad (26)$$

as $a \to \infty$.

Remark 1: Under additional reasonable conditions, it follows from nonlinear renewal theory that the smallest constant C in (24) is given by

$$C = \frac{1}{\sqrt{I_{tot}}}\; E\Big[\max_{1 \le l \le L} \frac{\sigma_l}{I(g_l, f_l)}\, Z_l\Big], \quad (27)$$

where $\sigma_l^2 = \mathrm{Var}_{g_l}\big(\log(g_l(X)/f_l(X))\big)$ and $Z_1, \ldots, Z_L$ are independent standard Gaussian variables. The proof is the same as that of Theorem 3.3 in [4]; see also Lemma 1 in [3].

Remark 2: For each sensor, the mean time between false alarms is $\exp(\pi_l a)$. By the renewal property of the CUSUM statistics, the mean time between false alarms at the fusion center is of order $\prod_{l=1}^{L} \exp(\pi_l a) = \exp(a)$, since the sensors continue sending local messages. See the Appendix for the rigorous proof; as in [18], the key idea is Lemma 6 in the Appendix.

Remark 3: Lemma 6 in the Appendix indicates that our procedure $M(a)$ shares a pleasant property with the procedure $N(a)$ in (15) and Page's CUSUM procedure $T(a)$ in (9): the time until a false alarm is approximately exponentially distributed.

Remark 4: It is important to emphasize that in the definition of our procedure $M(a)$ in (22) we cannot replace the CUSUM statistics $W_{l,n}$ by the log-transformed Shiryayev-Roberts statistics $\log \sum_{k=1}^{n} \prod_{i=k}^{n} g_l(X_{l,i})/f_l(X_{l,i})$: in that case the mean time between false alarms is roughly $\exp(\max_{1 \le l \le L} \pi_l a)$, which is much smaller than $\exp(a)$ as $a \to \infty$.

Now the asymptotic optimality of our procedure $M(a)$ follows at once from Theorem 3 and Lemma 2.

Corollary 3: There exists $a = \log\gamma + o(\log\gamma)$ such that $M(a)$ satisfies (8) and

$$\bar E_1 M(a) \le \frac{\log\gamma}{I_{tot}} + C\sqrt{\log\gamma} + o\big(\sqrt{\log\gamma}\big).$$

Thus $M(a)$ minimizes the detection delay up to $O(\sqrt{\log\gamma})$ among all procedures in the system with full local memory satisfying (8).

V.
COMPARISON OF THREE PROCEDURES

In this section we compare our asymptotically optimal decentralized procedures with the optimal centralized procedure. As in [2], for a decentralized procedure $\tau(\gamma)$ satisfying (8), define the decentralized penalty function (DPF)

$$DPF_\tau(\gamma) = \frac{\bar E_1 \tau(\gamma)}{n(\gamma)} - 1, \quad (28)$$

where $n(\gamma)$ is the detection delay of the optimal centralized procedure satisfying (8). Intuitively, $DPF_\tau$ can be thought of as a measure of the relative performance degradation incurred by using the decentralized procedure τ instead of the optimal centralized procedure. By Corollary 1 and relation (19), we immediately have

Proposition 1: The DPF of the procedure $N(a)$, defined in (15), is given by

$$DPF_N(\gamma) = \frac{I_{tot}}{I_{\mathbf D}} - 1 + O\Big(\frac{1}{\log\gamma}\Big). \quad (29)$$

It is therefore natural to study the relation between $I_{\mathbf D}$ and $I_{tot}$. By definition, it suffices to study the relation between $I_D(g, f)$ and $I(g, f)$ for a pair of densities $(f, g)$. However, little research has been done on finding good lower bounds for $I_D(g, f)/I(g, f)$, although it is well known that the upper bound is 1. In the following we study the special case of Gaussian distributions when $D = 2$; the idea can easily be extended to non-Gaussian distributions.

Proposition 2: Suppose $f(y)$ and $g(y)$ are two Gaussian densities with respective means $\mu_0$ and $\mu_1$ and the same variance $\sigma^2$. Let $\rho = (\mu_1 - \mu_0)^2/(2\sigma^2)$ denote the signal-to-noise ratio (SNR). Then

$$\liminf_{\rho \to 0} \frac{I_2(g, f)}{I(g, f)} \ge \frac{2}{\pi} \quad \text{and} \quad \lim_{\rho \to \infty} \frac{I_2(g, f)}{I(g, f)} = 1.$$

Proof: Without loss of generality, assume $\mu_0 = 0$ and $\sigma = 1$. First note that $I(g, f) = \rho$ in this case. Next, since the likelihood ratio $g(y)/f(y)$ is a monotonically increasing function of y, the MLRQ can be written as

$$U = \begin{cases} 1, & Y \ge \lambda; \\ 0, & \text{otherwise.} \end{cases}$$

Thus the Kullback-Leibler information number for U is

$$r(\lambda) = h\big(\Phi(\lambda - \mu_1), \Phi(\lambda)\big), \quad (30)$$

where Φ is the distribution function of a standard Gaussian random variable and $h(a, b) = a \log(a/b) + (1 - a)\log\big((1 - a)/(1 - b)\big)$. Now let $\lambda = k\mu_1$. For fixed k, it is straightforward to show that

$$\frac{r(k\mu_1)}{I(g, f)} \to \begin{cases} 2/\pi, & \text{as } \rho \to 0; \\ k^2\, 1\{0 < k < 1\}, & \text{as } \rho \to \infty, \end{cases}$$

where $1\{A\}$ is the indicator of the event A.
The proposition follows at once from the fact that $r(k\mu_1) \le I_2(g, f) \le I(g, f)$ for any k.

In a decentralized decision system with Gaussian sensor observations where the SNRs at some sensors are sufficiently high, we have $I_{\mathbf D} \approx I_{tot}$, because those sensors with high SNRs contribute most to both $I_{tot}$ and $I_{\mathbf D}$. Hence the procedure $N(a)$ will perform as well as the optimal centralized procedure. Even if all SNRs are very low, the DPF of $N(a)$ will be at most $\pi/2 - 1 \approx 57\%$ for large values of the mean time between false alarms. That is, the detection delay of $N(a)$ will be at most 57% larger than that of the optimal centralized procedure; in other words, $N(a)$ will take at most 57% more observations from the post-change distributions than the optimal centralized procedure. Moreover, the number of sensors does not have much effect on the DPF of $N(a)$.

Furthermore, Proposition 2 motivates the conjecture that, for Gaussian distributions, $I_2(g, f)/I(g, f)$ is an increasing function of the SNR ρ with range $[2/\pi, 1]$. We do not have a rigorous proof, but numerical results support this conjecture.

Now for the procedure $M(a)$ in the system with full local memory, by Corollary 3,

Proposition 3: The DPF of the procedure $M(a)$, defined in (22), satisfies

$$DPF_M(\gamma) \le \big(1 + o(1)\big)\, \frac{C\, I_{tot}}{\sqrt{\log\gamma}}, \quad (31)$$

where the constant C depends on L and the densities $f_l$ and $g_l$.

It is easy to see that the DPF of $M(a)$ tends to 0 as γ goes to $\infty$. That is, $M(a)$ can perform as well as the optimal centralized procedure in any system if γ is sufficiently large. Unfortunately, the asymptotic convergence of $M(a)$ is so slow that it may perform very far from the optimum for realistic values of the mean time between false alarms in some systems.

As an illustration, consider the symmetric Gaussian system where, for each l, $f_l$ and $g_l$ are Gaussian densities with respective means $\mu_0$ and $\mu_1$ and the same variance $\sigma^2$. In this case, by (27),

$$C\, I_{tot} = \sqrt{2L}\; E\Big[\max_{1 \le l \le L} Z_l\Big],$$

where $Z_1, \ldots, Z_L$ are independent standard Gaussian variables. Thus the DPF of $M(a)$ depends heavily on the number of sensors in this case. Using Table I of [4], we can evaluate $E[\max_{1 \le l \le L} Z_l]$, and hence $C\, I_{tot}$, for $L = 2, 3, 4,$ and 10. For moderate values of γ, say $\gamma = 10^4$, we have $\sqrt{\log\gamma} \approx 3.03$, and the right-hand side of (31) is then roughly 37%, 68%, 96%, and 227%, respectively, for $L = 2, 3, 4,$ and 10. This indicates that $M(a)$ may perform poorly for moderate values of γ in symmetric systems with multiple sensors. For example, when $L = 4$, the detection delay of $M(a)$ may be 96% larger than that of the optimal centralized procedure if $\gamma \approx 10^4$.

Finally, let us compare $M(a)$ with $N(a)$. While $M(a)$ has better asymptotic performance than $N(a)$, it is possible for $M(a)$ to perform worse than $N(a)$ in practical applications, especially when L, the number of sensors, is large but γ is only moderately large. To see this, note that the right-hand side of (31) is larger than that of (29) if

$$\frac{C\, I_{tot}}{\sqrt{\log\gamma}} > \frac{I_{tot}}{I_{\mathbf D}} - 1. \quad (32)$$

Thus, if C is large or $I_{tot}/I_{\mathbf D} - 1$ is small, it is likely that $M(a)$ will perform worse than $N(a)$ for moderate values of γ. By (27), if there is a large number of sensors, then the value of C will be very large, and so $M(a)$ can perform worse than $N(a)$.
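In the symmetric Gaussian case the bound (31) is easy to evaluate numerically, since $C\,I_{tot} = \sqrt{2L}\,E[\max_l Z_l]$ does not depend on the SNR. The following sketch is our own: it estimates $E[\max_l Z_l]$ by Monte Carlo rather than taking it from Table I of [4], and it reproduces the percentages quoted above for $\gamma = 10^4$:

```python
import math
import random

def expected_max_gauss(L, reps=100000, seed=0):
    """Monte Carlo estimate of E[max(Z_1, ..., Z_L)] for i.i.d. N(0,1) variables."""
    rng = random.Random(seed)
    return sum(max(rng.gauss(0.0, 1.0) for _ in range(L))
               for _ in range(reps)) / reps

def dpf_M_symmetric(L, gamma):
    """Right-hand side of (31) in the symmetric Gaussian system, using
    C * I_tot = sqrt(2L) * E[max_l Z_l] (independent of the SNR)."""
    return math.sqrt(2.0 * L) * expected_max_gauss(L) / math.sqrt(math.log(gamma))

for L in (2, 3, 4, 10):
    print(L, round(100.0 * dpf_M_symmetric(L, 1e4)))  # roughly 37, 68, 96, 227
```

The growth of $E[\max_l Z_l]$ (roughly like $\sqrt{2\log L}$), multiplied by the $\sqrt{2L}$ factor, is what makes the bound deteriorate so quickly as sensors are added.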
For instance, in the above symmetric Gaussian system with small SNRs (so that $I_{tot}/I_{\mathbf D} - 1 \approx \pi/2 - 1$), (32) becomes roughly $\gamma < 50$ if $L = 2$, with far larger thresholds for $L = 3$, 4, and 10. Therefore, for moderate values of γ, say $10^4$, it is likely that $M(a)$ will perform worse than $N(a)$ in systems with a large number of sensors.

Observe that neither $N(a)$ nor $M(a)$ uses past message information or feedback from the fusion center, yet they are asymptotically optimal in the corresponding decentralized decision systems. This fact proves the following interesting result, part of which was conjectured in [24]:

Theorem 4: If all pre-change and post-change distributions are completely specified and satisfy the conditions of Theorems 1 and 3, then neither past message information nor feedback from the fusion center improves the asymptotic performance in the decentralized decision systems specified in (1)-(5).

It should be pointed out that one of the underlying assumptions of this theorem is that the observations are independent from sensor to sensor. It is likely that past message information or feedback will be more useful in practical applications where the observations are dependent or the observation distributions are only partially specified.

TABLE I. Two nonidentical sensors ($\mu_1 = 0.2$, $\mu_2 = 1$): $\bar E_1 N(a)$ (% DPF), $\bar E_1 M(a)$ (% DPF), and $\bar E_1 T(a)$ for a range of γ. Numbers in parentheses are the values of a so that $E_\infty \tau(a) = \gamma$; the decentralized penalty function (DPF) was based on the sampled values.

TABLE II. Two identical sensors ($\mu_1 = \mu_2 = 1$); same layout as Table I.

VI. NUMERICAL RESULTS

In this section, we present a numerical illustration of the asymptotic theory of the previous sections. Suppose there are L sensors, each sending a binary message to the fusion center, i.e., $D_l = 2$. Assume that the observations at sensor $S_l$ are independent and identically distributed Gaussian random variables with mean 0 and variance 1 before the change, and with mean $\mu_l$ and variance 1 after the change.
An interesting application of this model can be found in [21], where L geographically separated sensors are used to detect the appearance of a deterministic signal, or target, which is contaminated by additive white Gaussian noise at each sensor. If $\mu_l > 0$, then the likelihood ratio at sensor $S_l$ is a monotonically increasing function of the observation, and hence the MLRQ at each

7 7 TABLE III THREE NONIDENTICAL SENSORS µ 1 µ 2 0.2, µ 3 1 γ E 1 Na % DPF E 1 Ma % DPF E 1 T a ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± 0.1 Numbers in parentheses are the values of a so that E τa γ. The decentralized penalty function DPF was based on the sampled values. TABLE IV THREE IDENTICAL SENSORS µ 1 µ 2 µ 3 1 γ E 1 Na % DPF E 1 Ma % DPF E 1 T a ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± 0.0 sensor S l can be written as U l,n { 1, Xl,n λ l ; 0, otherwise. Thus the Kullback-Leibler information number for U l,n is rλ l h Φλ l µ l, Φλ l, where Φ and ha, b are defined as in 30. Since the function rλ l has a unique maximum value over [0, ], it is easy to find the optimal λ l numerically. For example, if µ l 0.2 or 1, then the optimal thresholds λ l are and , respectively, and the corresponding optimal Kullback-Leibler information numbers rλ l are and , respectively. Note that in these situations, I 2 g l, f l /Ig l, f l is close to 2/π since the Kullback- Leibler information number Ig l, f l µ 2 l /2. As an illustration, six cases are considered: Case 1 Two Nonidentical Sensors: L 2, µ and µ 2 1. Case 2 Two Identical Sensors: L 2, and µ 1 µ 2 1. Case 3 Three Nonidentical Sensors: L 3, µ 1 µ 2 0.2, and µ 3 1. Case 4 Three Identical Sensors: L 3 and µ l 1 for l 1, 2, 3. Case 5 Ten Nonidentical Sensors: L 10, µ l 1 if l 1, 2, 3, and µ l 0.2 if 4 l 10. Case 6 Ten Identical Sensors: L 10 and µ l 0.2 for all 1 l 10. In each case, we compare three asymptotically optimal procedures: i Na, defined by 15 in the system with limited local memory; ii Ma, defined by 22 in the system with full local memory; and TABLE V TEN NONIDENTICAL SENSORS THREE µ l 1 AND SEVEN µ l 0.2 γ E 1 Na % DPF E 1 Ma % DPF E 1 T a ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± ± 0.0 Numbers in parentheses are the values of a so that E τa γ. The decentralized penalty function DPF was based on the sampled values. 
(iii) T(a), Page's CUSUM procedure defined by (9) in the centralized version. For these three procedures τ(a), the threshold value a was first determined from the criterion E_∞[τ(a)] = γ. Since E_∞[N(a)] is discontinuous in a (see [2]), the values of γ were chosen so that the corresponding threshold value a exists for each of these procedures. A Monte Carlo simulation was performed to determine the appropriate values of a to yield the desired mean time between false alarms γ to within the range of sampling error. Rather than simulating E_∞[τ(a)] for each a separately, which is computationally demanding, an efficient algorithm, suggested by Professor Gary Lorden, is to run one simulation to return the record values of the CUSUM statistics and the corresponding values of sample size, and then to estimate E_∞[τ(a)] for different a based on these record values. Next, the renewal property of the CUSUM statistics implies that the detection delay of each of these three procedures is just E_1[τ], the expected sample size when the change happens at time ν = 1. It is therefore straightforward to simulate the detection delay. Monte Carlo experiments with 10^4 repetitions yielded estimates for the detection delays. The results are summarized in Tables I - VI, with the values of a in parentheses.

In the system with two sensors, Tables I and II show that M(a) performs better than N(a) even for moderate γ in both nonsymmetric and symmetric systems. In the system with three sensors, Tables III and IV show that for moderate γ, M(a) performs better than N(a) in a nonsymmetric system, but their performances are similar in a symmetric system. In the system with ten sensors, Tables V and VI show that M(a) performs much worse than N(a) for moderate γ in both nonsymmetric and symmetric systems. These results are consistent with our asymptotic theory.

TABLE VI: Ten identical sensors (all μ_l = 0.2); same layout as Table III. (Numerical entries omitted.)
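The record-value trick can be illustrated as follows. This is our own single-sensor sketch with Gaussian observations (the function names and parameters are hypothetical, not from the paper): one pre-change run of Page's CUSUM is simulated until its running maximum clears the largest threshold, and the recorded (record value, sample size) pairs then yield the stopping time τ(a) for every threshold a at once.

```python
import bisect
import random

def simulate_crossing_records(mu, a_max, rng):
    """One pre-change run of Page's CUSUM W_n = max(0, W_{n-1} + Z_n),
    where Z_n = mu*X_n - mu^2/2 is the log-likelihood ratio of N(mu,1)
    vs N(0,1) at X_n ~ N(0,1).  Returns the record values of W_n and
    the times they occur, so tau(a) can be read off for every a <= a_max."""
    W, n, best = 0.0, 0, -1.0
    records, times = [], []
    while best <= a_max:
        n += 1
        x = rng.gauss(0.0, 1.0)          # pre-change observation
        W = max(0.0, W + mu * x - mu * mu / 2.0)
        if W > best:                     # a new record of the CUSUM statistic
            best = W
            records.append(W)
            times.append(n)
    return records, times

def estimate_arl(mu, thresholds, reps=200, seed=1):
    """Monte Carlo estimate of E_inf[tau(a)] for several thresholds a;
    each of the reps independent runs serves all thresholds at once."""
    rng = random.Random(seed)
    totals = [0.0] * len(thresholds)
    a_max = max(thresholds)
    for _ in range(reps):
        records, times = simulate_crossing_records(mu, a_max, rng)
        for i, a in enumerate(thresholds):
            j = bisect.bisect_right(records, a)   # first record value > a
            totals[i] += times[j]
    return [t / reps for t in totals]

print(estimate_arl(1.0, [1.0, 2.0, 3.0], reps=100))
```

This mirrors the idea in the text: since the record values of one run determine the first crossing time of every level up to a_max, a single simulation produces estimates of the mean time between false alarms for a whole grid of thresholds.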

It is interesting to see that the DPF of M(a) seems to be a decreasing function of γ, whereas the DPF of N(a) seems to be an increasing function. Comparisons of Tables I-VI indicate that adding sensors with low SNRs actually degrades the performance of M(a) for moderate values of γ, while adding sensors with relatively high SNRs improves the performance of M(a) for moderate values of γ, although the improvement may not be as large as that of the two other procedures, N(a) and T(a).

VII. CONCLUSIONS

We have studied a decentralized extension of quickest change detection problems in two different scenarios. In the system with limited local memory, we have proved the previously conjectured asymptotic optimality of Page's CUSUM procedures with Monotone Likelihood Ratio Quantizers (MLRQ) under a new condition on the observation distributions. The widely used Gaussian and exponential distributions satisfy this condition. In the system with full local memory, we have developed the first asymptotically optimal procedures in the literature. A major theoretical result is that our procedures have the same first-order asymptotic performance as the corresponding optimal centralized procedures, although both theoretical analysis and numerical simulations show that our procedures may perform poorly in some practical situations, especially in systems with a large number of sensors, because of slow asymptotic convergence. It is interesting to note that none of these asymptotically optimal decentralized procedures uses past messages, and hence neither past message information nor feedback from the fusion center improves asymptotic performance. Finally, we have compared these asymptotically optimal decentralized procedures with the optimal centralized procedures, especially for Gaussian sensor observations.

There are a number of interesting problems which have not been addressed here. In practice, the distributions of sensor observations often involve unknown parameters.
The results developed here are for completely known pre-change and post-change distributions, but they provide benchmarks and ideas for the development of procedures in the presence of unknown parameters. It is also of interest to study systems where the observations at the different sensors may be dependent. Moreover, finding fairly simple decentralized procedures which are not only asymptotically optimal, but also have good performance for practical values of the mean time between false alarms, will undoubtedly be of great importance. Therefore, this work should be interpreted as a starting point for further investigation.

ACKNOWLEDGEMENT

The author thanks his advisor, Dr. Gary Lorden, for his constant support and encouragement, Dr. Venugopal V. Veeravalli for bringing this problem to his attention, as well as Dr. Alexander G. Tartakovsky for fruitful discussions. The author also thanks the referees for helpful suggestions which led to significant improvements in organization and presentation.

APPENDIX
PROOF OF THEOREMS

A. Proof of Theorem 1

In the system with limited local memory, we can write U_{l,n} = ψ_{l,n}(X_{l,n}), where ψ_{l,n} may depend on U_{[1,n-1]} = (U_{1,[1,n-1]}, ..., U_{L,[1,n-1]}). Denote by f^ψ_{l,n} and g^ψ_{l,n}, respectively, the conditional density induced on U_{l,n} given U_{[1,n-1]} when the density of X_{l,n} is f_l or g_l. Denote by Z_{l,n} the conditional log-likelihood ratio of U_{l,n},

Z_{l,n} = log( g^ψ_{l,n}(U_{l,n}) / f^ψ_{l,n}(U_{l,n}) ).

Since X_{1,n}, ..., X_{L,n} are independent, so are U_{1,n}, ..., U_{L,n} given U_{[1,n-1]}. Thus, at the fusion center, the conditional log-likelihood ratio of (U_{1,n}, ..., U_{L,n}) given U_{[1,n-1]} is

Z_n = Σ_{l=1}^L Z_{l,n}.
By Theorem 1 of Lai [8], in order to prove (16), it suffices to show that for any δ > 0,

lim_{n→∞} sup_{ν>=1} ess sup P_ν{ max_{t<=n} Σ_{k=ν}^{ν+t} Z_k >= I_D (1 + δ) n | U_{[1,ν-1]} } = 0.    (33)

By the definition of I_{D_l}(g_l, f_l), for any k >= ν,

E_ν( Z_k | U_{[1,ν-1]} ) = Σ_{l=1}^L E_ν( Z_{l,k} | U_{[1,ν-1]} ) <= Σ_{l=1}^L I_{D_l}(g_l, f_l) = I_D.

Thus

P_ν{ max_{t<=n} Σ_{k=ν}^{ν+t} Z_k >= I_D (1 + δ) n | U_{[1,ν-1]} }
  <= P_ν{ max_{t<=n} Σ_{k=ν}^{ν+t} Σ_{l=1}^L ( Z_{l,k} − E_ν Z_{l,k} ) >= I_D δ n | U_{[1,ν-1]} }
  <= Σ_{l=1}^L P_ν{ max_{t<=n} Σ_{k=ν}^{ν+t} ( Z_{l,k} − E_ν Z_{l,k} ) >= δ_1 n | U_{[1,ν-1]} },

where δ_1 = I_D δ / L. Since Σ_{k=ν}^{ν+t} ( Z_{l,k} − E_ν Z_{l,k} ) is a martingale under P_ν, Doob's submartingale inequality tells us

P_ν{ max_{t<=n} Σ_{k=ν}^{ν+t} ( Z_{l,k} − E_ν Z_{l,k} ) >= δ_1 n | U_{[1,ν-1]} } <= Σ_{k=ν}^{ν+n} E_ν( Z_{l,k}² | U_{[1,ν-1]} ) / (δ_1² n²).

By definition, E_ν( Z_{l,k}² | U_{[1,ν-1]} ) <= V_{D_l}(g_l, f_l) for any k >= ν, and hence

P_ν{ max_{t<=n} Σ_{k=ν}^{ν+t} ( Z_{l,k} − E_ν Z_{l,k} ) >= δ_1 n | U_{[1,ν-1]} } <= V_{D_l}(g_l, f_l) / (δ_1² n),

which implies (33) since each V_{D_l}(g_l, f_l) is finite. Relation (16) follows.

B. Proof of Theorem 2

Assume that φ(Y) is a quantizer taking values in {0, 1}. Denote by f_φ and g_φ, respectively, the density of φ(Y) when the density of Y is f or g. Let Z_φ = log( g_φ(φ(Y)) / f_φ(φ(Y)) ). Note that when D = 2,

E_g( Z_φ² ) = β_φ ( log( β_φ / α_φ ) )² + (1 − β_φ) ( log( (1 − β_φ) / (1 − α_φ) ) )²,

where α_φ = P_f( φ(Y) = 1 ) and β_φ = P_g( φ(Y) = 1 ). Define

H(r, s) = r ( log(r/s) )² + (1 − r) ( log( (1 − r)/(1 − s) ) )²

for 0 < r, s < 1, and H(0, 0) = H(1, 1) = 0. To prove V_2(g, f) < ∞, it suffices to show that there exists a constant M such that for any φ, H(β_φ, α_φ) < M. If one of α_φ and β_φ is 0 or 1, it is easy to see that Z_φ is 0 with probability 1 under g, and hence H(β_φ, α_φ) = 0. So it suffices to consider the case where 0 < α_φ, β_φ < 1. Since H(b, a) = H(1 − b, 1 − a), assume without loss of generality that 0 < α_φ <= β_φ < 1; otherwise consider 1 − φ(Y) and use (18) instead of (17). Since 1 − B(t) is a cumulative distribution function and B(t) is continuous by assumption, there exists t_0 in (0, ∞) such that B(t_0) = β_φ. Now let φ* be the likelihood ratio quantizer defined by

φ* = 1 if g(y)/f(y) > t_0, and φ* = 0 otherwise.

Then P_f( φ* = 1 ) = A(t_0) and P_g( φ* = 1 ) = B(t_0). The proof of the Neyman-Pearson lemma [10, p. 65] shows that

∫ ( φ* − φ ) ( g(y) − t_0 f(y) ) dμ >= 0,

so that ( B(t_0) − β_φ ) − t_0 ( A(t_0) − α_φ ) >= 0. Since B(t_0) = β_φ by our choice of t_0, we have

A(t_0) <= α_φ.

Note that for fixed r,

∂H(r, s)/∂s = 2 [ ((1 − r)/(1 − s)) log( (1 − r)/(1 − s) ) − (r/s) log(r/s) ],

which is negative for all s <= r. Thus H(r, s) is a decreasing function of s in the interval [0, r]. In particular,

H(β_φ, α_φ) <= H(β_φ, A(t_0)) = H( B(t_0), A(t_0) ).

Therefore, it suffices to show that there exists a constant M such that for all t, H( B(t), A(t) ) < M. Since A(t) and B(t) are continuous functions of t, it suffices to show that H( B(t), A(t) ) is bounded as t goes to 0 or ∞. It is easy to see that if the likelihood ratio g(y)/f(y) has a positive lower bound C_0 > 0, then H( B(t), A(t) ) is 0 for t < C_0, so it suffices to consider the case when such a lower bound does not exist. Now B(t) and A(t) go to 1 as t goes to 0, so

lim_{t→0} B(t) ( log( B(t)/A(t) ) )² = 0.

By Wald's likelihood ratio identity, we have

1 − B(t) = P_g( g(Y)/f(Y) < t ) = E_f( g(Y)/f(Y) ; g(Y)/f(Y) < t ) <= t P_f( g(Y)/f(Y) < t ) = t ( 1 − A(t) ).

Using the fact that 1 − A(t) <= 1, we conclude that (1 − B(t)) ( log( (1 − B(t))/(1 − A(t)) ) )² is less than

max{ (1 − B(t)) ( log(1 − B(t)) )², (1 − B(t)) ( log t )² }.    (34)

As t → 0, B(t) → 1, so that the first term in (34) goes to 0, and by Chebyshev's inequality the second term is at most

( log t )² P_g( log( g(Y)/f(Y) ) < log t ) <= E_g( ( log( g(Y)/f(Y) ) )² ),

which is finite by the assumption.
Hence

lim sup_{t→0} H( B(t), A(t) ) < ∞.

Similarly, it is clear that

lim_{t→∞} (1 − B(t)) ( log( (1 − B(t))/(1 − A(t)) ) )² = 0,

and

lim sup_{t→∞} B(t) ( log( B(t)/A(t) ) )²

is finite by the assumption in (17). Hence

lim sup_{t→∞} H( B(t), A(t) ) < ∞,

and Theorem 2 is proved.

C. Proof of Theorem 3

To prove (24), define a new stopping time

M̂(a) = inf{ n >= 1 : Σ_{i=1}^n log( g_l(X_{l,i}) / f_l(X_{l,i}) ) >= π_l a for all l = 1, 2, ..., L }.

By the relation between the one-sided sequential probability ratio tests and Page's CUSUM procedures, it is easy to see that

E_1 M(a) <= E_1 M̂(a),    (35)

and so it suffices to show that (24) holds for E_1 M̂(a). To prove this, for 1 <= l <= L, let

M̂_l = inf{ n : Σ_{i=1}^n log( g_l(X_{l,i}) / f_l(X_{l,i}) ) >= π_l a }

and

τ_l(M̂_l) = sup{ n >= 1 : Σ_{i=M̂_l+1}^{M̂_l+n} log( g_l(X_{l,i}) / f_l(X_{l,i}) ) <= 0 }.

For simplicity, denote τ_l = τ_l(0). It is well known (e.g., Theorem D in [6]) that for any 1 <= l <= L,

E_1 τ_l < ∞,    (36)

since log( g_l(X)/f_l(X) ) has positive mean and finite variance under P_1 by Assumption A2. By the definition of M̂_l and τ_l(M̂_l), we have

M̂(a) <= max_{1<=l<=L} ( M̂_l + τ_l(M̂_l) ) <= max_{1<=l<=L} M̂_l + Σ_{l=1}^L τ_l(M̂_l).

Now since X_{l,1}, X_{l,2}, ... are i.i.d. under P_1, we have E_1 τ_l(M̂_l) = E_1 τ_l, and thus

E_1 M̂(a) <= E_1( max_{1<=l<=L} M̂_l ) + Σ_{l=1}^L E_1 τ_l.    (37)

By renewal theory and Assumption A2, under P_1,

E_1 M̂_l = a/I_tot + O(1) and Var_1( M̂_l ) = O(a)

as a → ∞; see [16] and [17, p. 171]. Hence,

E_1( M̂_l − a/I_tot )² = Var_1( M̂_l ) + ( E_1 M̂_l − a/I_tot )² = O(a),

and so, by the Cauchy-Schwarz inequality,

E_1( max_{1<=l<=L} ( M̂_l − a/I_tot ) ) <= Σ_{l=1}^L E_1 | M̂_l − a/I_tot | = O(√a).

Thus

E_1( max_{1<=l<=L} M̂_l ) <= a/I_tot + E_1( max_{1<=l<=L} ( M̂_l − a/I_tot ) ) <= a/I_tot + O(√a).

Relation (24) follows at once from (35), (36) and (37).

To prove (26), let A = exp(a) and note that

E_∞ M >= Σ_{n>=1} ∫_n^{n+1} P_∞( M >= x ) dx = ∫_1^∞ P_∞( M >= x ) dx = A ∫_{1/A}^∞ P_∞( M >= tA ) dt.

Thus by Lemma 6 below and Fatou's lemma,

lim inf_{a→∞} E_∞ M(a) / A >= lim inf_{a→∞} ∫_0^∞ P_∞( M(a) >= tA ) 1{t >= 1/A} dt
  >= ∫_0^∞ lim inf_{a→∞} [ P_∞( M(a) >= tA ) 1{t >= 1/A} ] dt >= ∫_0^∞ exp(−t) dt = 1,

and hence (26) holds. To complete the proof, we need the following lemmas.

Lemma 3: Let W_{l,n} be the CUSUM statistic defined in (20). For any l, any n = 1, 2, ..., and any real number b,

P_∞( W_{l,n} >= b ) <= exp(−b).

Proof: For each l, let S_{l,n} denote the log-likelihood ratio Σ_{i=1}^n log( g_l(X_{l,i}) / f_l(X_{l,i}) ), and define S_{l,0} = 0. Then the CUSUM statistic takes the form

W_{l,n} = max_{0<=k<n} ( S_{l,n} − S_{l,k} ).

Since (X_{l,1}, X_{l,2}, ..., X_{l,n}) has the same joint distribution as (X_{l,n}, X_{l,n-1}, ..., X_{l,1}), W_{l,n} has the same distribution as max_{1<=i<=n} S_{l,i}. Thus,

P_∞( W_{l,n} >= b ) = P_∞( max_{1<=i<=n} S_{l,i} >= b ) = P_∞( t_l(b) <= n ),

where t_l(b) = inf{ n : S_{l,n} >= b }. Lemma 3 follows from the fact that P_∞( t_l(b) <= n ) <= P_∞( t_l(b) < ∞ ) <= exp(−b).

Lemma 4: For any k = 1, 2, ...,

P_∞( M(a) = k ) <= 1/A,

where A = exp(a).

Proof: Since the observations are independent from sensor to sensor, an application of Lemma 3 yields

P_∞( M(a) = k ) <= P_∞( W_{l,k} >= π_l a for 1 <= l <= L ) = Π_{l=1}^L P_∞( W_{l,k} >= π_l a ) <= Π_{l=1}^L exp(−π_l a) = exp(−a) = 1/A.

Using Lemma 4, it is easy to derive

Lemma 5: For any m = 1, 2, ...,

P_∞( M(a) <= m ) <= m/A.

Lemma 6: For t > 0,

lim sup_{a→∞} P_∞( M(a) <= tA ) <= 1 − exp(−t).    (38)

Proof: For simplicity, we consider only the case L = 2; the same idea applies to the cases L = 1 and L >= 3. Choose m = m(a) such that m/a² → ∞ and (log m)/a → 0. Note that

P_∞( M(a) <= tA ) <= P_∞( max_{0<=k<tA/m} max_{km<j<=(k+1)m} min_{1<=l<=2} W_{l,j}/π_l > a )
  = P_∞( max_k max_j min_{1<=l<=2} max_{i_l} ( S_{l,j} − S_{l,i_l} ) / π_l > a ),    (39)

where the maximum is taken over 0 <= k < tA/m, km + 1 <= j <= (k+1)m, and 1 <= i_l <= j for l = 1, 2. For all such k, define

C_1(k) = { i_1 : km + 1 <= i_1 <= j <= (k+1)m }, C_2(k) = { i_1 : 1 <= i_1 <= km },
D_1(k) = { i_2 : km + 1 <= i_2 <= j <= (k+1)m }, D_2(k) = { i_2 : 1 <= i_2 <= km }.
For simplicity, omit k, e.g., write C_1 for C_1(k), and define

B_1 = C_1 × D_1, B_2 = C_2 × D_1, B_3 = C_1 × D_2, B_4 = C_2 × D_2.

For r = 1, 2, 3, 4, denote

Q_r = P_∞( max_k max_j min_{1<=l<=2} max_{(i_1,i_2) in B_r} ( S_{l,j} − S_{l,i_l} ) / π_l > a ),

where the maximum is taken over 0 <= k < tA/m, km + 1 <= j <= (k+1)m, and (i_1, i_2) in B_r. Note that the right-hand side of (39) is at most Σ_{r=1}^4 Q_r, and hence it suffices to show that

lim sup_{a→∞} Σ_{r=1}^4 Q_r <= 1 − exp(−t).

It is easy to see that

Q_1 = 1 − Π_k P_∞( max_j min_{1<=l<=2} max_{i_l} ( S_{l,j} − S_{l,i_l} ) / π_l <= a ),

where the product is taken over 0 <= k < tA/m, and the maximum is taken over km + 1 <= i_l <= j <= (k+1)m for l = 1, 2. Thus

Q_1 <= 1 − ( P_∞( M(a) > m ) )^{tA/m}.

By Lemma 5, we have

Q_1 <= 1 − ( 1 − m/A )^{tA/m}.

Note that since m/A → 0 as a → ∞, for given δ > 0, once a is sufficiently large,

1 − m/A >= exp( −(1 + δ) m/A ),

and thus

Q_1 <= 1 − exp( −(1 + δ) t ).

Letting δ → 0, we obtain

lim sup_{a→∞} Q_1 <= 1 − exp(−t).

To complete the proof of Lemma 6, it suffices to show that for all ε > 0, Q_2, Q_3 and Q_4 are smaller than ε for sufficiently large a. We prove this fact for Q_2 in Lemma 7; the proofs for Q_3 and Q_4 are similar.

Lemma 7: Under condition (25) of Theorem 3, for all ε > 0, once a is sufficiently large,

Q_2 = P_∞( max_k max_j min_{1<=l<=2} max_{i_l} ( S_{l,j} − S_{l,i_l} ) / π_l > a ) <= ε,

where the maximum is taken over 0 <= k < tA/m, km + 1 <= j <= (k+1)m, 1 <= i_1 <= km, and km + 1 <= i_2 <= j <= (k+1)m.

Proof: Note that j − i_1 = (j − km) + (km − i_1), and S_{1,j} − S_{1,i_1} equals the sum of the independent random walks S_{1,j} − S_{1,km} and S_{1,km} − S_{1,i_1}. Hence, if {S*_i} is an independent copy of {S_{1,i}}, then

Q_2 <= (tA/m) Σ_{j=1}^m P_∞( max_{0<=i<=tA} S*_i + S_{1,j} > π_1 a and W_{2,j} >= π_2 a )
    = (tA/m) Σ_{j=1}^m P_∞( max_{0<=i<=tA} S*_i + S_{1,j} > π_1 a ) P_∞( W_{2,j} >= π_2 a )
    <= ( t exp(π_1 a) / m ) Σ_{j=1}^m P_∞( max_{0<=i<=tA} S*_i + S_{1,j} > π_1 a ),

using Lemma 3 for W_{2,j}. Now, using Wald's likelihood ratio identity,

P_∞( max_{0<=i<=tA} S*_i + S_{1,j} > π_1 a )
  <= P_∞( S_{1,j} > π_1 a ) + E_∞( exp( S_{1,j} − π_1 a ) ; π_1 a >= S_{1,j} )
  <= E_∞( exp( min(0, S_{1,j} − π_1 a) ) )
  = E_1( exp( −S_{1,j} ) exp( min(0, S_{1,j} − π_1 a) ) ).

Thus,

Q_2 <= (t/m) Σ_{j=1}^m E_1( exp( min( π_1 a − S_{1,j}, 0 ) ) ).

Applying Lemma 8 below to S_{1,j} under P_1, and letting m_1 = a², we have, for sufficiently large a,

sup_{j>=m_1} E_1( exp( min( π_1 a − S_{1,j}, 0 ) ) ) <= ε_1.

Therefore,

Q_2 <= (t/m)( m_1 + (m − m_1) ε_1 ) <= t( m_1/m + ε_1 ),

and the lemma follows, since m/a² → ∞ implies m_1/m → 0 and ε_1 can be made arbitrarily small as a goes to ∞.

Lemma 8: Suppose X_1, X_2, ... are i.i.d. with E(X_i) = μ > 0, Var(X_i) = σ², and E|X_i|³ = ρ < ∞. Let S_n = X_1 + ... + X_n and m_1 = b². Then

sup_{n>=m_1} E( exp( min(b − S_n, 0) ) ) → 0

as b → ∞.

Proof: First we establish

E( exp( min(b − S_n, 0) ) ) <= 3ρ/(σ³ √n) + Φ( (b − nμ)/(σ √n) ) + A( (b − nμ)/(σ √n) + σ √n ) exp( b + (σ²/2 − μ) n ),    (40)

where Φ(x) is the standard Gaussian distribution function and A(x) = 1 − Φ(x) = Φ(−x). Let F_n(x) denote the distribution function of S_n; then

| F_n(x) − Φ( (x − nμ)/(σ √n) ) | <= 3ρ/(σ³ √n)

for any x, by the Berry-Esseen theorem.
Now, integrating by parts,

E( exp( min(b − S_n, 0) ) ) = F_n(b) + ∫_b^∞ exp(b − x) dF_n(x) = ∫_b^∞ F_n(x) exp(b − x) dx
  <= ∫_b^∞ [ 3ρ/(σ³ √n) + Φ( (x − nμ)/(σ √n) ) ] exp(b − x) dx
  = 3ρ/(σ³ √n) + Φ( (b − nμ)/(σ √n) ) + ∫_b^∞ φ( (x − nμ)/(σ √n) ) exp(b − x) (1/(σ √n)) dx,

and completing the square in the last integral yields A( (b − nμ)/(σ √n) + σ √n ) exp( b + (σ²/2 − μ) n ); hence (40) holds. We next bound each term on the right-hand side of (40). For n >= m_1 = b², the first two terms are uniformly bounded by

3ρ/(σ³ b) + Φ( (1 − μ b)/σ ),

which goes to 0 as b → ∞. For the third term on the right-hand side of (40), we need to consider two cases: (1) μ > σ²/2; and (2) μ <= σ²/2. In case (1), note that A(x) <= 1, and so for all n >= m_1 the third term is smaller than

exp( b − (μ − σ²/2) b² ),

which goes to 0 as b → ∞. In case (2), note that A(x) <= φ(x)/x for all x > 0, where φ(x) is the density function of the standard Gaussian distribution (see [26, p. 141]). Thus the third term is smaller than

( σ √n / ( b + (σ² − μ) n ) ) φ( (b − μ n)/(σ √n) ),

which also goes to 0 uniformly for all n >= m_1 as b → ∞. Therefore, Lemma 8 holds.

REFERENCES

[1] M. Basseville and I. Nikiforov, Detection of Abrupt Changes: Theory and Applications. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[2] R. W. Crow and S. C. Schwartz, "Quickest detection for sequential decentralized decision systems," IEEE Trans. Aerosp. Electron. Syst., vol. 32, Jan. 1996.
[3] V. P. Dragalin, "Asymptotics for a sequential selection procedure," Statistics and Decisions, Suppl. Issue no. 4.
[4] V. P. Dragalin, A. G. Tartakovsky, and V. V. Veeravalli, "Multihypothesis sequential probability ratio tests part II: Accurate asymptotic expansions for the expected sample size," IEEE Trans. Inform. Theory, vol. 46, 2000.
[5] W. Feller, An Introduction to Probability Theory and Its Applications, Vol. II. New York: John Wiley & Sons, 1971.
[6] J. Kiefer and J. Sacks, "Asymptotically optimal sequential inference and design," Ann. Math. Statist., vol. 34, pp. 705-750, 1963.
[7] T. L. Lai, "Sequential change-point detection in quality control and dynamical systems," J. Roy. Statist. Soc. Ser. B, vol. 57, pp. 613-658, 1995.
[8] T. L. Lai, "Information bounds and quick detection of parameter changes in stochastic systems," IEEE Trans. Inform. Theory, vol. 44, pp. 2917-2929, 1998.


is a Borel subset of S Θ for each c R (Bertsekas and Shreve, 1978, Proposition 7.36) This always holds in practical applications. Stat 811 Lecture Notes The Wald Consistency Theorem Charles J. Geyer April 9, 01 1 Analyticity Assumptions Let { f θ : θ Θ } be a family of subprobability densities 1 with respect to a measure µ on a measurable

More information

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary

Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics. 1 Executive summary ECE 830 Spring 207 Instructor: R. Willett Lecture 5: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we saw that the likelihood

More information

بسم الله الرحمن الرحيم

بسم الله الرحمن الرحيم بسم الله الرحمن الرحيم Reliability Improvement of Distributed Detection in Clustered Wireless Sensor Networks 1 RELIABILITY IMPROVEMENT OF DISTRIBUTED DETECTION IN CLUSTERED WIRELESS SENSOR NETWORKS PH.D.

More information

2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)?

2. What are the tradeoffs among different measures of error (e.g. probability of false alarm, probability of miss, etc.)? ECE 830 / CS 76 Spring 06 Instructors: R. Willett & R. Nowak Lecture 3: Likelihood ratio tests, Neyman-Pearson detectors, ROC curves, and sufficient statistics Executive summary In the last lecture we

More information

DETECTION theory deals primarily with techniques for

DETECTION theory deals primarily with techniques for ADVANCED SIGNAL PROCESSING SE Optimum Detection of Deterministic and Random Signals Stefan Tertinek Graz University of Technology turtle@sbox.tugraz.at Abstract This paper introduces various methods for

More information

Introduction to Bayesian Statistics

Introduction to Bayesian Statistics Bayesian Parameter Estimation Introduction to Bayesian Statistics Harvey Thornburg Center for Computer Research in Music and Acoustics (CCRMA) Department of Music, Stanford University Stanford, California

More information

SEQUENTIAL CHANGE DETECTION REVISITED. BY GEORGE V. MOUSTAKIDES University of Patras

SEQUENTIAL CHANGE DETECTION REVISITED. BY GEORGE V. MOUSTAKIDES University of Patras The Annals of Statistics 28, Vol. 36, No. 2, 787 87 DOI: 1.1214/95367938 Institute of Mathematical Statistics, 28 SEQUENTIAL CHANGE DETECTION REVISITED BY GEORGE V. MOUSTAKIDES University of Patras In

More information

An Effective Approach to Nonparametric Quickest Detection and Its Decentralized Realization

An Effective Approach to Nonparametric Quickest Detection and Its Decentralized Realization University of Tennessee, Knoxville Trace: Tennessee Research and Creative Exchange Doctoral Dissertations Graduate School 5-2010 An Effective Approach to Nonparametric Quickest Detection and Its Decentralized

More information

False Discovery Rate Based Distributed Detection in the Presence of Byzantines

False Discovery Rate Based Distributed Detection in the Presence of Byzantines IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS () 1 False Discovery Rate Based Distributed Detection in the Presence of Byzantines Aditya Vempaty*, Student Member, IEEE, Priyadip Ray, Member, IEEE,

More information

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA

Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box Durham, NC 27708, USA Testing Simple Hypotheses R.L. Wolpert Institute of Statistics and Decision Sciences Duke University, Box 90251 Durham, NC 27708, USA Summary: Pre-experimental Frequentist error probabilities do not summarize

More information

Consistency of the maximum likelihood estimator for general hidden Markov models

Consistency of the maximum likelihood estimator for general hidden Markov models Consistency of the maximum likelihood estimator for general hidden Markov models Jimmy Olsson Centre for Mathematical Sciences Lund University Nordstat 2012 Umeå, Sweden Collaborators Hidden Markov models

More information

MMSE Dimension. snr. 1 We use the following asymptotic notation: f(x) = O (g(x)) if and only

MMSE Dimension. snr. 1 We use the following asymptotic notation: f(x) = O (g(x)) if and only MMSE Dimension Yihong Wu Department of Electrical Engineering Princeton University Princeton, NJ 08544, USA Email: yihongwu@princeton.edu Sergio Verdú Department of Electrical Engineering Princeton University

More information

Gaussian Estimation under Attack Uncertainty

Gaussian Estimation under Attack Uncertainty Gaussian Estimation under Attack Uncertainty Tara Javidi Yonatan Kaspi Himanshu Tyagi Abstract We consider the estimation of a standard Gaussian random variable under an observation attack where an adversary

More information

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 9, SEPTEMBER

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 9, SEPTEMBER IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 9, SEPTEMBER 2012 4509 Cooperative Sequential Spectrum Sensing Based on Level-Triggered Sampling Yasin Yilmaz,StudentMember,IEEE, George V. Moustakides,SeniorMember,IEEE,and

More information

Lecture Notes in Statistics 180 Edited by P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger

Lecture Notes in Statistics 180 Edited by P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger Lecture Notes in Statistics 180 Edited by P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger Yanhong Wu Inference for Change-Point and Post-Change Means After a CUSUM Test Yanhong Wu Department

More information

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½

Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 1998 Asymptotic Nonequivalence of Nonparametric Experiments When the Smoothness Index is ½ Lawrence D. Brown University

More information

Asynchronous Multi-Sensor Change-Point Detection for Seismic Tremors

Asynchronous Multi-Sensor Change-Point Detection for Seismic Tremors Asynchronous Multi-Sensor Change-Point Detection for Seismic Tremors Liyan Xie, Yao Xie, George V. Moustakides School of Industrial & Systems Engineering, Georgia Institute of Technology, {lxie49, yao.xie}@isye.gatech.edu

More information

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( )

Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio ( ) Mathematical Methods for Neurosciences. ENS - Master MVA Paris 6 - Master Maths-Bio (2014-2015) Etienne Tanré - Olivier Faugeras INRIA - Team Tosca October 22nd, 2014 E. Tanré (INRIA - Team Tosca) Mathematical

More information

Iterative Markov Chain Monte Carlo Computation of Reference Priors and Minimax Risk

Iterative Markov Chain Monte Carlo Computation of Reference Priors and Minimax Risk Iterative Markov Chain Monte Carlo Computation of Reference Priors and Minimax Risk John Lafferty School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 lafferty@cs.cmu.edu Abstract

More information

Additive functionals of infinite-variance moving averages. Wei Biao Wu The University of Chicago TECHNICAL REPORT NO. 535

Additive functionals of infinite-variance moving averages. Wei Biao Wu The University of Chicago TECHNICAL REPORT NO. 535 Additive functionals of infinite-variance moving averages Wei Biao Wu The University of Chicago TECHNICAL REPORT NO. 535 Departments of Statistics The University of Chicago Chicago, Illinois 60637 June

More information

Analysis of DualCUSUM: a Distributed Energy Efficient Algorithm for Change Detection

Analysis of DualCUSUM: a Distributed Energy Efficient Algorithm for Change Detection NCC 29, January 6-8, IIT Guwahati 29 of DualCUSUM: a Distributed Energy Efficient Algorithm for Change Detection Taposh Banerjee and Vinod Sharma Electrical Communication Engineering, Indian Institute

More information

Sequential Change-Point Approach for Online Community Detection

Sequential Change-Point Approach for Online Community Detection Sequential Change-Point Approach for Online Community Detection Yao Xie Joint work with David Marangoni-Simonsen H. Milton Stewart School of Industrial and Systems Engineering Georgia Institute of Technology

More information

Point Process Control

Point Process Control Point Process Control The following note is based on Chapters I, II and VII in Brémaud s book Point Processes and Queues (1981). 1 Basic Definitions Consider some probability space (Ω, F, P). A real-valued

More information

CHANGE POINT PROBLEMS IN THE MODEL OF LOGISTIC REGRESSION 1. By G. Gurevich and A. Vexler. Tecnion-Israel Institute of Technology, Haifa, Israel.

CHANGE POINT PROBLEMS IN THE MODEL OF LOGISTIC REGRESSION 1. By G. Gurevich and A. Vexler. Tecnion-Israel Institute of Technology, Haifa, Israel. CHANGE POINT PROBLEMS IN THE MODEL OF LOGISTIC REGRESSION 1 By G. Gurevich and A. Vexler Tecnion-Israel Institute of Technology, Haifa, Israel and The Central Bureau of Statistics, Jerusalem, Israel SUMMARY

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

ECE 4400:693 - Information Theory

ECE 4400:693 - Information Theory ECE 4400:693 - Information Theory Dr. Nghi Tran Lecture 8: Differential Entropy Dr. Nghi Tran (ECE-University of Akron) ECE 4400:693 Lecture 1 / 43 Outline 1 Review: Entropy of discrete RVs 2 Differential

More information

Towards control over fading channels

Towards control over fading channels Towards control over fading channels Paolo Minero, Massimo Franceschetti Advanced Network Science University of California San Diego, CA, USA mail: {minero,massimo}@ucsd.edu Invited Paper) Subhrakanti

More information

Lecture 12. F o s, (1.1) F t := s>t

Lecture 12. F o s, (1.1) F t := s>t Lecture 12 1 Brownian motion: the Markov property Let C := C(0, ), R) be the space of continuous functions mapping from 0, ) to R, in which a Brownian motion (B t ) t 0 almost surely takes its value. Let

More information

Diversity Performance of a Practical Non-Coherent Detect-and-Forward Receiver

Diversity Performance of a Practical Non-Coherent Detect-and-Forward Receiver Diversity Performance of a Practical Non-Coherent Detect-and-Forward Receiver Michael R. Souryal and Huiqing You National Institute of Standards and Technology Advanced Network Technologies Division Gaithersburg,

More information

P (A G) dp G P (A G)

P (A G) dp G P (A G) First homework assignment. Due at 12:15 on 22 September 2016. Homework 1. We roll two dices. X is the result of one of them and Z the sum of the results. Find E [X Z. Homework 2. Let X be a r.v.. Assume

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

THE potential for large-scale sensor networks is attracting

THE potential for large-scale sensor networks is attracting IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 1, JANUARY 2007 327 Detection in Sensor Networks: The Saddlepoint Approximation Saeed A. Aldosari, Member, IEEE, and José M. F. Moura, Fellow, IEEE

More information

Decentralized decision making with spatially distributed data

Decentralized decision making with spatially distributed data Decentralized decision making with spatially distributed data XuanLong Nguyen Department of Statistics University of Michigan Acknowledgement: Michael Jordan, Martin Wainwright, Ram Rajagopal, Pravin Varaiya

More information

The Shiryaev-Roberts Changepoint Detection Procedure in Retrospect - Theory and Practice

The Shiryaev-Roberts Changepoint Detection Procedure in Retrospect - Theory and Practice The Shiryaev-Roberts Changepoint Detection Procedure in Retrospect - Theory and Practice Department of Statistics The Hebrew University of Jerusalem Mount Scopus 91905 Jerusalem, Israel msmp@mscc.huji.ac.il

More information

Distributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks

Distributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks Distributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks Ying Lin and Biao Chen Dept. of EECS Syracuse University Syracuse, NY 13244, U.S.A. ylin20 {bichen}@ecs.syr.edu Peter

More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses

Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Ann Inst Stat Math (2009) 61:773 787 DOI 10.1007/s10463-008-0172-6 Generalized Neyman Pearson optimality of empirical likelihood for testing parameter hypotheses Taisuke Otsu Received: 1 June 2007 / Revised:

More information

Spectrum Sensing via Event-triggered Sampling

Spectrum Sensing via Event-triggered Sampling Spectrum Sensing via Event-triggered Sampling Yasin Yilmaz Electrical Engineering Department Columbia University New Yor, NY 0027 Email: yasin@ee.columbia.edu George Moustaides Dept. of Electrical & Computer

More information

ECE531 Lecture 6: Detection of Discrete-Time Signals with Random Parameters

ECE531 Lecture 6: Detection of Discrete-Time Signals with Random Parameters ECE531 Lecture 6: Detection of Discrete-Time Signals with Random Parameters D. Richard Brown III Worcester Polytechnic Institute 26-February-2009 Worcester Polytechnic Institute D. Richard Brown III 26-February-2009

More information

Lecture 7. Union bound for reducing M-ary to binary hypothesis testing

Lecture 7. Union bound for reducing M-ary to binary hypothesis testing Lecture 7 Agenda for the lecture M-ary hypothesis testing and the MAP rule Union bound for reducing M-ary to binary hypothesis testing Introduction of the channel coding problem 7.1 M-ary hypothesis testing

More information

Quickest Changepoint Detection: Optimality Properties of the Shiryaev Roberts-Type Procedures

Quickest Changepoint Detection: Optimality Properties of the Shiryaev Roberts-Type Procedures Quickest Changepoint Detection: Optimality Properties of the Shiryaev Roberts-Type Procedures Alexander Tartakovsky Department of Statistics a.tartakov@uconn.edu Inference for Change-Point and Related

More information

n E(X t T n = lim X s Tn = X s

n E(X t T n = lim X s Tn = X s Stochastic Calculus Example sheet - Lent 15 Michael Tehranchi Problem 1. Let X be a local martingale. Prove that X is a uniformly integrable martingale if and only X is of class D. Solution 1. If If direction:

More information

Efficient Robbins-Monro Procedure for Binary Data

Efficient Robbins-Monro Procedure for Binary Data Efficient Robbins-Monro Procedure for Binary Data V. Roshan Joseph School of Industrial and Systems Engineering Georgia Institute of Technology Atlanta, GA 30332-0205, USA roshan@isye.gatech.edu SUMMARY

More information

INFORMATION PROCESSING ABILITY OF BINARY DETECTORS AND BLOCK DECODERS. Michael A. Lexa and Don H. Johnson

INFORMATION PROCESSING ABILITY OF BINARY DETECTORS AND BLOCK DECODERS. Michael A. Lexa and Don H. Johnson INFORMATION PROCESSING ABILITY OF BINARY DETECTORS AND BLOCK DECODERS Michael A. Lexa and Don H. Johnson Rice University Department of Electrical and Computer Engineering Houston, TX 775-892 amlexa@rice.edu,

More information

Lecture 4: Two-point Sampling, Coupon Collector s problem

Lecture 4: Two-point Sampling, Coupon Collector s problem Randomized Algorithms Lecture 4: Two-point Sampling, Coupon Collector s problem Sotiris Nikoletseas Associate Professor CEID - ETY Course 2013-2014 Sotiris Nikoletseas, Associate Professor Randomized Algorithms

More information

Cooperative Spectrum Sensing for Cognitive Radios under Bandwidth Constraints

Cooperative Spectrum Sensing for Cognitive Radios under Bandwidth Constraints Cooperative Spectrum Sensing for Cognitive Radios under Bandwidth Constraints Chunhua Sun, Wei Zhang, and haled Ben Letaief, Fellow, IEEE Department of Electronic and Computer Engineering The Hong ong

More information

Lecture 9. d N(0, 1). Now we fix n and think of a SRW on [0,1]. We take the k th step at time k n. and our increments are ± 1

Lecture 9. d N(0, 1). Now we fix n and think of a SRW on [0,1]. We take the k th step at time k n. and our increments are ± 1 Random Walks and Brownian Motion Tel Aviv University Spring 011 Lecture date: May 0, 011 Lecture 9 Instructor: Ron Peled Scribe: Jonathan Hermon In today s lecture we present the Brownian motion (BM).

More information

Communication constraints and latency in Networked Control Systems

Communication constraints and latency in Networked Control Systems Communication constraints and latency in Networked Control Systems João P. Hespanha Center for Control Engineering and Computation University of California Santa Barbara In collaboration with Antonio Ortega

More information

Lecture 4 Noisy Channel Coding

Lecture 4 Noisy Channel Coding Lecture 4 Noisy Channel Coding I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw October 9, 2015 1 / 56 I-Hsiang Wang IT Lecture 4 The Channel Coding Problem

More information

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539 Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory

More information

EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm

EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm EE/Stats 376A: Homework 7 Solutions Due on Friday March 17, 5 pm 1. Feedback does not increase the capacity. Consider a channel with feedback. We assume that all the recieved outputs are sent back immediately

More information

A CUSUM approach for online change-point detection on curve sequences

A CUSUM approach for online change-point detection on curve sequences ESANN 22 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges Belgium, 25-27 April 22, i6doc.com publ., ISBN 978-2-8749-49-. Available

More information