Stabilization over discrete memoryless and wideband channels using nearly memoryless observations


Anant Sahai

Abstract— We study stabilization of a discrete-time scalar unstable plant over a noisy communication link, where the plant is perturbed at each time by an unknown, but bounded, additive disturbance. While allowing unbounded computational complexity at the decoder/controller, we restrict the observer/encoder to be nearly memoryless in that the channel inputs can only depend on the last sample taken of the plant state. A time-varying randomized encoding scheme is given for discrete memoryless channels, while a time-invariant deterministic scheme is used for a peak-power constrained wideband channel. These schemes are shown to achieve stabilization in an η-moment sense.

[Fig. 1 block diagram: the scalar system X(t), driven by the disturbance W(t−1) and the one-step-delayed control U(t−1), is seen by the observer/encoder O, which feeds the noisy channel; the channel outputs drive the decoder/controller C, which generates the control signals U(t).]

I. INTRODUCTION

The problem of control in the presence of communication constraints is illustrated in figure 1. This is a problem of cooperative distributed control in that there are two different boxes that need to be designed: the observer/encoder, which has access to the plant state and generates an input to the noisy channel, and the controller, which has access to the channel outputs and applies a control to the plant. The reader is referred to two resources for the appropriate background. The first is the classic 1978 paper by Ho, Kastner, and Wong [5], in which they summarize the then-known relationships among team theory, signaling, and information theory from the perspective of automatic control. The more recent interest in the subject is best understood by reading the September 2004 special issue of the IEEE Transactions on Automatic Control and the references contained therein.¹

A.
Sahai is with the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley (sahai@eecs.berkeley.edu).

¹Due to space limitations, I only cite references specifically helpful for understanding the present work, rather than giving the full and proper historical background.

Fig. 1. Control over a noisy communication channel without any explicit feedback path from controller to observer.

Here, we focus exclusively on the case shown in figure 1, where the information patterns [10] are non-nested in that the observer does not know everything the controller knows. This means that there is no explicit feedback of the channel outputs to the observer. Our previous work [7] showed that in the case of finite-alphabet channels, the plant itself could be used as a feedback channel for communication. The controller can make the plant "dance" so as to noiselessly communicate the channel output to the encoder/observer. This showed that, in principle, the non-nested information pattern did not increase the difficulty of the stabilization problem. [7] further showed that the problem of stabilization over a noisy channel is equivalent to a problem of anytime communication over the noisy channel with perfect feedback. While this equivalence certainly illuminates the structure of distributed control problems (see [9] for more details), it does not by itself allow us to solve any particular problem. This is especially true since the previously known generic constructions for anytime codes are based on infinite random trees. As such, they require growing complexity at both the encoder and decoder. This contrasts in spirit with the idea of stabilization: making a system behave in a nice steady-state manner that returns to the same few safe states over and over again. If the closed-loop system is behaving nicely, why should the observer and controller have a steadily increasing workload?

In this paper, we attack this problem by making a structural requirement on the observer/encoder: it must be a nearly memoryless map (though possibly time-varying and random in nature) that picks channel inputs based only on the current state of the plant. As such, we aim to generalize the nicely matched situation that exists when stabilizing a scalar linear plant over an average power-constrained additive white Gaussian noise (AWGN) channel [1]. In that special case, memoryless time-invariant observers (scalar gains) suffice.

In section II, we formally introduce the problems of stabilization and anytime communication and review the existing results from [7] and [8]. In section III, we state and prove the main result of this paper: a sufficient condition for stabilization using nearly memoryless random time-varying encoder/observers over discrete memoryless channels, and a parallel sufficient condition for stabilization using a nearly memoryless time-invariant deterministic encoder/observer for the wideband channel. Finally, in section IV, we comment on the results and the wideband channel construction. In particular, we argue why the power-spectral density of our nominally infinite-bandwidth construction is actually somewhat reasonable.

II. PROBLEM DEFINITION AND REVIEW

A. Stabilization Problem

At the heart of the stabilization problem is a scalar plant that is unstable in open loop.
Expressed in discrete time, we have:

X(t+1) = λX(t) + U(t) + W(t),    t ≥ 0    (1)

where {X(t)} is an ℝ-valued state process, {U(t)} is an ℝ-valued control process, and {W(t)} is a bounded noise/disturbance process with |W(t)| ≤ Ω/2 almost surely. We take λ ≥ 1 so that the system is unstable, and X(0) = 0 for convenience.² The distributed nature of the problem comes from having a noisy communication channel in the feedback loop. We require an observer/encoder system O to observe X(t) and generate inputs A(t) to the channel. We also require a decoder/controller system C to observe the channel outputs B(t) and generate control signals U(t). Unless otherwise specified, we allow both O and C to have arbitrary memory and to be nonlinear in general.

Definition 2.1: A closed-loop dynamic system with state X(t) is η-stable if there exists a constant K such that E[|X(t)|^η] ≤ K for all t ≥ 0.

Holding the η-moment within bounds is a way of keeping large deviations rare.³ The larger η is, the more strongly we penalize very large deviations.

B. Anytime communication

For simplicity of notation, let M_i be the R-bit message that a channel encoder takes in at time i. Based on all the messages received so far, and any additional information (e.g. past channel-output feedback) that it might have, the channel encoder emits the i-th channel input. An anytime decoder provides estimates M̂_i(t), the best estimate for message i at time t. If we are considering a delay d, the probability of error we are interested in is P(M̂_{t−d}(t) ≠ M_{t−d}).

Definition 2.2: The α-anytime capacity C_anytime(α) of a channel is the least upper bound of the rates at which the channel can be used to transmit data so that there exists a uniform constant K such that for all delays d and all times t we have⁴

P(M̂_{t−d}(t) ≠ M_{t−d}) ≤ K 2^{−αd}

The probability is taken over the channel and any randomness that we deem the encoder and decoder to have access to. Due to the fast convergence of an exponential, this is equivalent to requiring that the probability of error for all messages sent up to time t−d be bounded by K′ 2^{−αd}.

²Just start time at 1 if you want an unknown but bounded initial condition.

³[9] shows how to map the results given here to almost-sure stabilization in the undisturbed case when W(t) = 0.

⁴We could alternatively have bounded the probability of error by 2^{−α(d − log₂K)} and interpreted log₂ K as a minimum required known delay.

In [8], we studied the special case of the infinite-bandwidth AWGN channel with an average power constraint but no feedback. Since the channel is continuous-time, it is up to us how to discretize it. We chose a Δ = 1/R > 0 so that exactly one new bit arrives at the encoder every Δ seconds. For that case, we defined a coding strategy called repeated pulse-position modulation, illustrated in figure 2. This coding strategy sends a single pulse with energy E_b whose position within the next Δ seconds represents the value of all the bits received up until this point. The decoding of the bits so far proceeds by maximum likelihood applied to the entire received channel output so far.

Fig. 2. Repeated pulse-position modulation illustrated. The time slots are on the top and the possible sub-slots are on the bottom. [The example waveform shown encodes the growing bit prefixes 0; 0,1; 0,1,0; 0,1,0,1; 0,1,0,1,1 across successive bit-slots.]

Fig. 3. The first few members of a countable set of basic codefunctions on [0, Δ] that are all orthogonal to each other. These represent windowed sinusoids at distinct frequencies.

Theorem 2.3: [8] The repeated pulse-position modulation code under maximum-likelihood decoding for individual bits with delay d achieves the orthogonal-coding error exponent for every delay and bit position:

P(M̂_i(i+d) ≠ M_i) ≤ K 2^{−d E_orth(R)}    (2)

where C = (P/N₀) log₂ e and:

E_orth(R) = C/2 − R        if 0 ≤ R ≤ C/4
          = (√C − √R)²     if C/4 < R < C    (3)
          = 0              otherwise

where P is the average transmit power and N₀ represents the noise intensity. Thus the energy per bit must satisfy E_b > N₀ ln 2 for communication to proceed.

The only property of repeated PPM required to prove this result was captured by this lemma:

Lemma 2.1: [8] The repeated PPM code is semi-orthogonal.
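As a quick numerical aid, the orthogonal-coding exponent E_orth(R) of Theorem 2.3 is easy to evaluate directly. The sketch below is not from the paper; the function and parameter names are my own, and the body is simply equation (3) with C = (P/N₀) log₂ e:

```python
import math

def e_orth(rate, power, n0):
    """Orthogonal-coding error exponent of equation (3).

    c is the infinite-bandwidth AWGN capacity C = (P / N0) * log2(e).
    """
    c = (power / n0) * math.log2(math.e)
    if rate <= c / 4:
        return c / 2 - rate  # straight-line portion at low rates
    if rate < c:
        return (math.sqrt(c) - math.sqrt(rate)) ** 2  # curved portion
    return 0.0               # no positive exponent at or above capacity

# The two branches meet continuously at R = C/4, where both equal C/4.
```

The exponent is strictly positive for every R < C, which is what lets us later pick an R > log₂ λ with E_orth(R) > η log₂ λ whenever the exponent curve sits above η log₂ λ at that rate.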
If a is the waveform corresponding to the bitstream M, and a′ is the waveform corresponding to the bitstream M′, then a restricted to [(j−1)Δ, T] is orthogonal to a′ restricted to [(j−1)Δ, T] whenever there exists a bit position i ≤ j for which M_i ≠ M′_i.

The only property of the time-disjoint pulses that we needed was that they form an orthogonal family of functions. We could just as well use another family of orthogonal functions. For example, consider the frequency-disjoint pulses depicted in figure 3:

g_{i,Δ}(t) = 0                           if t < 0
           = √(2/Δ) sin(4πi t/Δ)         if 0 ≤ t ≤ Δ, i ≥ 0
           = √(2/Δ) sin(2π(−2i−1) t/Δ)   if 0 ≤ t ≤ Δ, i < 0    (4)
           = 0                           if t > Δ

The g_{i,Δ} family is also orthogonal and can be used to make a semi-orthogonal code, with the added advantage of meeting a strict amplitude (or peak-power) constraint.

C. Anytime communication and stabilization

Theorem 2.4: [7] For a given noisy channel, bound Ω, and η > 0, if there exists an observer/encoder O and controller/decoder C for the unstable scalar system that achieves E[|X(t)|^η] < K for all bounded driving noise −Ω/2 ≤ W(t) ≤ Ω/2, then C_anytime(η log₂ λ) ≥ log₂ λ bits per channel use for the noisy channel considered with noiseless feedback.

And for the non-nested information-pattern case:

Theorem 2.5: [7] It is possible to control an unstable scalar process driven by a bounded disturbance over a noisy finite-output-alphabet channel, so that the η-moment of |X(t)| stays finite for all time, if the channel with feedback has C_anytime(α) ≥ log₂ λ for some α > η log₂ λ and the observer is allowed to observe the state X(t) exactly.

The encoders and decoders used in the proofs of these theorems were far from simple. Our goal in the next section will be to consider simplified encoders.

III. MAIN RESULT

We first introduce what we mean by a nearly memoryless observer/encoder. Then we state the main result, followed by a sketch of its proof.

A. Nearly memoryless observer/encoders

By nearly memoryless observers, we mean maps O that sample the plant state every T time units (for some T > 0) and then apply a channel input that depends only on the last such sample.

Definition 3.1: A random nearly memoryless observer is a sequence of maps O_t for which there exist maps O′_t so that:

O_t(X₀ᵗ, Z₀ᵗ) = O′_t(X(T⌊t/T⌋), Z_{⌊t/T⌋})

where the Z_i are i.i.d. continuous uniform random variables on [0, 1] that represent the common randomness available at both the observer/encoder and the controller/decoder.

Definition 3.2: A time-invariant deterministic nearly memoryless observer is a sequence of maps O_t for which there exists a single map O′ so that:

O_t(X₀ᵗ, Z₀ᵗ) = O′(X(T⌊t/T⌋))

B. Main theorems and proof

Theorem 3.3: We can η-stabilize an unstable scalar process driven by a bounded disturbance over a discrete memoryless channel if the channel without feedback has random block-coding error exponent E_r(R) > η log₂ λ for some R > log₂ λ and the observer is only allowed access to the plant state, where

E_r(R) = max_{P_A} max_{ρ∈[0,1]} [ −ρR − log₂ Σ_b ( Σ_a P_A(a) p_{ab}^{1/(1+ρ)} )^{1+ρ} ]

and p_{ab} represents the probability of an a → b transition in the discrete memoryless channel. Furthermore, there exists a T > 0 so that this is possible using a nearly memoryless random encoder/observer consisting of a time-varying random scalar quantizer that samples the state every T time steps and outputs a random label for the bin index. This random label is chosen i.i.d. from the channel-input alphabet A^T according to the distribution that maximizes the random-coding error exponent at R. The controller is assumed to have access to the common randomness used to choose the random bin labels.

And in the case of the wideband channel:

Theorem 3.4: We can η-stabilize an unstable scalar process driven by a bounded disturbance over a continuous-time wideband AWGN channel with average power constraint P if E_orth(R) > η log₂ λ for some R > log₂ λ and the observer is only allowed access to the plant state. Furthermore, there exists a T > 0 so that this is possible using a time-invariant deterministic nearly memoryless encoder/observer consisting of a scalar quantizer Q that samples the state every T time steps and sends the waveform √(PT) g_{Q(X(T⌊t/T⌋)), T}(t − T⌊t/T⌋) into the channel.

Proof of both theorems: Pick a rate R > log₂ λ for which E_r(R) > η log₂ λ. Pick T and Δ large enough so that:

T R > log₂( λ^T + (Ω/Δ) Σ_{i=0}^{T−1} λ^i )

This means that if X(t) is known to be within any given box of size Δ, then with no controls applied, the plant state X(t+T) can be in no more than 2^{TR} adjacent boxes, each of size Δ. It is clear that such a (T, Δ) pair always exists as long as R > log₂ λ.
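The existence of a suitable (T, Δ) pair can also be seen by direct search. The sketch below is illustrative only: the function and parameter names are mine, and it assumes the counting condition in the form T R > log₂(λ^T + (Ω/Δ) Σ_{i=0}^{T−1} λ^i):

```python
import math

def smallest_sampling_period(rate, lam, omega, delta, t_max=10**6):
    """Search for the smallest T whose 2**(T*rate) labels can cover every
    bin of width delta reachable in T uncontrolled steps of
    X(t+1) = lam*X(t) + W(t) with the disturbance spanning width omega."""
    assert rate > math.log2(lam), "need R > log2(lambda)"
    for t in range(1, t_max):
        reachable = lam ** t + (omega / delta) * sum(lam ** i for i in range(t))
        if t * rate > math.log2(reachable):
            return t
    raise RuntimeError("no T found below t_max")
```

With a rate comfortably above log₂ λ, T = 1 already works; with a rate barely above log₂ λ, T must grow so that the per-step rate surplus amortizes the disturbance term, mirroring the "such a (T, Δ) pair always exists" claim in the proof.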

Our quantizer Q will look at the plant state X to a coarseness Δ. It is clear that we satisfy the following:

a. Knowing that X(t) is in a given bin of width Δ, and assuming that no controls are applied, there are at most 2^{TR} possible bins which could contain X(t+T).

b. The descendants of a given bin dT time units later all lie in contiguous bins, and furthermore there exists a constant K such that the total length covered by these bins is at most Kλ^{dT}.

Properties [a] and [b] above easily extend to the case when the control sequence between times t and t+T is known exactly, since linearity tells us that the impact of these controls is just a translation of all the bins by a known amount. Thus, we have:

c. Conditioned on the actual past controls applied, the set of possible paths that the states X(0), X(T), X(2T), ... could have taken through the quantization bins is a subset of a trellis that has a maximum branching factor of 2^{TR}. Furthermore, the total length covered by the d-stage descendants of any particular bin is bounded above by Kλ^{dT}.

Not all paths through the trellis are necessarily possible, but all possible paths do lie within the trellis. Figure 4 shows what such a trellis looks like, and figure 5 shows its tree-like local structure.

In the case of a DMC, the observer/encoder just uses the common randomness to independently assign symbols from A^T as time-varying labels to the bins. The probability measure used to draw the symbols should be the E_r(R)-achieving distribution. Thus, the labels on each bin are i.i.d. both through time and across bins. In the case of the wideband channel, the g_{i,T} function is used, with i being the integer quantization-bin number of the last observed state to a coarseness of Δ. In this case, it is clear that different states result in orthogonal channel inputs.

Call two paths through the trellis disjoint with depth d if their last common node was at depth d and the paths are disjoint after that.
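The time-varying random bin labeling for the DMC case can be sketched as follows. This is a hypothetical helper, not code from the paper: the shared integer seed stands in for the common randomness Z, and the two-symbol alphabet with its weights is an illustrative placeholder for the E_r(R)-achieving input distribution on A^T:

```python
import random

def bin_label(bin_index, stage, block_len, alphabet, weights, seed):
    """Pseudorandom label in A^T for one (bin, stage) pair.

    Seeding a fresh generator with (seed, stage, bin_index) makes every
    label reconstructible from the shared seed alone, so observer and
    controller agree on all labels without any feedback link, while
    distinct (bin, stage) pairs get effectively independent labels --
    a pseudorandom stand-in for the i.i.d. labels in the proof.
    """
    rng = random.Random(hash((seed, stage, bin_index)))
    return tuple(rng.choices(alphabet, weights=weights, k=block_len))

# Observer and controller independently reconstruct the same label:
obs_side = bin_label(7, 3, 4, "01", [0.5, 0.5], seed=2005)
ctrl_side = bin_label(7, 3, 4, "01", [0.5, 0.5], seed=2005)
assert obs_side == ctrl_side and len(obs_side) == 4
```

This is exactly the role the common randomness plays in Theorem 3.3: both boxes can rebuild the entire time-varying labeling, so the controller knows which channel-input block each trellis branch would have produced.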
We immediately observe:

d. If two paths are disjoint in the trellis at a depth of d, then the channel inputs corresponding to the past dT channel uses are independent of each other in the DMC case, and are orthogonal to each other in the case of the wideband channel.

Fig. 4. A short segment of the randomly labeled regular trellis from the point of view of the controller, which knows the actual control signals applied in the past. The example has R = log₂ 3 and λ ≈ 2.4 with Δ large. [Trellis stages shown for t = 0 through t = 4.]

Our controller just searches for the ML path through the trellis. The trellis itself is constructed based on the controller's memory of all past applied controls. Once we have the ML path, a control signal is applied based on the estimated state bin at the end of the ML path. This control signal is designed to drive the center of that box to zero in the next time step.

Fix a time t and consider an error event at depth d. This represents the case that the maximum-likelihood path last intersected the true path dT time steps ago. By property [c] above, our control will be based on a state estimate that can be at most Kλ^{dT} away from the true state. Thus, we have:

e. If an error event at depth d occurs at time t, then |X(t+T)| is smaller than K′λ^{(d+1)T} for some constant K′ that does not depend on d or t.

By property [c], there are no more than 2^{dTR} possible disjoint false paths that last intersected the true path d stages ago. By the memorylessness of the channel, the log-likelihood of each path is the sum of the log-likelihood of the prefix of the path leading up to d stages ago and that of the suffix of the path from that point onward. For a path that is disjoint from the true path at a depth of d to beat all paths that end up at the true final state, the false path must have a suffix log-likelihood that at least beats the suffix log-likelihood of the true path. Property [d] tells us that the channel inputs corresponding to the false paths are pairwise independent of the true inputs over the past dT channel uses in the case of the DMC, and orthogonal to the true inputs in the case of the wideband channel. All we require in order to apply Gallager's random block-coding analysis of Chapter 5 in [4] is such pairwise independence between the true and false codewords for a code of length dT. Similarly, all we require in order to apply Gallager's orthogonal-coding analysis of Chapter 7 in [4] is such pairwise orthogonality. So we have:

f. The probability that the ML path diverges from the true path at depth d is no more than L 2^{−dT E_r(R)} in the case of the DMC, and no more than L 2^{−dT E_orth(R)} in the case of the wideband channel, for some L > 0.

Fig. 5. Locally, the trellis looks like a tree, with the nodes corresponding to the intervals where the state might have been and the levels of the tree corresponding to time. It is not a tree because paths can remerge, but all labels on disjoint paths are chosen so that they are independent of each other.

Taking a normal union bound over all depths d results in a geometric sum which converges, giving us:

g. The probability that the ML path diverges from the true path at depth d is no more than K″ 2^{−dT E_r(R)} for some constant K″ > 0 that does not depend on d or t, and similarly with E_orth for the wideband channel.

All that remains is to analyze the η-moment by combining [g] and [e]:

E[|X(t+T)|^η] ≤ Σ_{d=0}^{⌈t/T⌉} (K″ 2^{−dT E_r(R)}) (K′ λ^{(d+1)T})^η
             ≤ K″ (K′ λ^T)^η Σ_{d=0}^{∞} 2^{−dT E_r(R)} λ^{ηdT}
             = K″ (K′ λ^T)^η Σ_{d=0}^{∞} 2^{−dT (E_r(R) − η log₂ λ)}
             = K‴ < ∞

where the final geometric sum converges since E_r(R) > η log₂ λ. The wideband case follows in the same way with E_orth in place of E_r.

The decoder analysis given here is for ML decoding. However, the classical analysis of Forney in [2] and [3] suggests that an idea corresponding to sequential decoding for a trellis should also function in the decoder/controller. This approximate decoding algorithm will only cost a higher constant factor K in performance, since it achieves the same exponents as ML decoding. Using sequential decoding, the computational effort⁵ expended at the decoder becomes a random variable that does not need to grow with t. Finally, [6] tells us that sequential decoding's computational effort must have a Pareto distribution with an infinite mean as we get close to the channel capacity. However, in the control context we are probably aiming for a reasonably large η, in which case we will be far enough away from capacity to have a finite mean for the computational effort.

⁵The memory use at the decoder/controller will increase with time, as we must store the past channel outputs just in case the sequential decoder needs to backtrack. However, this is not likely to be a problem in modern systems, since we could easily store many minutes of channel outputs without trouble. If the decoder ever has to back up that much, the model would have long since broken down for other reasons.

IV. COMMENTS AND DISCUSSION

It is interesting to consider the actual bandwidth used by the code for the wideband channel. In [8], the refining PPM code actually ended up using
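The geometric series bounding the η-moment can be checked numerically. In the sketch below the exponent values and constants are illustrative stand-ins; it sums the terms (K″2^{−dT E_r(R)})(K′λ^{(d+1)T})^η from the proof and compares them against the closed form of the geometric series:

```python
import math

def moment_bound_truncated(er, eta, lam, T, k_err, k_state, terms=200):
    """Partial sum of sum_d (k_err * 2**(-d*T*er)) * (k_state*lam**((d+1)*T))**eta."""
    assert er > eta * math.log2(lam), "need E_r(R) > eta * log2(lambda)"
    return sum(
        k_err * 2 ** (-d * T * er) * (k_state * lam ** ((d + 1) * T)) ** eta
        for d in range(terms)
    )

def moment_bound_closed(er, eta, lam, T, k_err, k_state):
    """Closed form: first term / (1 - ratio), with ratio 2**(-T*(er - eta*log2(lam)))."""
    ratio = 2 ** (-T * (er - eta * math.log2(lam)))
    return k_err * (k_state * lam ** T) ** eta / (1 - ratio)

# With er = 2, eta = 1, lam = 2, T = 1 the ratio is 1/2 and the bound is 4.
```

The ratio between consecutive terms is exactly 2^{−T(E_r(R) − η log₂ λ)}, which is why the condition E_r(R) > η log₂ λ is precisely what makes the moment bound finite.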

an infinite amount of bandwidth in a nontrivial way. However, in the control context, we make the following observation: if the system is stable, the state steadily returns to the neighborhood of the origin. The channel inputs corresponding to those bins all have only low frequencies in them. Thus, the probability of transmitting a signal with a high-frequency component in it is bounded in a manner similar to the state. By Markov's inequality, we have:

P(|X| > x) = P(|X|^η > x^η) ≤ E[|X|^η] x^{−η} ≤ K x^{−η}

By the construction of g_{i,T}, we have that:

P(f_t > f) ≤ P(|X(t)| > fTΔ/2) ≤ K′ f^{−η}

where f_t is the frequency transmitted at time t. This tells us that if we want the power spectral density⁶ of A(t) to die off as 1/f², then all we need to do is ensure that the plant state X has a finite second moment. When viewed on a log-log plot, the power spectrum will have a tail that looks roughly linear in the low-power, high-frequency part. This suggests that variations of such a control scheme might actually be practical, since truly infinite bandwidth is never available in practice. In fact, for control systems involving wireless links in an ultrawideband scenario, the potentially-large-bandwidth but hard-amplitude-constraint model used here is probably much more realistic than the usual linear Gaussian model.

⁶No such power spectral density truly exists, because A(t) is non-stationary due to the period T. However, this is only a formality that can be eliminated by considering A(t) as an approximately cyclostationary process and using the appropriate technical machinery.

REFERENCES

[1] R. Bansal and T. Basar, "Simultaneous design of measurement and control strategies for stochastic systems with feedback," Automatica, vol. 25, no. 5, pp. 679–694, 1989.
[2] G.D. Forney, Jr., "Convolutional codes II: Maximum-likelihood decoding," Information and Control, vol. 25, pp. 222–266, 1974.
[3] G.D. Forney, Jr., "Convolutional codes III: Sequential decoding," Information and Control, vol. 25, pp. 267–297, 1974.
[4] R.G. Gallager, Information Theory and Reliable Communication. New York, NY: John Wiley and Sons, 1971.
[5] Y.C. Ho, M.P. Kastner, and E. Wong, "Teams, signaling, and information theory," IEEE Transactions on Automatic Control, pp. 305–312, April 1978.
[6] I.M. Jacobs and E.R. Berlekamp, "A lower bound to the distribution of computation for sequential decoding," IEEE Transactions on Information Theory, vol. 13, pp. 167–174, April 1967.
[7] A. Sahai, "The necessity and sufficiency of anytime capacity for control over a noisy communication link," Proceedings of the 43rd IEEE Conference on Decision and Control, December 2004.
[8] A. Sahai, "Anytime coding on the infinite bandwidth AWGN channel: A sequential semi-orthogonal code," Proceedings of the 39th Annual Conference on Information Sciences and Systems, March 2005.
[9] A. Sahai and S.K. Mitter, "The necessity and sufficiency of anytime capacity for control over a noisy communication link: Part I," to be submitted to the IEEE Transactions on Information Theory, April 2005.
[10] H. Witsenhausen, "Separation of estimation and control for discrete time systems," Proceedings of the IEEE, vol. 59, no. 11, November 1971.

V. ACKNOWLEDGMENTS

The author thanks students Amirali Zohrenejad and Paul S. Liu for their initial simulations and study of the λ = 1 case with random memoryless encoders.