Proc. 27th Asilomar Conf. Signals, Syst. Comput., Pacific Grove, CA, Nov. 1993

Blind Deconvolution of Discrete-Valued Signals

Ta-Hsin Li
Department of Statistics, Texas A&M University, College Station, Texas 77843

Abstract

This paper shows that when the input signal to a linear system is discrete-valued, the blind deconvolution problem of simultaneously estimating the system and recovering the input can be solved more efficiently by taking into account the discreteness of the input signal. Two situations are considered. One deals with noiseless data by an inverse-filtering procedure which minimizes a cost function that measures the discreteness of the output of an inverse filter. For noisy data observed from FIR systems, the Gibbs sampling approach is employed to simulate the posteriors of the unknowns under the assumption that the input signal is a Markov chain. It is shown that in the noiseless case the method leads to a highly efficient estimator for parametric systems, so that the estimation error decays exponentially as the sample size grows. The Gibbs sampling approach also provides rather precise results for noisy data, even if the initial and transition probabilities of the input signal and the variance of the noise are completely unknown.

1. Introduction

Blind deconvolution in general deals with the simultaneous estimation of a linear system {s_j} and reconstruction of its random input {x_t} on the basis of the data {y_t} obtained from the convolution

    y_t = Σ_{j=−∞}^{∞} s_j x_{t−j}.    (1)

Partial information about the statistical properties of {x_t} is usually required in order to obtain a sensible solution. It is evident that how well the knowledge of {x_t} can be incorporated into the solution plays an important role in this problem.

The current paper is concerned with a special problem of blind deconvolution in which the input signal takes discrete values from a known alphabet, a typical situation encountered frequently in digital communications [8]. Based on the inverse-filtering approach, a cost function is employed to measure the closeness of the filtered data to a discrete-valued sequence and is minimized to obtain an estimate of the unknown system. In the parametric case, where the system is characterized by a finite-dimensional parameter (e.g., ARMA models), the method is proved to yield highly efficient estimates, so that the estimation error may decay exponentially as the sample size grows. When the data {y_t} are contaminated by Gaussian white noise and the system {s_j} has finite length (FIR), the current paper shows that the Gibbs sampling procedure can be used to deal with the estimation of {s_j} and {x_t} under the assumption that {x_t} is a Markov chain [2]. This method presents an avenue to incorporate colored input signals into the blind deconvolution problem. All these results provide yet another piece of evidence that a digital signal is capable of resisting distortion and contamination if its discreteness can be judiciously utilized in the restoration procedure.
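For concreteness, data from the model (1) with a discrete-valued input can be mimicked in a few lines. The following is a minimal sketch, not from the paper; the alphabet and channel taps are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical instance of model (1): a binary input passed through a
# short FIR channel.  Alphabet and taps are invented for illustration.
n = 1000
alphabet = np.array([0.0, 1.0])          # known input alphabet A
x = rng.choice(alphabet, size=n)         # discrete-valued input {x_t}
s = np.array([1.0, -0.6, 0.2])           # channel {s_j}
y = np.convolve(s, x)[:n]                # observed data {y_t}
```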
2. An Inverse Filtering Procedure

When {x_t} is an i.i.d. sequence of zero-mean random variables and {s_j} is a minimum-phase ARMA(p, q) system, so that

    Σ_{j=0}^{p} a*_j y_{t−j} = Σ_{j=0}^{q} b*_j x_{t−j}    (2)

with a*_0 = b*_0 = 1, the classical least-squares method [1] provides a solution to the problem by seeking the coefficients {a_1, …, a_p, b_1, …, b_q} that minimize the sample variance of the linear prediction error {u_t} given by

    u_t = Σ_{j=0}^{p} a_j y_{t−j} − Σ_{j=1}^{q} b_j u_{t−j}.    (3)

Since the method approximates maximum likelihood estimation by ignoring the end-point effect, it is not surprising that minimizing the sample variance of {u_t} leads to asymptotically efficient estimates [1] for the ARMA system (2).
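In code, the prediction-error recursion (3) might be sketched as follows (illustrative only; `a` and `b` are candidate coefficient vectors, with a_0 = 1, and pre-sample values are taken to be zero):

```python
import numpy as np

def prediction_error(y, a, b):
    """Prediction errors u_t of (3) for candidate ARMA coefficients
    a = [1, a_1, ..., a_p] and b = [b_1, ..., b_q]; values before the
    start of the sample are taken to be zero."""
    p, q = len(a) - 1, len(b)
    u = np.zeros(len(y))
    for t in range(len(y)):
        ar = sum(a[j] * y[t - j] for j in range(p + 1) if t - j >= 0)
        ma = sum(b[j - 1] * u[t - j] for j in range(1, q + 1) if t - j >= 0)
        u[t] = ar - ma
    return u

# Least squares seeks (a, b) minimizing np.var(prediction_error(y, a, b)).
```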

An alternative to least squares is the method of moments. Although computationally appealing, it does not, however, provide efficient estimates except for pure AR systems. The variance of the estimates in both methods is usually proportional to the reciprocal of the sample size, i.e., O(1/n).

To generalize the idea of least squares, it is crucial to observe that {u_t} in (3) is the output of an inverse filter corresponding to the ARMA system in (2). Therefore, the least-squares method calls for the minimization of the variance of the output sequence obtained by filtering the data {y_t} with an inverse filter. For an arbitrary parametric system with s_j = s_j(θ*) in (1), one may consider the output sequence

    u_t(θ) = Σ_{j=−∞}^{∞} s⁻¹_j(θ) y_{t−j}    (4)

where {s⁻¹_j(θ)} denotes the inverse of {s_j(θ)}. Since the variance alone is no longer sufficient for the discrimination of nonminimum-phase systems, higher-order moments of {u_t} have to be involved in the selection of optimal filters [3], [4], [5], [9], [10]. Minimization of E(|u_t|^k − r_k)², with r_k = E(|x_t|^{2k})/E(|x_t|^k) and k > 1, for example, was suggested in [5], whereas maximization of |c_k(u_t)|/(c_2(u_t))^{k/2}, with c_k(u_t) being the k-th order cumulant of u_t and k > 2, was discussed in [3], [4]. The stationarity of {x_t} is a crucial requirement in all these procedures, and many of them further require some moments of {x_t} to be available. The estimation accuracy of these procedures is usually O(1/n).

This accuracy limit, however, can be significantly improved when the discreteness of the input signal is taken into account. In fact, for an m-ary signal whose alphabet is A = {a_i, i = 1, …, m}, a highly efficient estimator can be obtained by minimizing

    Ĵ_n(θ) = (1/(2n+1)) Σ_{t=−n}^{n} Π_{i=1}^{m} |û_t(θ) − a_i|²,    (5)

where {û_t(θ)} results from the inverse filtering

    û_t(θ) = Σ_{j=−n}^{n} s⁻¹_{t−j}(θ) y_j    (6)

using only the observed data {y_t, t = −n, …, n} (with y_t = 0 assumed for all |t| > n). This criterion measures the closeness of {û_t(θ)} to an A-valued discrete sequence. It can be shown [6], [7] that the minimizer of Ĵ_n(θ), denoted by θ̂_n, is a consistent estimator of the true parameter θ* and, more importantly, that the estimation error ‖θ̂_n − θ*‖ is bounded by the tail behavior of the true inverse system, so that

    ‖θ̂_n − θ*‖ ≤ c Σ_{|j|≥n} |s⁻¹_j(θ*)|    (7)

where c > 0 is a constant. For ARMA systems, this implies that the error of θ̂_n decays as an exponential function of the sample size n rather than as its square-root reciprocal. In other words, minimization of Ĵ_n(θ) would produce "super-efficient" estimates for the blind deconvolution problem. If the system is autoregressive with finite order, the super-efficiency yields

    lim_{n→∞} Pr(θ̂_n = θ*) = 1.

In other words, the minimizer of Ĵ_n(θ) would be equal to the true value of the parameter with probability tending to unity as n increases. It is also important to point out that all these results can be obtained without requiring the x_t to have the same distribution, as long as they are independent [6], [7]. Therefore the super-efficiency applies even to nonstationary signals.

To demonstrate these results, let us consider a simple nonminimum-phase MA(2) system [6]

    y_t = −1.5 x_t + 3.5 x_{t−1} − x_{t−2}

where {x_t} is a binary sequence with Pr(x_t = 0) = p_t and Pr(x_t = 1) = 1 − p_t. For the general MA(2) model y_t = b_0 x_t + b_1 x_{t−1} + b_2 x_{t−2}, we assume b_0 + b_1 + b_2 = 1 and reparametrize the resulting two-parameter system by the zeros of the polynomial b_0 z² + b_1 z + b_2, denoted θ = (z_1, z_2). Therefore, in this example, θ* = (z*_1, z*_2) = (1/3, 2).
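As a sketch of how (5) might be evaluated for this example (illustrative code, not the paper's; here the two-sided truncated inverse filter (6) is replaced by a minimum-norm least-squares deconvolution of the finite record, a simplification that remains stable for nonminimum-phase systems):

```python
import numpy as np

def ma2_coeffs(z1, z2):
    """MA(2) coefficients with zeros z1, z2, scaled so b0 + b1 + b2 = 1."""
    b = np.array([1.0, -(z1 + z2), z1 * z2])   # monic (z - z1)(z - z2)
    return b / b.sum()

def J_n(theta, y, alphabet):
    """Discreteness criterion (5), with the inverse filtering (6)
    approximated by a minimum-norm least-squares deconvolution."""
    b = ma2_coeffs(*theta)
    n, q = len(y), 2
    B = np.zeros((n, n + q))                   # (B u)_t = sum_j b_j u_{t-j}
    for j, bj in enumerate(b):
        B[np.arange(n), np.arange(n) + q - j] = bj
    u = np.linalg.lstsq(B, y, rcond=None)[0]   # approximate u_t(theta)
    return np.mean(np.prod((u[:, None] - alphabet) ** 2, axis=1))
```

Evaluating J_n over a grid of candidate zeros (z_1, z_2) and plotting the contours gives the kind of comparison shown in Figs. 1-4 below.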
To compare with other methods which make no use of the discreteness of the input signal, we consider the well-known procedure of maximizing the standardized skewness [3], [4], [9]

    Ŝ_n(θ) = |ĉ_3(û_t)| / (ĉ_2(û_t))^{3/2}

where û_t = û_t(θ) is the output of the inverse filter in (6) and ĉ_k(û_t) is the k-th order sample cumulant of û_t. Two cases are considered: in Case 1 the input signal {x_t} is stationary with p_t = 0.5 for all t, while in Case 2 it is nonstationary with p_t = Φ(sin(tπ/18)), where Φ(·) is the distribution function of the standard normal random variable. In both cases a random sample of size n = 1000 is used in the computation of Ĵ_n(θ) and Ŝ_n(θ), and the contour plots of these criteria are presented in Figures 1-4.

Fig. 1. Contour of Ĵ_n(θ): stationary case.
Fig. 2. Contour of Ŝ_n(θ): stationary case.
Fig. 3. Contour of Ĵ_n(θ): nonstationary case.
Fig. 4. Contour of Ŝ_n(θ): nonstationary case.

As we can see from Figs. 1 and 3, the "binariness" criterion Ĵ_n(θ) has a very sharp valley near the true value θ* (indicated by +) in both the stationary and the nonstationary case. This implies that minimizing Ĵ_n(θ) will produce very precise estimates for both stationary and nonstationary input signals. On the other hand, the standardized skewness Ŝ_n(θ) has a rather broad peak near θ* in the stationary case (Fig. 2). Although a solution to the deconvolution problem is provided by maximizing Ŝ_n(θ), the broad peak shown in Fig. 2 may yield inaccurate estimates of θ*. To make things even worse, the peak completely disappears in Ŝ_n(θ) for the nonstationary signal (Fig. 4). This reveals how crucial stationarity may be to the successful implementation of procedures like maximization of the standardized skewness. It is evident that the advantage of Ĵ_n(θ) comes primarily from its utilization of the discreteness of the input signal.

3. A Gibbs Sampling Procedure

Suppose {s_j} in (1) is an FIR system operated in a noisy environment, so that {y_t} is obtained from

    y_t = Σ_{j=0}^{q} θ_j x_{t−j} + ε_t    (8)

where {ε_t} is Gaussian white noise with unknown variance σ². For the input signal, we assume that {x_t} is a first-order Markov chain with state space A, unknown initial probabilities π_i = Pr(x_{1−q} = a_i), and unknown transition probabilities π_ij = Pr(x_t = a_j | x_{t−1} = a_i). The blind deconvolution (or restoration) problem becomes the joint estimation of all the unknown parameters θ = [θ_0, …, θ_q]ᵀ, π = {π_i, π_ij}, and σ², and the recovery of the unknown input x = {x_{1−q}, …, x_n}, solely from a finite data set y = {y_1, …, y_n}.
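For concreteness, data from the model (8) with a Markov-chain input can be simulated as follows (a sketch; the alphabet, transition matrix, channel taps, and noise level are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical instance of model (8): an FIR channel driven by a
# first-order Markov-chain input plus Gaussian white noise.
A = np.array([-1.0, 1.0])                  # state space (sorted)
P = np.array([[0.9, 0.1],                  # transition probabilities pi_ij
              [0.2, 0.8]])
theta = np.array([1.0, 0.5, -0.3])         # FIR coefficients, q = 2
sigma = 0.1                                # noise standard deviation

n, q = 500, len(theta) - 1
states = np.empty(n + q, dtype=int)
states[0] = rng.choice(len(A))             # initial state x_{1-q}
for t in range(1, n + q):
    states[t] = rng.choice(len(A), p=P[states[t - 1]])
x = A[states]                              # input x_{1-q}, ..., x_n

# y_t = sum_{j=0}^{q} theta_j x_{t-j} + eps_t,  t = 1, ..., n
y = np.array([theta @ x[t + q - np.arange(q + 1)] for t in range(n)])
y += rng.normal(0.0, sigma, n)
```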

It should be pointed out that most of the previously mentioned methods of blind deconvolution do not directly apply to this situation, since the input signal {x_t} is colored and its moments are unknown. To deal with this problem, Chen and this author have recently combined the Bayesian approach with a Gibbs sampling procedure [2]. The gist of the method can be summarized as follows.

Upon regarding all the unknowns as independent random variables/vectors, a multivariate Gaussian distribution and an inverse chi-square distribution are used as priors for θ and σ², respectively, so that θ ~ N(θ_0, Σ_0) and σ² ~ χ⁻²(ν, λ) (i.e., νλ/σ² ~ χ²(ν)). Dirichlet distributions are employed as priors for the π's, namely

    (π_1, …, π_m) ~ D(α_1, …, α_m)  and  (π_i1, …, π_im) ~ D(α_i1, …, α_im),

so that p(π_1, …, π_m) ∝ Π_i π_i^{α_i} with Σ_i π_i = 1, and p(π_i1, …, π_im) ∝ Π_j π_ij^{α_ij} with Σ_j π_ij = 1. Selection of the parameters in these priors reflects the a priori information about the unknowns. For instance, small values of ν and λ, or large variances in Σ_0, correspond to less informative priors suitable for situations where information about θ and σ² is limited. Jeffreys' noninformative Dirichlet prior for (π_1, …, π_m) corresponds to α_i = −1/2, while in general α_i > −1.

According to the Bayesian approach, one is interested in seeking the conditional expectation E(x_t | y) or the mode of the conditional probability p(x_t | y), for instance, as estimates of x_t. The difficulty is that any direct computation of these estimates seems impossible because of the complexity of the problem (more unknowns than observations). Alternatively, one may employ the Monte Carlo method with a Gibbs sampler. The idea of Gibbs sampling is to construct a Markov chain by recursively generating random samples from the conditional posterior distribution of an individual unknown (or a subset of the unknowns) given the data y and the remaining unknowns. This procedure continues until the sampling Markov chain converges in distribution. In this case, the random samples generated by the Gibbs sampler can be regarded as ergodic samples from the joint posterior distribution p(x, θ, π, σ² | y), so the simple average of the x_t components and the maximum relative frequency of x_t = a_i obtained from these samples, for example, will approximate the conditional expectation (MMSE estimator) E(x_t | y) and the MAP estimator mode{p(x_t | y)}, respectively.

It is not too difficult to derive for the Gibbs sampler the conditional posterior distributions of the unknowns in our problem. As a matter of fact, it can be shown [2] that the conditional posterior distribution of θ given y and the remaining unknowns is Gaussian with mean vector θ_1 and covariance matrix Σ_1, i.e.,

    p(θ | rest, y) ~ N(θ_1, Σ_1)

where

    Σ_1⁻¹ = Σ_{t=1}^{n} x_t x_tᵀ / σ² + Σ_0⁻¹  and  θ_1 = Σ_1 ( Σ_{t=1}^{n} x_t y_t / σ² + Σ_0⁻¹ θ_0 )

with x_t = [x_t, …, x_{t−q}]ᵀ denoting the vector of the q + 1 most recent inputs. Similarly, it can be shown [2] that

    p(σ² | rest, y) ~ χ⁻²(ν + n, νλ + s²),
    p(π_1, …, π_m | rest, y) ~ D(α_1 + δ_1, …, α_m + δ_m),
    p(π_i1, …, π_im | rest, y) ~ D(α_i1 + n_i1, …, α_im + n_im),

where s² = Σ_{t=1}^{n} (y_t − Σ_{j=0}^{q} θ_j x_{t−j})², n_ij = #{t : (x_{t−1}, x_t) = (a_i, a_j)} is the number of transitions from a_i to a_j, and δ_i = 1 if x_{1−q} = a_i and δ_i = 0 if x_{1−q} ≠ a_i. For any fixed t_0 ∈ {1−q, …, n}, the conditional posterior distribution of x_{t_0} can be expressed as

    Pr(x_{t_0} = a_i | rest, y) ∝ p(x′ | π) exp(−s′² / (2σ²))

where x′ = {x′_{1−q}, …, x′_n} with x′_{t_0} = a_i and x′_t = x_t for t ≠ t_0, and s′² = Σ_{t=1}^{n} (y_t − Σ_{j=0}^{q} θ_j x′_{t−j})². Note that under the Markovian assumption on {x_t} we have p(x | π) = (Π_i π_i^{δ_i})(Π_{i,j} π_ij^{n_ij}).
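A single sweep of the sampler, drawing each block from the conditionals above, might be sketched as follows. This is illustrative code under the setup of the simulation above, not the authors' implementation, and it is written for clarity rather than speed; note that numpy's Dirichlet sampler parameterizes the density with exponents α − 1, hence the +1 offsets relative to the exponent convention used here. The prior quantities (theta0, Sigma0_inv, nu, lam, alpha0, alphaT) are assumed given.

```python
import numpy as np

def gibbs_sweep(y, x, theta, sigma2, A, q,
                theta0, Sigma0_inv, nu, lam, alpha0, alphaT, rng):
    """One Gibbs sweep: draw theta, sigma^2, the pi's, and each x_t from
    their conditional posteriors.  x has length n+q and stores the values
    x_{1-q}, ..., x_n; the alphabet A is assumed sorted ascending."""
    n, m = len(y), len(A)

    # Regressor vectors x_t = [x_t, ..., x_{t-q}]^T for t = 1, ..., n
    X = np.array([x[t + q - np.arange(q + 1)] for t in range(n)])

    # 1. theta | rest ~ N(theta1, Sigma1)
    Sigma1 = np.linalg.inv(X.T @ X / sigma2 + Sigma0_inv)
    theta1 = Sigma1 @ (X.T @ y / sigma2 + Sigma0_inv @ theta0)
    theta = rng.multivariate_normal(theta1, Sigma1)

    # 2. sigma^2 | rest: conjugate draw with (nu*lam + s^2)/sigma^2 ~ chi2(nu+n)
    s2 = np.sum((y - X @ theta) ** 2)
    sigma2 = (nu * lam + s2) / rng.chisquare(nu + n)

    # 3. pi | rest: Dirichlet updates from the initial-state indicator
    #    delta_i and the transition counts n_ij
    states = np.searchsorted(A, x)
    counts = np.zeros((m, m))
    np.add.at(counts, (states[:-1], states[1:]), 1)
    pi0 = rng.dirichlet(alpha0 + np.eye(m)[states[0]] + 1)
    P = np.vstack([rng.dirichlet(alphaT[i] + counts[i] + 1) for i in range(m)])

    # 4. x_{t0} | rest, one site at a time:
    #    Pr(x_{t0} = a_i) proportional to p(x'|pi) exp(-s'^2 / (2 sigma^2))
    for t0 in range(n + q):
        logp = np.empty(m)
        for i in range(m):
            xp = x.copy()
            xp[t0] = A[i]
            sp = np.searchsorted(A, xp)
            cnt = np.zeros((m, m))
            np.add.at(cnt, (sp[:-1], sp[1:]), 1)
            Xp = np.array([xp[t + q - np.arange(q + 1)] for t in range(n)])
            logp[i] = (np.log(pi0[sp[0]]) + np.sum(cnt * np.log(P))
                       - np.sum((y - Xp @ theta) ** 2) / (2 * sigma2))
        w = np.exp(logp - logp.max())
        x[t0] = A[rng.choice(m, p=w / w.sum())]

    return x, theta, sigma2, pi0, P
```

Iterating this sweep and retaining the draws after convergence provides the ergodic samples from p(x, θ, π, σ² | y) described above.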
As an example of the Gibbs sampling procedure, let us consider the MA(3) system

    y_t = −0.18 x_t + 0.91 x_{t−1} + 0.81 x_{t−2} − 0.198 x_{t−3} + ε_t

where {x_t} is a four-level Markov chain with A = {−2, −1, 1, 2}, initial probabilities π_i = 1/4, and a fixed 4×4 matrix of transition probabilities [π_ij]. A realization of {x_t} with n = 100 is shown in Fig. 5(a) and the corresponding {y_t} in Fig. 5(b).
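From retained draws of x produced by sweeps of the sampler sketched above (collected, say, as the rows of a hypothetical array `X_draws`), the MMSE and MAP estimates reported below in Figs. 5(d) and 5(e) can be approximated as follows (illustrative):

```python
import numpy as np

def posterior_estimates(X_draws, A):
    """MMSE and MAP estimates of each x_t from retained Gibbs draws.
    X_draws has one row per retained sweep; A is the input alphabet."""
    x_mmse = X_draws.mean(axis=0)            # approximates E(x_t | y)
    # Relative frequency of each alphabet value at each t
    freq = np.stack([(X_draws == a).mean(axis=0) for a in A])
    x_map = A[np.argmax(freq, axis=0)]       # approximates mode of p(x_t | y)
    return x_mmse, x_map
```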

The sample variance of {ε_t} is adjusted so that the signal-to-noise ratio in {y_t} equals 1 dB. The parameters in the prior distributions are chosen as follows: θ_0 = 0, Σ_0 = 1000 I, small values for ν and λ, and α_i = α_ij = 1. Fig. 5(c) shows the i.i.d. uniform initial guess for {x_t} used by the Gibbs sampler, and Figs. 5(d) and 5(e) present the conditional mean and mode of x_t given y, i.e., E(x_t | y) and mode{p(x_t | y)}, respectively, calculated from the last several hundred samples of the total 1000 iterations of Gibbs sampling. Constraints requiring θ_1 to be positive and to exceed the remaining |θ_i| by a fixed margin are used to remove the sign and shift ambiguities in the solution. Estimates of θ and π_ij are given in the form E(· | y) ± √V(· | y): for θ the posterior means are (−0.19, 0.89, 0.9, −0.181), with posterior standard deviations of roughly 0.01 to 0.02, and the posterior standard deviations of the estimated transition probabilities are on the order of 0.1.

It is evident, by comparing Figs. 5(d) and 5(e) with Fig. 5(a), that the MAP estimator mode{p(x_t | y)} completely recovers the input signal {x_t} from the noisy data, while the recovery by the MMSE estimator E(x_t | y) is almost complete except for the last point, even though the sample size is relatively small. The estimates of the system parameters and the transition probabilities are reasonably accurate given that n is merely 100. This demonstrates again the impact of the discreteness of input signals on the improvement of blind deconvolution solutions.

Fig. 5. Deconvolution by Gibbs sampling: (a) x_t (Markov-chain input); (b) y_t (noisy data, 1 dB SNR); (c) i.i.d. uniform initial guess for x_t; (d) E(x_t | y); (e) mode of p(x_t | y).

References

[1] P.J. Brockwell and R.A. Davis, Time Series: Theory and Methods, 2nd Ed., New York: Springer, 1991.
[2] R. Chen and T.H. Li, "Blind restoration of linearly degraded discrete signals by Gibbs sampler," Tech. Rep., Dept. of Statistics, Texas A&M University, College Station, 1993.
[3] Q. Cheng, "Maximum standardized cumulant deconvolution of non-Gaussian processes," Ann. Statist., vol. 18, pp. 1774-1783, 1990.
[4] D. Donoho, "On minimum entropy deconvolution," in Applied Time Series Analysis II, D. Findley, Ed., New York: Academic, 1981.
[5] D.N. Godard, "Self-recovering equalization and carrier tracking in two-dimensional data communication systems," IEEE Trans. Commun., vol. COM-28, pp. 1867-1875, Nov. 1980.
[6] T.H. Li, "Blind identification and deconvolution of linear systems driven by binary random sequences," IEEE Trans. Inform. Theory, vol. 38, pp. 26-38, Jan. 1992.
[7] T.H. Li, "Blind deconvolution of linear systems with nonstationary multilevel inputs," Proc. IEEE Signal Processing Workshop on Higher-Order Statistics, S. Lake Tahoe, CA, June 1993.
[8] J.G. Proakis, Digital Communications, 2nd Ed., New York: McGraw-Hill, 1989.
[9] O. Shalvi and E. Weinstein, "New criteria for blind deconvolution of nonminimum phase systems (channels)," IEEE Trans. Inform. Theory, vol. 36, pp. 312-321, Mar. 1990.
[10] J.K. Tugnait, "Inverse filter criteria for estimation of linear parametric models using higher order statistics," Proc. ICASSP-91, pp. 3101-3104, 1991.