REAL-TIME TIME-FREQUENCY BASED BLIND SOURCE SEPARATION. Scott Rickard, Radu Balan, Justinian Rosca. Siemens Corporate Research Princeton, NJ 08540

{scott.rickard,radu.balan,justinian.rosca}@scr.siemens.com

ABSTRACT

We present a real-time version of the DUET algorithm for the blind separation of any number of sources using only two mixtures. The method applies when sources are W-disjoint orthogonal, that is, when the supports of the windowed Fourier transforms of any two signals in the mixture are disjoint sets, an assumption which is justified in the Appendix. The online algorithm is a Maximum Likelihood (ML) based gradient search method that is used to track the mixing parameters. The estimates of the mixing parameters are then used to partition the time-frequency representation of the mixtures to recover the original sources. The technique is valid even in the case when the number of sources is larger than the number of mixtures. The method was tested on mixtures generated from different voices and noises recorded from varying angles in both anechoic and echoic rooms. In total, hundreds of mixtures were tested. The average SNR gain of the demixing was 10 dB for anechoic room mixtures and 5 dB for echoic office mixtures. The algorithm runs many times faster than real time on a 750 MHz laptop computer. Sample sound files can be found here: http://www.princeton.edu/~srickard/bss.html

1. INTRODUCTION

In [1] a blind source separation technique was introduced that allows the separation of an arbitrary number of sources from just two mixtures, provided the time-frequency representations of the sources do not overlap. The key observation in the technique is that, for mixtures of such sources, each time-frequency point depends on at most one source and its associated mixing parameters. In anechoic environments, it is possible to extract estimates of the mixing parameters from the ratio of the time-frequency representations of the mixtures.
These estimates cluster around the true mixing parameters and, by identifying the clusters, one can partition the time-frequency representation of the mixtures to produce the time-frequency representations of the original sources. The original DUET algorithm involved creating a two-dimensional (weighted) histogram of the relative amplitude and delay estimates, finding the peaks in the histogram, and then associating each time-frequency point in the mixture with one peak. The original implementation of the method was offline and passed through the data twice: once to create the histogram, and a second time to demix. In this paper, we present an online version of the DUET algorithm which avoids the creation of the histogram, and with it both the computational load of updating the histogram and the tricky issue of finding and tracking peaks. The advantages of online DUET are that it runs many times faster than real time, achieves roughly 10 dB average separation for anechoic mixtures and 5 dB average separation for echoic mixtures, and can demix more than two sources from only two mixtures.

In Section 2 we define the time-delay mixing model, explain the concept of W-disjoint orthogonality, and describe the mixing parameter tracking procedure. In Section 3 we describe the demixing method. Section 4 describes the mixing tests and gives detailed demixing results. Justification for the W-disjoint orthogonality assumption can be found in Appendix A, and Appendix B contains the derivation of the ML objective function.

2. MIXING PARAMETER ESTIMATION

2.1. Source mixing

Consider the measurements of a pair of sensors where only the direct path is present. In this case, without loss of generality, we can absorb the attenuation and delay parameters of the first mixture, x_1(t), into the definition of the sources. The two mixtures can thus be expressed as,

    x_1(t) = \sum_{j=1}^{N} s_j(t)    (1)

    x_2(t) = \sum_{j=1}^{N} a_j s_j(t - \delta_j)    (2)

Presented at the ICA Conference, December 9-13, 2001, San Diego, CA.
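The two-mixture anechoic model above can be made concrete with a short simulation. The following is a minimal numpy sketch, not part of the original paper: the white-noise sources, the attenuations a_j, and the delays delta_j are illustrative stand-ins, and the fractional delays are applied as multiplication by e^{-i omega delta} in the frequency domain.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 16000
N, T = 3, fs                           # three sources, one second at 16 kHz
s = rng.standard_normal((N, T))        # white-noise stand-ins for the sources

a = np.array([0.9, 1.0, 1.1])          # relative attenuations a_j (illustrative)
delta = np.array([-1.5, 0.0, 1.5])     # sensor delays delta_j in samples (illustrative)

# x1(t) = sum_j s_j(t) and x2(t) = sum_j a_j s_j(t - delta_j).
# A fractional (circular) delay is multiplication by e^{-i omega delta} in frequency.
omega = 2 * np.pi * np.fft.rfftfreq(T)
S = np.fft.rfft(s, axis=1)
x1 = np.fft.irfft(S.sum(axis=0), T)
x2 = np.fft.irfft((a[:, None] * np.exp(-1j * omega * delta[:, None]) * S).sum(axis=0), T)
```

The circular delay is a simplification that is harmless for delays of a sample or two; a real recording would of course provide x1 and x2 directly.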

where N is the number of sources, \delta_j is the arrival delay between the sensors resulting from the angle of arrival, and a_j is a relative attenuation factor corresponding to the ratio of the attenuations of the paths between sources and sensors. We use \Delta to denote the maximal possible delay between the sensors, and thus |\delta_j| \le \Delta for all j.

2.2. Source Assumptions

We call two functions s_1(t) and s_2(t) W-disjoint orthogonal if, for a given windowing function W(t), the supports of the windowed Fourier transforms of s_1(t) and s_2(t) are disjoint. The windowed Fourier transform of s_j(t) is defined,

    F_W(s_j(\cdot))(\omega,\tau) = \int_{-\infty}^{\infty} W(t-\tau) \, s_j(t) \, e^{-i\omega t} \, dt    (3)

which we will refer to as S_j(\omega,\tau) where appropriate. The W-disjoint orthogonality assumption can be stated concisely,

    S_1(\omega,\tau) \, S_2(\omega,\tau) = 0, \quad \forall \omega, \tau.    (4)

In Appendix A, we introduce the notion of approximate W-disjoint orthogonality. When W(t) \equiv 1, we have the following property of the Fourier transform,

    F_W(s_j(\cdot - \delta))(\omega,\tau) = e^{-i\omega\delta} F_W(s_j(\cdot))(\omega,\tau).    (5)

We will assume that (5) holds for all \delta, |\delta| \le \Delta, even when W(t) has finite support [2].

2.3. Amplitude-Delay Estimation

Using the above assumptions, we can write the model from (1) and (2) for the case with two array elements, with X_k(\omega,\tau) denoting the windowed Fourier transform of x_k(t), as

    \begin{bmatrix} X_1(\omega,\tau) \\ X_2(\omega,\tau) \end{bmatrix} = \begin{bmatrix} 1 & \cdots & 1 \\ a_1 e^{-i\omega\delta_1} & \cdots & a_N e^{-i\omega\delta_N} \end{bmatrix} \begin{bmatrix} S_1(\omega,\tau) \\ \vdots \\ S_N(\omega,\tau) \end{bmatrix}    (6)

For W-disjoint orthogonal sources, at most one of the N sources will be non-zero for a given (\omega,\tau), and thus,

    \begin{bmatrix} X_1(\omega,\tau) \\ X_2(\omega,\tau) \end{bmatrix} = \begin{bmatrix} 1 \\ a_j e^{-i\omega\delta_j} \end{bmatrix} S_j(\omega,\tau) \quad \text{for some } j.    (7)

The original DUET algorithm estimated the mixing parameters by analyzing the ratio of X_1(\omega,\tau) and X_2(\omega,\tau). In light of (7), it is clear that mixing parameter estimates can be obtained via,

    (\hat a(\omega,\tau), \hat\delta(\omega,\tau)) = \left( \left| \frac{X_2(\omega,\tau)}{X_1(\omega,\tau)} \right|, \; -\frac{1}{\omega} \, \mathrm{Im}\left\{ \ln \frac{X_2(\omega,\tau)}{X_1(\omega,\tau)} \right\} \right).    (8)

The original DUET algorithm constructed a 2-D histogram of amplitude-delay estimates and looked at the number and location of the peaks in the histogram to determine the number of sources and their mixing parameters. See [1, 3] for details.

2.4. ML Mixing Parameter Gradient Search

For the online algorithm, we take a different approach.
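Before turning to the online method, the instantaneous amplitude-delay estimator (8) can be verified numerically. The sketch below is not from the paper: it builds the second mixture's time-frequency representation directly from the model, so that the delay property (5) holds exactly, and the stft helper and the values a_true and d_true are illustrative.

```python
import numpy as np

def stft(x, win, hop):
    # windowed Fourier transform F_W(x)(omega, tau): one row per window position tau
    L = len(win)
    frames = [x[i:i + L] * win for i in range(0, len(x) - L + 1, hop)]
    return np.fft.rfft(np.asarray(frames), axis=1)

rng = np.random.default_rng(1)
s = rng.standard_normal(16384)             # a single synthetic source
win, hop = np.hamming(512), 128
X1 = stft(s, win, hop)

# Build X2 from the model: X2 = a * e^{-i omega delta} * X1.
a_true, d_true = 0.8, 0.5                  # illustrative mixing parameters
omega = 2 * np.pi * np.fft.rfftfreq(512)
X2 = a_true * np.exp(-1j * omega * d_true) * X1

# Estimator (8): amplitude and delay from the ratio X2/X1 (omega = 0 excluded;
# the delay estimate is valid while |omega * delta| < pi, i.e. no phase wrapping).
R = X2[:, 1:] / X1[:, 1:]
a_hat = np.abs(R)                          # all approximately a_true
d_hat = -np.imag(np.log(R)) / omega[1:]    # all approximately d_true
```

With a single active source the estimates are recovered at every time-frequency point; with several W-disjoint orthogonal sources, each point instead yields the parameters of whichever source occupies it, which is what the histogram of [1, 3] exploits.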
First, note that,

    | X_1(\omega,\tau) \, a_j e^{-i\omega\delta_j} - X_2(\omega,\tau) | = 0    (9)

if source j is the active source at time-frequency point (\omega,\tau). Moreover, defining,

    \rho(a_j,\delta_j) := \frac{1}{1+a_j^2} \, | X_1(\omega,\tau) \, a_j e^{-i\omega\delta_j} - X_2(\omega,\tau) |^2    (10)

we can see that,

    \min( \rho(a_1,\delta_1), \ldots, \rho(a_N,\delta_N) ) = 0    (11)

because at least one term will be zero at each time-frequency point. In Appendix B, it is shown that the maximum likelihood estimates of the mixing parameters satisfy,

    \min_{a_1,\ldots,a_N,\delta_1,\ldots,\delta_N} \; \sum_{\omega,\tau} \min( \rho_1, \ldots, \rho_N )    (12)

where we have used \rho_j as shorthand for \rho(a_j,\delta_j). We perform gradient descent with (12) as the objective function to learn the mixing parameters. In order to avoid the discontinuous nature of the minimum function, we approximate it smoothly as follows,

    \min(x,y) = \frac{x+y}{2} - \frac{|x-y|}{2}    (13)

             \approx x - \frac{1}{\lambda} \, \phi(\lambda(x-y))    (14)

             = -\frac{1}{\lambda} \ln\left( e^{-\lambda x} + e^{-\lambda y} \right)    (15)

where,

    \phi(x) = \int_{-\infty}^{x} \frac{e^{t}}{1+e^{t}} \, dt = x + \ln( 1 + e^{-x} ).    (16)

Generalizing (15), the smooth ML objective function is,

    J(a_1,\delta_1,\ldots,a_N,\delta_N) = \sum_{\omega,\tau} \min(\rho_1,\ldots,\rho_N) \approx -\frac{1}{\lambda} \sum_{\omega,\tau} \ln\left( e^{-\lambda\rho_1} + \cdots + e^{-\lambda\rho_N} \right)    (17)
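The smooth approximation of the minimum used in the objective above is easy to check numerically. The following sketch is illustrative (the lambda values and the rho_j are made up); it shows the soft minimum approaching the true minimum from below as lambda grows.

```python
import numpy as np

def soft_min(vals, lam):
    # smooth minimum: min(x_1, ..., x_N) ~= -(1/lam) * ln(sum_j exp(-lam * x_j))
    vals = np.asarray(vals, dtype=float)
    m = vals.min()                 # shift by the minimum for numerical stability
    return m - np.log(np.exp(-lam * (vals - m)).sum()) / lam

rho = np.array([0.7, 0.05, 1.3])   # example rho_j values at one (omega, tau) point
approx = [soft_min(rho, lam) for lam in (1.0, 10.0, 100.0)]
# approx increases toward min(rho) = 0.05 as lambda grows, and the soft
# minimum is always a lower bound on the true minimum.
```

In the objective, this soft minimum is applied at each time-frequency point before summing, which is what makes the gradient well defined everywhere.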

This objective has partials,

    \frac{\partial J}{\partial \delta_j} = \sum_{\omega,\tau} \frac{e^{-\lambda\rho_j}}{\sum_k e^{-\lambda\rho_k}} \cdot \frac{-2 a_j \omega}{1+a_j^2} \, \mathrm{Im}\{ X_1(\omega,\tau) \, \overline{X_2(\omega,\tau)} \, e^{-i\omega\delta_j} \}    (18)

and,

    \frac{\partial J}{\partial a_j} = \sum_{\omega,\tau} \frac{e^{-\lambda\rho_j}}{\sum_k e^{-\lambda\rho_k}} \cdot \frac{2}{(1+a_j^2)^2} \left( (a_j^2 - 1) \, \mathrm{Re}\{ X_1(\omega,\tau) \, \overline{X_2(\omega,\tau)} \, e^{-i\omega\delta_j} \} + a_j \left( |X_1(\omega,\tau)|^2 - |X_2(\omega,\tau)|^2 \right) \right)    (19)

We assume we know the number of sources we are searching for and initialize an amplitude and delay estimate pair to random values for each source. The estimates (a_j[k], \delta_j[k]) for the current time \tau_k = k\tau_0 (where \tau_0 is the time separating adjacent time windows) are updated based on the previous estimate and the current gradient as follows,

    a_j[k] = a_j[k-1] - \mu \, \mu_j[k] \, \frac{\partial J(\cdot,\tau_k)}{\partial a_j}    (20)

    \delta_j[k] = \delta_j[k-1] - \mu \, \mu_j[k] \, \frac{\partial J(\cdot,\tau_k)}{\partial \delta_j}    (21)

where \mu is a learning rate constant and \mu_j[k] is a time and mixing parameter dependent learning rate for time index k and estimate j. In practice, we have found it helpful to adjust the learning rate depending on the amount of mixture energy recently explained by the current estimate. We define,

    q_j[k] = \sum_{\omega} \frac{ e^{-\lambda \rho(a_j,\delta_j,\omega,\tau_k)} }{ \sum_l e^{-\lambda \rho(a_l,\delta_l,\omega,\tau_k)} } \, |X_1(\omega,\tau_k)| \, |X_2(\omega,\tau_k)|    (22)

and update the parameter dependent learning rate as follows,

    \mu_j[k] = \frac{ q_j[k] }{ \sum_{m \le k} \alpha^{k-m} \, q_j[m] }    (23)

where \alpha is a forgetting factor.

3. DEMIXING

In order to demix the jth source, we construct a time-frequency mask based on the ML parameter estimator (see (33) in Appendix B),

    \Omega_j(\omega,\tau) = \begin{cases} 1 & \rho(a_j,\delta_j) \le \rho(a_m,\delta_m) \;\; \forall m \ne j \\ 0 & \text{otherwise} \end{cases}    (24)

The estimate of the time-frequency representation of the jth source is,

    \hat S_j(\omega,\tau) = \Omega_j(\omega,\tau) \, X_1(\omega,\tau).    (25)

We then reconstruct the source using the appropriate dual window function [4]. In this way, we demix all the sources by partitioning the time-frequency representation of one of the mixtures. Note that because the method does not invert the mixing matrix, it can demix all sources even when the number of sources is greater than the number of mixtures (N > M).

4. TESTS AND SUMMARY

We tested the method on mixtures created in both an anechoic room and an echoic office environment. The algorithm used fixed parameter values (forgetting factor \alpha = 0.9) and a Hamming window of 512 samples (with adjacent windows separated by 128 samples) in all the tests. For all tests, the method ran many times faster than real time.

For the anechoic test, the setup is pictured in Figure 1. Separate recordings at 16 kHz were made of six speech files (4 female, 2 male) taken from the TIMIT database, each played from a loudspeaker placed at the x marks in the figure. Pairwise mixtures were then created from all possible voice/angle combinations, excluding same-voice and same-angle combinations, yielding a total of 630 mixtures.

Fig. 1. Experimental setup. Microphones are separated by 1.75 cm, centered along the 180-degree to 0-degree line. The x's show the source locations used in the anechoic tests. The O's show the locations of the sources used in the echoic tests.

The SNR gains of the demixtures were calculated as follows. Denote the contribution of source j on microphone k as S_{jk}(\omega,\tau). Thus we have,

    X_1(\omega,\tau) = S_{11}(\omega,\tau) + S_{21}(\omega,\tau)    (26)

    X_2(\omega,\tau) = S_{12}(\omega,\tau) + S_{22}(\omega,\tau)    (27)

As we do not know the permutation of the demixing, we

calculate the SNR gain conservatively: the gain for each source is its output SNR (computed from the ground-truth contributions S_{jk}) minus its input SNR, with the output-to-source pairing chosen as the permutation that yields the smaller gain. In order to give the method time to learn the mixing parameters, the SNR results do not include the first half second of data.

Figure 2 shows the average SNR gain results for each angle difference; the 60-degree difference results, for example, average all the source pairings whose angles differ by 60 degrees. Each bar shows the maximum SNR gain, one standard deviation above the mean, the mean (which is labeled), one standard deviation below the mean, and the minimum SNR gain over all the tests (the gains for both sources in each mixture are included in the averages). The separation results improve as the angle difference increases.

Fig. 2. Comparison of overall separation SNR gain by angle difference. Anechoic data.

Figure 3 details the 30-degree difference results by angle pairing. The performance is a function of the delay: the worst performance is achieved for the pairing with the smallest inter-sensor delay difference, the second worst for the second smallest, and so forth.

Fig. 3. Overall separation SNR gain by 30-degree angle pairing. Anechoic data.

Recordings were also made in an echoic office with a reverberation time of several hundred milliseconds, that is, the time after which the impulse response of the room fell to -60 dB. For the echoic tests, the sources were placed at five angles between 0 and 180 degrees (see the O's in Figure 1). Separation results for pairwise mixtures of voices (4 female, 4 male) are shown in Figure 5. Separation results for pairwise mixtures of voices (4 female, 4 male) and noises (line printer, copy machine, and vacuum cleaner) are shown in Figure 4. The results are considerably worse in the echoic case, which is not surprising as the method assumes anechoic mixing. However, the method does achieve roughly 5 dB SNR gain on average and is real-time.

Fig. 4. Comparison of overall separation SNR gain by angle difference. Echoic office data. Voice vs. Noise.

Fig. 5. Comparison of overall separation SNR gain by angle difference. Echoic office data. Two Voices.

Summary results for all three testing groups (anechoic voice vs. voice, echoic voice vs. voice, and echoic voice vs. noise) are shown in the table.

                           AVV      EVV      EVN
    number of tests        630      6        48
    mean SNR gain (dB)     10.3     5.9      4.4
    std SNR gain (dB)      .69      3.34     .87
    max SNR gain (dB)      .6       .8       4.6
    min SNR gain (dB)      -.       -.4      -.

    Table 1. Results summary. AVV = Anechoic Voice vs. Voice; EVV = Echoic Voice vs. Voice; EVN = Echoic Voice vs. Noise.

We have presented a real-time version of the DUET algorithm which uses gradient descent to learn the anechoic mixing parameters and then demixes by partitioning the time-frequency representations of the mixtures. We have also introduced a measure of W-disjoint orthogonality and provided empirical evidence for the approximate W-disjoint orthogonality of speech signals.

A. W-DISJOINT ORTHOGONALITY OF SPEECH

Clearly, the W-disjoint orthogonality assumption is not exactly satisfied for our signals of interest. We introduce here a measure of W-disjoint orthogonality for a group of sources and show that speech signals are indeed nearly W-disjoint orthogonal to each other. Consider the time-frequency mask,

    \Phi_x(\omega,\tau) = \begin{cases} 1 & 20 \log_{10}\left( |S_1(\omega,\tau)| / |S_2(\omega,\tau)| \right) > x \\ 0 & \text{otherwise} \end{cases}    (28)

and the resulting energy ratio,

    r(x) = \frac{ \| \Phi_x(\omega,\tau) \, S_1(\omega,\tau) \|^2 }{ \| S_1(\omega,\tau) \|^2 }    (29)

which measures the percentage of the energy of source 1 for the time-frequency points where it dominates source 2 by x dB. We propose r(x) as a measure of W-disjoint orthogonality. For example, Figure 6 shows r(x) averaged over the pairs of sources used in the demixing tests. We can see from the graph that r(3) > 0.9 for all three test groups, and thus we say that the signals used in the tests were 90% W-disjoint orthogonal at 3 dB. If we can correctly map the time-frequency points with 3 dB or more single-source dominance to the correct corresponding output partition, we can recover 90% of the energy of the original sources. The figure also demonstrates the W-disjoint orthogonality of six speech signals taken as a group, and the fact that independent Gaussian white noise processes are far less W-disjoint orthogonal at all levels.

Fig. 6. W-disjoint orthogonality: fraction of power remaining versus dB threshold, for anechoic voice vs. voice, echoic voice vs. voice, echoic voice vs. noise, six anechoic voices as a group, and Gaussian white noise.
The signals used in the tests were 90% W-disjoint orthogonal at 3 dB and remained more than 80% W-disjoint orthogonal at substantially higher dominance thresholds. Comparing one source to the sum of the five others, we still have more than 70% W-disjoint orthogonality at 3 dB. Independent Gaussian white noise processes are far less W-disjoint orthogonal at all levels.

B. ML ESTIMATION FOR THE DUET MODEL

Assume a mixing model of the type (1)-(2) to which we add measurement noise:

    X_1(\omega,\tau) = \sum_{j=1}^{N} q_j(\omega,\tau) \, S_j(\omega,\tau) + n_1(\omega,\tau)    (30)

    X_2(\omega,\tau) = \sum_{j=1}^{N} a_j e^{-i\omega\delta_j} \, q_j(\omega,\tau) \, S_j(\omega,\tau) + n_2(\omega,\tau)    (31)

The ideal model (1)-(2) is obtained in the limit \sigma \to 0. In practice, we make the computations assuming the existence of such a noise, and then we pass to the limit. We assume the noise and the source signals are Gaussian distributed and independent from one another, with zero mean and known variances:

    n(\omega,\tau) \sim N(0, \sigma^2 I),

    S_j(\omega,\tau) \sim N(0, \nu_j(\omega)).

The Bernoulli random variables q_j(\omega,\tau) are NOT independent. To accommodate the W-disjoint orthogonality assumption, we require that for each (\omega,\tau) at most one of the q_j(\omega,\tau) can be unity, and all the others must be zero. Thus the N-tuple (q_1(\omega,\tau), \ldots, q_N(\omega,\tau)) takes values only in the set Q = \{(0,\ldots,0), (1,0,\ldots,0), \ldots, (0,\ldots,0,1)\} of cardinality N+1. We assume uniform priors for these random variables. Short-time stationarity implies that different frequencies are decorrelated (and hence independent) from one another. We use this property in constructing the likelihood. The likelihood of the parameters (a_1,\ldots,a_N,\delta_1,\ldots,\delta_N), given the data (X_1(\omega,\tau), X_2(\omega,\tau)) and the spectral powers \nu_j(\omega) at a given \omega, is obtained by conditioning with respect to the q_j(\omega,\tau):

    L(a_1,\ldots,a_N,\delta_1,\ldots,\delta_N) = \prod_{\omega,\tau} \sum_{j=0}^{N} \frac{ \exp(-M_j) }{ \pi^2 \det\left( \sigma^2 I + \nu_j(\omega) \, d_j d_j^* \right) } \, p( q_j(\omega,\tau) = 1 )    (32)

where:

    X = \begin{bmatrix} X_1(\omega,\tau) \\ X_2(\omega,\tau) \end{bmatrix}, \quad d_j = \begin{bmatrix} 1 \\ a_j e^{-i\omega\delta_j} \end{bmatrix}, \quad d_j d_j^* = \begin{bmatrix} 1 & a_j e^{i\omega\delta_j} \\ a_j e^{-i\omega\delta_j} & a_j^2 \end{bmatrix}, \quad M_j = X^* \left( \sigma^2 I + \nu_j(\omega) \, d_j d_j^* \right)^{-1} X,

and we have defined q_0(\omega,\tau) = 1 - \sum_{k=1}^{N} q_k(\omega,\tau), a_0 = 0, \delta_0 = 0, and \nu_0(\omega) \equiv 0 for notational simplicity in (32), to deal with the case when no source is active at a given (\omega,\tau). Next, the Matrix Inversion Lemma (or an explicit computation) gives:

    M_j = \frac{ \nu_j(\omega) \, | a_j e^{-i\omega\delta_j} X_1(\omega,\tau) - X_2(\omega,\tau) |^2 + \sigma^2 \left( |X_1(\omega,\tau)|^2 + |X_2(\omega,\tau)|^2 \right) }{ \sigma^2 \left( \sigma^2 + \nu_j(\omega)(1+a_j^2) \right) }

and

    \det\left( \sigma^2 I + \nu_j(\omega) \, d_j d_j^* \right) = \sigma^2 \left( \sigma^2 + \nu_j(\omega)(1+a_j^2) \right).

Now we pass to the limit \sigma \to 0. The dominant terms of the previous two expressions are:

    M_j \approx \frac{1}{\sigma^2} \cdot \frac{ | a_j e^{-i\omega\delta_j} X_1(\omega,\tau) - X_2(\omega,\tau) |^2 }{ 1 + a_j^2 } \quad \text{and} \quad \det(\cdot) \approx \sigma^2 \, \nu_j(\omega) (1 + a_j^2).

Of the N+1 terms in each sum of (32), only one term is dominant, namely the one with the largest exponent. Let \Pi : (\omega,\tau) \to \{0, 1, \ldots, N\} be the selection map defined by:

    \Pi(\omega,\tau) = k \quad \text{if} \quad \rho(a_k,\delta_k) \le \rho(a_j,\delta_j) \;\; \forall j \ne k    (33)

where:

    \rho(a_0,\delta_0) = |X_1(\omega,\tau)|^2 + |X_2(\omega,\tau)|^2

and, for k \in \{1,\ldots,N\}:

    \rho(a_k,\delta_k) = \frac{ | a_k e^{-i\omega\delta_k} X_1(\omega,\tau) - X_2(\omega,\tau) |^2 }{ 1 + a_k^2 }.

Then the likelihood becomes:

    L(a_1,\ldots,a_N,\delta_1,\ldots,\delta_N) \approx C^M \prod_{k=0}^{N} \; \prod_{(\omega,\tau) \in \Pi^{-1}(k)} t_k \exp\left( -\frac{ \rho(a_k,\delta_k) }{ \sigma^2 } \right)    (34)

with M the number of frequencies and:

    t_k = \frac{1}{ \sigma^2 \, \nu_k(\omega) ( 1 + a_k^2 ) }, \quad k \in \{1,\ldots,N\}.

The dominant term in the log-likelihood remains the exponent.
Thus:

    \log L \approx -\frac{1}{\sigma^2} \sum_{k=0}^{N} \; \sum_{(\omega,\tau) \in \Pi^{-1}(k)} \rho(a_k,\delta_k)    (35)

and maximizing the log-likelihood is equivalent to the following (which is (12)):

    \min_{a_1,\ldots,a_N,\delta_1,\ldots,\delta_N} \; \sum_{\omega,\tau} \min\left( \rho(a_1,\delta_1), \ldots, \rho(a_N,\delta_N) \right).

6. REFERENCES

[1] A. Jourjine, S. Rickard, and O. Yilmaz, "Blind separation of disjoint orthogonal signals: Demixing N sources from 2 mixtures," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Istanbul, Turkey, June 2000, vol. 5, pp. 2985-2988.

[2] R. Balan, J. Rosca, S. Rickard, and J. O'Ruanaidh, "The influence of windowing on time delay estimates," in Proceedings of the Conference on Information Sciences and Systems (CISS), Princeton, NJ, March 2000.

[3] A. Jourjine, S. Rickard, and O. Yilmaz, "Blind separation of disjoint orthogonal signals," IEEE Transactions on Signal Processing, submitted.

[4] I. Daubechies, Ten Lectures on Wavelets, chapter 3, SIAM, Philadelphia, PA, 1992.