MAXIMUM LIKELIHOOD BASED TECHNIQUES FOR BLIND SOURCE SEPARATION AND APPROXIMATE JOINT DIAGONALIZATION

BEN-GURION UNIVERSITY OF THE NEGEV
FACULTY OF ENGINEERING SCIENCE
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING

Thesis submitted in partial fulfillment of the requirements towards the M.Sc. degree

By Koby Todros

October 2006

Abstract

In this work, two novel algorithms for blind separation of a noiseless instantaneous linear mixture of independent sources are presented. The proposed algorithms exploit the non-Gaussianity of the independent sources by modeling their distribution using the Gaussian mixture model (GMM). The first proposed method is based on the maximum likelihood (ML) estimator. According to this method, the sensors distribution parameters are estimated via the expectation-maximization (EM) algorithm for GMM parameter estimation, and the separation matrix is estimated by applying nonorthogonal joint diagonalization of the estimated GMM covariance matrices. The second proposed method is also an ML-based approach. According to this method, the distribution parameters of the pre-whitened sensors are estimated via the EM algorithm for GMM parameter estimation, and a unitary separation matrix is estimated by applying orthogonal joint diagonalization of the estimated GMM covariance matrices. It is shown that estimation of the sensors distribution parameters via the EM algorithm for GMM parameter estimation amounts to obtaining a tight lower bound on the log-likelihood of the separation matrix. It is also shown that joint diagonalization of the estimated GMM covariance matrices amounts to maximization of the obtained tight lower bound w.r.t. the separation matrix. The performances of the two proposed methods are evaluated and compared to existing blind source separation techniques. The results show superior performance of the proposed methods in terms of interference-to-signal ratio.

In addition, a new efficient iterative algorithm for approximate joint diagonalization of positive-definite Hermitian matrices is presented. According to the proposed algorithm, the joint diagonalization matrix is not constrained to be orthogonal, and it is estimated by iterative optimization of an ML-based objective function. The columns of the joint diagonalization matrix are estimated separately using iterative singular value decompositions of a weighted sum of the matrices to be diagonalized. This property enables a low computational load of the proposed joint diagonalization algorithm, which is most useful when the number of matrices is large. The performance of the proposed algorithm is evaluated and compared to other state-of-the-art algorithms for approximate joint diagonalization. The results imply that the proposed algorithm is computationally efficient, with performance similar to state-of-the-art algorithms for approximate joint diagonalization.

Acknowledgment

I would like to dedicate this work to my sister, Efrat Todros, and to my best friend, Zohar Levi, who supported and encouraged me through all this long and harsh period. I would also like to thank my thesis supervisor, Dr. Joseph Tabrikian, for his devoted supervision.

Contents

1. Introduction
2. Application of Gaussian Mixture Model for Blind Separation of Independent Sources
   2.1 DERIVATION OF THE SOURCE AND SENSOR DISTRIBUTION MODELS
       2.1.1 SOURCE DISTRIBUTION MODEL
       2.1.2 SENSOR DISTRIBUTION MODEL
   2.2 SOLUTION OF THE BSS PROBLEM
       2.2.1 DERIVATION OF A ML-BASED OBJECTIVE FUNCTION
       2.2.2 ESTIMATION OF THE SEPARATION MATRIX VIA JOINT DIAGONALIZATION WITH NONORTHOGONAL TRANSFORMATIONS
       2.2.3 ESTIMATION OF THE SEPARATION MATRIX VIA JOINT DIAGONALIZATION WITH ORTHOGONAL TRANSFORMATIONS
   2.3 MORE SENSORS THAN SOURCES
   2.4 SIMULATIONS
       2.4.1 SYNTHETIC DATA
             2.4.1.1 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF SKEWNESS LEVEL
             2.4.1.2 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF ROTATION ANGLE
             2.4.1.3 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF STATISTICAL DISTRIBUTION CLASS
             2.4.1.4 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF SAMPLE SIZE
             2.4.1.5 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF DIMENSION
             2.4.1.6 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF SNR
       2.4.2 REAL DATA
   2.5 DISCUSSION AND CONCLUSIONS
3. Fast Approximate Joint Diagonalization of Positive-Definite Hermitian Matrices
   3.1 DERIVATION OF A MAXIMUM LIKELIHOOD BASED OBJECTIVE FUNCTION
   3.2 MINIMIZATION ALGORITHM
   3.3 CONVERGENCE
   3.4 INITIALIZATION
   3.5 COMPUTATIONAL COMPLEXITY ASPECTS
   3.6 SIMULATIONS
       3.6.1 ORTHOGONAL JOINT DIAGONALIZATION
       3.6.2 NONORTHOGONAL JOINT DIAGONALIZATION
       3.6.3 BSS APPLICATION
   3.7 DISCUSSION AND CONCLUSIONS
4. Summary
   4.1 CONCLUSIONS
   4.2 FUTURE RESEARCH
Appendix A
   A.1 DERIVATION OF A LOWER BOUND ON THE LOG-LIKELIHOOD FUNCTION
   A.2 DERIVATION OF A TIGHT LOWER BOUND ON THE LOG-LIKELIHOOD FUNCTION
   A.3 UTILIZATION OF THE EM ALGORITHM FOR LOWER BOUND TIGHTENING
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G
References

Abbreviations

AAMSE   Approximated Averaged Mean Square Error
AC/DC   Alternating Columns/Diagonal Centers
BSS     Blind Source Separation
EM      Expectation Maximization
FFDIAG  Fast Frobenius Diagonalization
FG      Flury Gautschi
GMM     Gaussian Mixture Model
ICA     Independent Component Analysis
IFA     Independent Factor Analysis
ISR     Interference-to-Signal Ratio
JADE    Joint Approximate Diagonalization of Eigenmatrices
KL      Kullback Leibler
LRF     Learning Rate Factor
ML      Maximum Likelihood
NIFA    Noiseless Independent Factor Analysis
PDF     Probability Density Function
SVD     Singular Value Decomposition
SVDJD   Singular Value Decomposition Joint Diagonalization

List of Figures

Fig. 1. The generative model of the observation signals at time instance t.
Fig. 2. The log-likelihood function of $\rho(\mathbf{B}(\beta), \boldsymbol{\theta}^{(s)})$ (solid curve) with the observations matrix, $\mathbf{X}$, and its tight lower bound (dashed curve) as a function of $\beta$.
Fig. 3. (a) Scatter plot of the source signals. (b) Scatter plot of the mixed sources. The ellipses represent the estimated covariance matrices. (c) Scatter plot of the estimated source signals.
Fig. 4. (a) Scatter plot of an arbitrary realization of mixed sources at a fixed skewness level. The ellipses represent the estimated covariance matrices. (b) The averaged ISR of the JADE, FastICA, GMMPHAM, GMMSVDJD, GMMFG and NIFA algorithms versus skewness level.
Fig. 5. The averaged ISR of the tested algorithms as a function of rotation angle.
Fig. 6. (a) The averaged ISR of the tested algorithms versus the generalized Gaussian shape parameter, $\beta$. (b) The averaged GMM order, determined by the GMMJD and GMMFG algorithms according to the BIC, versus the generalized Gaussian shape parameter, $\beta$.
Fig. 7. (a) The averaged ISR of the tested algorithms versus the sample size. (b) The averaged GMM order, determined by the GMMJD and GMMFG algorithms according to the BIC, versus the sample size. (c) The averaged running time of the tested algorithms as a function of the sample size.
Fig. 8. (a) The averaged ISR of the tested algorithms as a function of the number of sources. (b) The averaged GMM order, determined by the GMMJD and GMMFG algorithms according to the BIC, as a function of the number of source signals. (c) The averaged running time of the tested algorithms as a function of the number of sources.
Fig. 9. The averaged ISR of the tested algorithms versus SNR.
Fig. 10. Scatter plot of the mixture of $S_1$ and $S_2$. The ellipses represent the estimated covariance matrices, which assemble the GMM of the observation signals.
Fig. 11. Separation performances of the tested algorithms in separating (a) a two-dimensional mixture of two speech signals, (b) a three-dimensional mixture of three speech signals, (c) an eight-dimensional mixture of three speech signals.
Fig. 12. Typical averaged convergence patterns of the minimization algorithm.
Fig. 13. The estimated existence probability of the sufficient convergence condition versus $K$ and $\sigma$.
Fig. 14. Illustration of a matrix set having distinct clusters of eigenvalues.
Fig. 15. (a) The mean values, 5th and 95th percentiles of $Q(\mathbf{B})$ obtained by the SVDJD, Pham's, AC/DC and FFDIAG algorithms. The mark denotes the mean value, and the lower and upper marks denote the 5th and 95th percentiles, respectively. (b) The averaged running time per iteration and the averaged total running time in seconds of each algorithm.
Fig. 16. The mean values, 5th and 95th percentiles of $Q(\mathbf{B})$ obtained by the SVDJD, Pham's, AC/DC and FFDIAG algorithms as a function of the perturbation level, $\sigma$. The mark denotes the mean value, and the lower and upper marks denote the 5th and 95th percentiles, respectively.
Fig. 17. The averaged running time per iteration (a) and the averaged total running time (b) of
Fig. 18. (a) The averaged running time per iteration and the averaged total running time of the compared algorithms. (b) The values of the objective function, $Q(\mathbf{B})$, obtained by each algorithm. (c) The averaged ISR obtained by each algorithm.
Fig. 19. (a) The averaged running time per iteration and the averaged total running time of the compared algorithms. (b) The values of the objective function, $Q(\mathbf{B})$, obtained by each algorithm. (c) The averaged ISR obtained by each algorithm.
Fig. 20. An illustration of the log-likelihood function (solid line) and its lower bounds tightening (dashed curves).

1. Introduction

Blind separation of an instantaneous linear mixture of independent sources can be achieved, up to scaling and permutation of the estimated sources, by exploiting their non-Gaussianity []. Some existing methods for BSS use restrictive assumptions on the sources distribution, which makes them inapplicable in some cases. For example, cumulant-based methods like JADE [3] assume that the sources have a non-zero 4th order cumulant, and the PDF of each source is approximated by using only 2nd and 4th order cumulants. In [4] it is shown that any density can be estimated to any desired degree of approximation, in terms of Kullback-Leibler (KL) divergence [0], using a finite order GMM. Therefore, in this work the probability density function (PDF) of the non-Gaussian sources is modeled by GMM.

Several researchers have utilized the GMM in solving the BSS problem. For example, Moulines et al. [4] developed an approximate maximum likelihood (ML) method for blind separation and deconvolution of noisy linear mixtures, where the density of each source was modeled by a univariate GMM. According to this approach, an expectation-maximization (EM) algorithm [0], which jointly estimates the mixing matrix, the source distribution parameters and the noise covariance matrix, was developed. A similar approach for blind separation of noisy linear mixtures, named independent factor analysis (IFA), was offered by Attias [5]. In contrast to [4], the intractability of the EM algorithm when the number of sources increases and the sources reconstruction problem were handled. In [5], Attias extended this method to the case of noiseless linear mixtures, and an algorithm, named noiseless IFA (NIFA), for joint estimation of the sources distribution parameters and the separation matrix was developed. This algorithm is strictly EM only for a sufficiently small, empirically selected learning rate factor (LRF), used for updating the estimate of the separation matrix in each step of the algorithm.
In [6], Attias extended the IFA method to the case of temporally structured sources. In order to capture the temporal statistical properties of the observed data, each source was described by a hidden Markov model (HMM), and a family of EM algorithms that learn the structure of the underlying sources and their relation to the observed data was derived. In [7], a constrained EM algorithm for blind separation of a linear mixture with isotropic noise was developed. The mixing matrix in this algorithm was subject to an orthogonality constraint. Under this constraint, exponential increase of complexity with the number of sources was avoided. In [8] a modified EM algorithm for blind separation of independent linear sparse over-complete noisy mixtures was derived. The sparsity assumption enabled a number of simplifying approximations to the observations density, which avoided exponential growth of the number of mixture components.

The algorithms mentioned above utilize an EM algorithm, which jointly estimates the unobserved source distribution parameters and the mixing matrix coefficients. This approach has the following disadvantages. First, accurate initialization and order selection of the distribution model of the unobserved source signals is difficult, so the EM algorithm may converge to undesired maxima. Second, implementation of this approach is cumbersome.

In [9], a method for BSS in nonstationary environments using a non-Gaussian mixture model was developed. According to this method, the observed data were modeled as a mixture of several mutually exclusive classes, where each class was described by a different noisy instantaneous linear mixture of independent non-Gaussian densities. The density of the sources in each class was modeled by the generalized Gaussian density function [], and a gradient descent algorithm for ML estimation of the model parameters was derived. However, in this contribution we focus on blind separation of sources related to a unique class, where the density of each source can be modeled by GMM.

In this work, the BSS problem is solved in two separate steps. In the first step, the sensors distribution parameters are estimated via the EM algorithm for GMM parameter estimation []. It is shown that this operation amounts to obtaining a tight lower bound on the log-likelihood of a function of the separation matrix. In the second step, the separation matrix is estimated by maximizing the tight lower bound, obtained in the first step, w.r.t. its entries. Based on this approach, two novel source separation techniques are proposed. The first proposed method is an ML-based technique, which comprises the following steps: (1) estimation of the sensors distribution parameters via the EM algorithm for GMM parameter estimation [], and (2) estimation of a separation matrix, which approximately diagonalizes the estimated GMM covariance matrices simultaneously.
The joint diagonalization is performed according to an algorithm offered by Pham [], [5] and according to a new joint diagonalization algorithm, offered in this work and denoted as the SVDJD algorithm. BSS using the PHAM and SVDJD techniques is denoted as the GMMPHAM and GMMSVDJD algorithms, respectively. The second proposed method is also an ML-based technique, which comprises the following steps: (1) estimation of the pre-whitened sensors distribution parameters via the EM algorithm for GMM parameter estimation [], and (2) estimation of a unitary separation matrix, which approximately diagonalizes the estimated GMM covariance matrices simultaneously, according to an algorithm offered by Flury and Gautschi [3]. BSS using the Flury Gautschi method is denoted as the GMMFG algorithm. The two methods are also presented in [30].

As mentioned above, a new efficient iterative algorithm for approximate joint diagonalization of positive-definite Hermitian matrices is proposed in this work. A variety of algorithms for approximate joint diagonalization have been proposed in the literature. To name a few: the FG algorithm for simultaneous orthogonal transformation of several positive-definite symmetric matrices to nearly diagonal form was proposed by Flury and Gautschi [3]; Cardoso and Souloumiac [4] proposed the extended Jacobi method for orthogonal joint diagonalization; an algorithm for nonorthogonal joint diagonalization of positive-definite Hermitian matrices was proposed by Pham [], [5]; joint diagonalization of certain algebraically derived matrices via subspace fitting techniques was proposed by van der Veen [6]; Yeredor [7] proposed the AC/DC algorithm for nonorthogonal joint diagonalization in the least-squares sense, using subspace fitting techniques; Joho and Rahbar [8] developed a method for approximate joint diagonalization of correlation matrices using Newton methods; Ziehe et al. [9] proposed a fast algorithm for nonorthogonal joint diagonalization, named the FFDIAG algorithm.

Many techniques for blind source separation [] utilize approximate joint diagonalization algorithms. According to these techniques, a set of unknown matrices, which obey exact joint diagonalization, is estimated from the observed data. A diagonalization matrix, which is usually the separation matrix, is estimated (up to scaling and permutation of rows) by minimizing an objective function that measures the deviation of the diagonalized matrices from diagonality.
For example, joint diagonalization of fourth-order joint-cumulant matrices was performed via the extended Jacobi method [4] in the JADE algorithm, offered by Cardoso [3]; Pham and Cardoso [] utilized Pham's algorithm [5] for joint diagonalization of an estimated set of covariance matrices in the context of blind separation of nonstationary Gaussian sources; Todros and Tabrikian [30] utilized Pham's algorithm [], [5] and the FG method [3] for joint diagonalization in the context of blind separation of independent sources using GMM [4]; van der Veen and Paulraj proposed joint diagonalization of certain algebraically derived matrices in the context of blind separation of constant modulus sources [3].

In this work, a new efficient iterative algorithm for approximate joint diagonalization of positive-definite Hermitian matrices, named the SVDJD algorithm, is proposed. The positive-definite assumption is motivated by the fact that in many applications [], [30], the matrices to be diagonalized are covariance matrices of some random variables. According to the proposed algorithm, a diagonalization matrix, which is not constrained to be orthogonal, is estimated by optimization of an ML-based objective function, also used by Pham [5]. The columns of the diagonalization matrix are estimated separately using iterative singular value decompositions (SVD) of a weighted sum of the matrices to be diagonalized. This technique enables a low computational load, which is especially practical when the number of matrices is large. This method is also presented in [3].

The thesis is organized as follows. In Chapter 2, an application of the GMM for blind separation of independent sources is presented. In Chapter 3, a novel algorithm for approximate joint diagonalization of positive-definite Hermitian matrices is derived. Simulation results as well as a discussion of each topic are given at the end of each chapter. Finally, Chapter 4 summarizes the main points of this contribution.

2. Application of Gaussian Mixture Model for Blind Separation of Independent Sources

Consider the following noiseless instantaneous linear mixture model:

$$\mathbf{x}_t = \mathbf{A}\mathbf{s}_t, \quad t = 1, 2, \ldots, T. \tag{2.1}$$

The random vector $\mathbf{s}_t = [s_{1,t}, \ldots, s_{K,t}]^T$, representing $K$ statistically independent sources at time instance $t$, is mixed by a fixed unknown $L \times K$ ($L \ge K$) mixing matrix $\mathbf{A}$. The observation vector $\mathbf{x}_t = [x_{1,t}, \ldots, x_{L,t}]^T$ is obtained from an array of $L$ sensors. The problem of BSS addresses the reconstruction of the source vectors $\{\mathbf{s}_t\}_{t=1}^{T}$ by estimating a $K \times L$ separation matrix $\mathbf{B}$ for which

$$\hat{\mathbf{s}}_t = \mathbf{B}\mathbf{x}_t, \quad t = 1, 2, \ldots, T. \tag{2.2}$$

In this work, the BSS problem is solved in two separate steps. In the first step, a tight lower bound on the log-likelihood of a function of $\mathbf{B}$ is obtained by applying the EM algorithm [0] for GMM parameter estimation [] of the sensors distribution parameters. In the second step, the obtained tight lower bound is maximized w.r.t. $\mathbf{B}$ by applying approximate joint diagonalization of the estimated GMM covariance matrices.

This chapter is organized as follows. In Section 2.1, mathematical models for the PDFs of the source and observation signals are derived. In Section 2.2, a novel technique for solving the BSS problem is presented under the assumption of an equal number of sensors and sources ($L = K$). In Section 2.3, the case of more sensors than sources is addressed, and in Section 2.4, the performance of the proposed method is evaluated and compared to other existing methods for BSS. Finally, Section 2.5 summarizes the main points of this chapter.
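As a minimal sketch of the mixture model (2.1) and the separation step (2.2), the following real-valued example mixes two independent non-Gaussian sources and recovers them with the exact inverse of the mixing matrix. The mixing matrix `A`, the source distributions and the helper names are illustrative assumptions, not values taken from this work; in the blind setting `A` is unknown and `B` has to be estimated.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

random.seed(0)

# Hypothetical 2x2 mixing matrix (L = K = 2); assumed for illustration only.
A = [[2.0, 1.0], [1.0, 3.0]]

# Two independent non-Gaussian sources: a uniform source and a binary source.
S = [[random.uniform(-1.0, 1.0), random.choice([-1.0, 1.0])] for _ in range(5)]

# Observations x_t = A s_t, as in (2.1).
X = [matvec(A, s) for s in S]

# With a known, invertible A the ideal separation matrix is B = A^{-1}, see (2.2).
B = inverse_2x2(A)
S_hat = [matvec(B, x) for x in X]

# Largest reconstruction error over all samples and sources.
max_err = max(abs(a - b) for s, sh in zip(S, S_hat) for a, b in zip(s, sh))
```

Any row scaling or row permutation of `B` would separate the sources equally well, which is the scaling and permutation ambiguity referred to in the text.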

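2.1 DERIVATION OF THE SOURCE AND SENSOR DISTRIBUTION MODELS

In this section, derivation of the source and sensor distribution models is carried out under the assumption of stationary and non-Gaussian source signals.

2.1.1 SOURCE DISTRIBUTION MODEL

The PDF of the $k$th source signal at time instance $t$ is modeled by GMM in the following manner:

$$f_{s_k}(s_{k,t}; \boldsymbol{\theta}_{s_k}) = \sum_{l_k=1}^{n_k} \vartheta_{k,l_k}\, \Phi(s_{k,t}; \mu_{k,l_k}, \sigma^2_{k,l_k}), \quad k = 1, \ldots, K, \tag{2.3}$$

where $\Phi(\cdot\,; \cdot, \cdot)$ denotes the proper complex Gaussian density function and $n_k$ denotes the number of Gaussians. The mixing proportions are denoted by $\{\vartheta_{k,l_k}\}_{l_k=1}^{n_k}$, such that $\sum_{l_k=1}^{n_k} \vartheta_{k,l_k} = 1$. The means and variances of the Gaussians are denoted by $\{\mu_{k,l_k}\}_{l_k=1}^{n_k}$ and $\{\sigma^2_{k,l_k}\}_{l_k=1}^{n_k}$, respectively. The vector of unknown distribution parameters of the $k$th source signal is denoted by $\boldsymbol{\theta}_{s_k} = \{\vartheta_{k,l_k}, \mu_{k,l_k}, \sigma^2_{k,l_k}\}_{l_k=1}^{n_k}$. It is noted that $s_{k,t}$, $t = 1, 2, \ldots, T$, are i.i.d. for $k = 1, \ldots, K$. By applying the assumption of independent source signals, their joint PDF can be formulated as follows:

$$
\begin{aligned}
f_{\mathbf{s}}(\mathbf{s}_t; \boldsymbol{\theta}^{(s)}) &= \prod_{k=1}^{K} f_{s_k}(s_{k,t}; \boldsymbol{\theta}_{s_k}) \\
&= \sum_{l_1=1}^{n_1} \cdots \sum_{l_K=1}^{n_K} \vartheta_{1,l_1} \cdots \vartheta_{K,l_K}\, \Phi(s_{1,t}; \mu_{1,l_1}, \sigma^2_{1,l_1}) \cdots \Phi(s_{K,t}; \mu_{K,l_K}, \sigma^2_{K,l_K}) \\
&= \sum_{l_1=1}^{n_1} \cdots \sum_{l_K=1}^{n_K} w_{l_1,\ldots,l_K}\, \Phi\big([s_{1,t}, \ldots, s_{K,t}]^T; [\mu_{1,l_1}, \ldots, \mu_{K,l_K}]^T, \mathrm{diag}(\sigma^2_{1,l_1}, \ldots, \sigma^2_{K,l_K})\big) \\
&= \sum_{m=1}^{M} w_m\, \Phi(\mathbf{s}_t; \boldsymbol{\mu}_m, \mathbf{C}_m),
\end{aligned} \tag{2.4}
$$

where $M = \prod_{k=1}^{K} n_k$ is the total number of Gaussians in the joint PDF and $w_m = \prod_{k=1}^{K} \vartheta_{k,l_k}$, $m = 1, \ldots, M$, are the mixing proportions of each Gaussian, such that $\sum_{m=1}^{M} w_m = 1$. The index $m$ denotes a unique combination of Gaussians from each source, i.e. $l_1, \ldots, l_K$, where $l_k \in [1, \ldots, n_k]$ denotes a Gaussian index of the $k$th source. The mean vector and covariance matrix of the $m$th Gaussian are denoted by $\boldsymbol{\mu}_m = [\mu_{1,l_1}, \ldots, \mu_{K,l_K}]^T$ and $\mathbf{C}_m = \mathrm{diag}(\sigma^2_{1,l_1}, \ldots, \sigma^2_{K,l_K})$, respectively. The vector of unknown parameters of the joint PDF is denoted by $\boldsymbol{\theta}^{(s)} = \{w_m, \boldsymbol{\mu}_m, \mathbf{C}_m\}_{m=1}^{M}$. Equation (2.4) implies that the joint PDF of the sources is a multivariate GMM with diagonal covariance matrices.

2.1.2 SENSOR DISTRIBUTION MODEL

In this subsection, the generative model of the observation signals at time instance $t$ is utilized in order to derive an expression for their joint PDF. Fig. 1 depicts a graphical model corresponding to the generation process of an observation at time instance $t$. According to this generative model, at every time instance a hidden indication vector, $\mathbf{y}_t = [y_{t,1}, \ldots, y_{t,M}]^T$, which indicates the generating Gaussian of $\mathbf{s}_t$, is randomized by the following discrete PDF:

$$f_{\mathbf{y}}(\mathbf{y}_t) = \sum_{m=1}^{M} w_m\, \delta(y_{t,m} - 1), \tag{2.5}$$

where $\delta(\cdot)$ denotes the Dirac delta function and

$$y_{t,m} = \begin{cases} 1, & \text{if } \mathbf{s}_t \text{ is generated by the } m\text{th Gaussian} \\ 0, & \text{otherwise.} \end{cases} \tag{2.6}$$

Hidden values of $\mathbf{s}_t$ are then mixed by the mixing matrix $\mathbf{A}$ and an observation vector $\mathbf{x}_t$ is formed. According to the Bayes theorem,

$$f_{\mathbf{x}}(\mathbf{x}_t; \boldsymbol{\theta}^{(x)}) = E_{\mathbf{y}}\big[f_{\mathbf{x}|\mathbf{y}}(\mathbf{x}_t \mid \mathbf{y}_t; \boldsymbol{\theta}^{(x)})\big], \tag{2.7}$$

where $E_{\mathbf{y}}$ denotes the expectation operator w.r.t. the random vector $\mathbf{y}_t$, and the vector of unknown distribution parameters of the observation signals is denoted by

$$\boldsymbol{\theta}^{(x)} = \{w_m, \mathbf{A}\boldsymbol{\mu}_m, \mathbf{A}\mathbf{C}_m\mathbf{A}^H\}_{m=1}^{M}. \tag{2.8}$$

Therefore, the PDF of $\mathbf{x}_t$ is given by

$$f_{\mathbf{x}}(\mathbf{x}_t; \boldsymbol{\theta}^{(x)}) = \sum_{m=1}^{M} w_m\, f_{\mathbf{x}|\mathbf{y}}(\mathbf{x}_t \mid y_{t,m} = 1; \boldsymbol{\theta}^{(x)}), \tag{2.9}$$

where, according to the generative model of the observation signals,

$$f_{\mathbf{x}|\mathbf{y}}(\mathbf{x}_t \mid y_{t,m} = 1; \boldsymbol{\theta}^{(x)}) = \Phi(\mathbf{x}_t; \mathbf{A}\boldsymbol{\mu}_m, \mathbf{A}\mathbf{C}_m\mathbf{A}^H). \tag{2.10}$$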
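The construction in (2.4) and the parameter mapping in (2.8) can be sketched as follows: every combination of one Gaussian per source yields one joint component, and mixing by `A` turns each diagonal covariance into a generally nondiagonal one. The sketch is real-valued (so the Hermitian transpose reduces to a transpose), and the two univariate GMMs and the matrix `A` are hypothetical values chosen only for illustration.

```python
from itertools import product
from math import prod

def joint_gmm(sources):
    """Build the joint source GMM of (2.4).

    sources: list of K univariate GMMs, each a list of (weight, mean, variance).
    Returns a list of (w_m, mu_m, diagonal of C_m), one per index combination.
    """
    components = []
    for combo in product(*sources):
        w = prod(c[0] for c in combo)   # w_m: product of the chosen weights
        mu = [c[1] for c in combo]      # mu_m: stacked means
        var = [c[2] for c in combo]     # diagonal of C_m: stacked variances
        components.append((w, mu, var))
    return components

def mix_component(A, mu, var):
    """Map one source component to the sensor domain, as in (2.8):
    eta_m = A mu_m and R_m = A diag(var) A^T (real-valued case)."""
    L, K = len(A), len(A[0])
    eta = [sum(A[i][k] * mu[k] for k in range(K)) for i in range(L)]
    R = [[sum(A[i][k] * var[k] * A[j][k] for k in range(K)) for j in range(L)]
         for i in range(L)]
    return eta, R

# Two hypothetical univariate GMMs of order 2: (weight, mean, variance).
s1 = [(0.3, -5.0, 1.0), (0.7, 5.0, 2.0)]
s2 = [(0.2, 0.0, 1.0), (0.8, 4.0, 3.0)]
joint = joint_gmm([s1, s2])             # M = 2 * 2 = 4 components

A = [[1.0, 2.0], [0.0, 1.0]]            # hypothetical mixing matrix
sensor = [(w,) + mix_component(A, mu, var) for w, mu, var in joint]
```

The number of joint components grows as the product of the univariate orders, which is the exponential growth in the number of sources mentioned in the introduction.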

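Thus, the joint PDF of the observation signals is also a GMM with nondiagonal covariance matrices, as formulated in the following equation:

$$f_{\mathbf{x}}(\mathbf{x}_t; \boldsymbol{\theta}^{(x)}) = \sum_{m=1}^{M} w_m\, \Phi(\mathbf{x}_t; \boldsymbol{\eta}_m, \mathbf{R}_m), \tag{2.11}$$

where $\boldsymbol{\eta}_m = \mathbf{A}\boldsymbol{\mu}_m$ and $\mathbf{R}_m = \mathbf{A}\mathbf{C}_m\mathbf{A}^H$.

Fig. 1. The generative model of the observation signals at time instance t.

The generation model of the source signals implies that at every time instance the source vector, $\mathbf{s}_t$, may be generated by a different set of Gaussians. Therefore, $\mathbf{s}_t$ and $\mathbf{x}_t$ may be viewed as non-Gaussian and/or nonstationary multivariate random processes.

2.2 SOLUTION OF THE BSS PROBLEM

In this section, two novel techniques for solving the BSS problem, under the assumption of an equal number of sensors and sources ($L = K$), are derived. Direct maximization of the log-likelihood of $\mathbf{B}$ is analytically cumbersome. Therefore, the log-likelihood function is maximized w.r.t. $\mathbf{B}$ in two separate steps. In the first step, a tight lower bound on the log-likelihood of a function of $\mathbf{B}$ and $\boldsymbol{\theta}^{(s)} = \{w_m, \boldsymbol{\mu}_m, \mathbf{C}_m\}_{m=1}^{M}$ with the observations is obtained by applying the EM algorithm for GMM parameter estimation []. In the second step, the obtained tight lower bound is maximized w.r.t. $\mathbf{B}$ and $\boldsymbol{\theta}^{(s)}$. The basic difference between the two proposed algorithms stems from the manner in which the tight lower bound of the log-likelihood function is maximized w.r.t. the entries of $\mathbf{B}$.

2.2.1 DERIVATION OF A ML-BASED OBJECTIVE FUNCTION

In this subsection, a tight lower bound on the log-likelihood of a function of $\mathbf{B}$ and $\boldsymbol{\theta}^{(s)}$ is derived by applying the EM algorithm for GMM parameter estimation []. An ML-based objective function is derived from the obtained tight lower bound.

In the case of an equal number of sensors and sources and an invertible mixing matrix $\mathbf{A}$, (2.1) and (2.2) imply $\mathbf{B} = \mathbf{A}^{-1}$. Therefore, it is implied by (2.8) that the vector of unknown distribution parameters of the observation signals can be represented as a function of the separation matrix $\mathbf{B}$ and $\boldsymbol{\theta}^{(s)}$ in the following manner:

$$\boldsymbol{\theta}^{(x)} = \{w_m, \mathbf{B}^{-1}\boldsymbol{\mu}_m, \mathbf{B}^{-1}\mathbf{C}_m\mathbf{B}^{-H}\}_{m=1}^{M} = \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \tag{2.12}$$

where $\mathbf{B}^{-1}\boldsymbol{\mu}_m = \boldsymbol{\eta}_m$ and $\mathbf{B}^{-1}\mathbf{C}_m\mathbf{B}^{-H} = \mathbf{R}_m$ denote the mean vector and covariance matrix of the $m$th Gaussian component, respectively. Therefore,

$$\hat{\mathbf{B}} = \arg\max_{\mathbf{B}} \max_{\boldsymbol{\theta}^{(s)}} \log f\big(\mathbf{X}; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)})\big), \tag{2.13}$$

where $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_T]$ denotes the matrix of observation vectors. Direct maximization of $\log f(\mathbf{X}; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}))$ w.r.t. $\mathbf{B}$ and $\boldsymbol{\theta}^{(s)}$ is analytically cumbersome. Hence, its lower bound is maximized instead. In Appendix A, it is shown that by utilizing the EM algorithm for GMM parameter estimation [], a tight lower bound on $\log f(\mathbf{X}; \boldsymbol{\theta}^{(x)})$ is obtained and an ML-based estimate of $\mathbf{B}$ can be achieved. As proved in Appendix A, a tight lower bound on $\log f(\mathbf{X}; \boldsymbol{\theta}^{(x)})$ is given by:

$$\log f(\mathbf{X}; \boldsymbol{\theta}^{(x)}) \ge D(\boldsymbol{\theta}^{(x)}, \hat{\boldsymbol{\theta}}_L^{(x)}), \tag{2.14}$$

where

$$D(\boldsymbol{\theta}^{(x)}, \hat{\boldsymbol{\theta}}_L^{(x)}) = \log f(\mathbf{X}; \hat{\boldsymbol{\theta}}_L^{(x)}) + E_{\mathbf{Y}|\mathbf{X}; \hat{\boldsymbol{\theta}}_L^{(x)}}\big[\log f(\mathbf{X}, \mathbf{Y}; \boldsymbol{\theta}^{(x)})\big] - E_{\mathbf{Y}|\mathbf{X}; \hat{\boldsymbol{\theta}}_L^{(x)}}\big[\log f(\mathbf{X}, \mathbf{Y}; \hat{\boldsymbol{\theta}}_L^{(x)})\big]. \tag{2.15}$$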

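The distribution parameters of the observation signals, obtained in the final step of the EM algorithm for GMM parameter estimation [], are denoted by $\hat{\boldsymbol{\theta}}_L^{(x)} = \{\hat{w}_m, \hat{\boldsymbol{\eta}}_m, \hat{\mathbf{R}}_m\}_{m=1}^{M}$. The function $\log f(\mathbf{X}, \mathbf{Y}; \boldsymbol{\theta}^{(x)})$ is the joint log-likelihood of $\boldsymbol{\theta}^{(x)}$ with the matrices of observation vectors and of their corresponding hidden indication vectors, denoted by $\mathbf{X} = [\mathbf{x}_1, \ldots, \mathbf{x}_T]$ and $\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_T]$, respectively. Hence, it is implied by (2.12) and (2.14) that

$$\log f\big(\mathbf{X}; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)})\big) \ge D\big(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)}\big), \tag{2.16}$$

and therefore,

$$\hat{\mathbf{B}} = \arg\max_{\mathbf{B}} \max_{\boldsymbol{\theta}^{(s)}} D\big(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)}\big). \tag{2.17}$$

The following example examines graphically the relation between $\log f(\mathbf{X}; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}))$ and $D(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)})$. Twenty five hundred samples of two source signals were synthesized by the following GMM PDF: $f_{\mathbf{s}}(\mathbf{s}_t; \boldsymbol{\theta}^{(s)}) = \sum_{m=1}^{4} w_m \Phi(\mathbf{s}_t; \boldsymbol{\mu}_m, \mathbf{C}_m)$, where the univariate GMM order of each source was 2. The values of the mixing proportions, mean vectors and covariance matrices were: $w_m = 0.25$, $m = 1, \ldots, 4$, T µ = [ ], µ = [ 5,0] 5,5 T, µ = [ 5, 5] T, = [ 5,0] 3 T µ 4 C, diag 0,5, and = ( = diag ( C 0,, C 3 = diag ( 5,5 and C4 = diag ( 5,, respectively. The source signals were mixed by a unitary mixing matrix

$$\mathbf{A} = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix},$$

where $\alpha = 45^\circ$. The distribution parameters of the mixed source signals were estimated via the greedy EM algorithm for GMM parameter estimation []. Due to the fact that in this example $\mathbf{A}$ is unitary, the separation matrix $\mathbf{B}$ is of the form

$$\mathbf{B}(\beta) = \begin{bmatrix} \cos\beta & \sin\beta \\ -\sin\beta & \cos\beta \end{bmatrix}.$$

Hence, $\log f(\mathbf{X}; \rho(\mathbf{B}(\beta), \boldsymbol{\theta}^{(s)}))$ and $D(\rho(\mathbf{B}(\beta), \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)})$ can be sketched as a function of $\beta$, as depicted in Fig. 2. According to this figure, one can observe that $D(\rho(\mathbf{B}(\beta), \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)})$ is a tight lower bound on $\log f(\mathbf{X}; \rho(\mathbf{B}(\beta), \boldsymbol{\theta}^{(s)}))$ and both functions are maximized for $\beta = 45^\circ$.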
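The bound of (2.14) and (2.15) can be checked numerically on a toy problem. The sketch below is a simplification under stated assumptions: it uses a real-valued, one-dimensional, two-component GMM instead of the multivariate proper complex model, and the data and parameter values are hypothetical. It evaluates the gap between the log-likelihood and the bound while one mean is shifted away from the reference parameters.

```python
from math import exp, log, pi, sqrt

def normal_pdf(x, mu, var):
    """Real-valued univariate Gaussian density."""
    return exp(-(x - mu) ** 2 / (2.0 * var)) / sqrt(2.0 * pi * var)

def log_likelihood(xs, theta):
    """log f(X; theta) for a GMM given as a list of (w, mu, var)."""
    return sum(log(sum(w * normal_pdf(x, mu, var) for w, mu, var in theta))
               for x in xs)

def responsibilities(xs, theta):
    """Posterior probability of each component for each sample."""
    gammas = []
    for x in xs:
        joint = [w * normal_pdf(x, mu, var) for w, mu, var in theta]
        total = sum(joint)
        gammas.append([j / total for j in joint])
    return gammas

def expected_complete_ll(xs, theta, gammas):
    """E[log f(X, Y; theta)] with the expectation taken under the given gammas."""
    return sum(g * log(w * normal_pdf(x, mu, var))
               for x, row in zip(xs, gammas)
               for g, (w, mu, var) in zip(row, theta))

def lower_bound(xs, theta, theta_ref):
    """D(theta, theta_ref), following the structure of (2.15)."""
    gammas = responsibilities(xs, theta_ref)
    return (log_likelihood(xs, theta_ref)
            + expected_complete_ll(xs, theta, gammas)
            - expected_complete_ll(xs, theta_ref, gammas))

# Hypothetical data and reference parameters (playing the role of the EM output).
xs = [-2.1, -1.7, -0.3, 0.4, 1.9, 2.2, 2.8]
theta_ref = [(0.5, -1.5, 1.0), (0.5, 2.0, 1.0)]

# Shift the first mean and record log-likelihood minus bound for each shift.
shifts = [-1.0, -0.5, 0.0, 0.5, 1.0]
gaps = []
for d in shifts:
    theta = [(0.5, -1.5 + d, 1.0), (0.5, 2.0, 1.0)]
    gaps.append(log_likelihood(xs, theta) - lower_bound(xs, theta, theta_ref))
```

The gap is nonnegative for every shift and vanishes at zero shift, which is the sense in which the bound touches the log-likelihood at the reference parameters.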

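Fig. 2. The log-likelihood function of $\rho(\mathbf{B}(\beta), \boldsymbol{\theta}^{(s)})$ (solid curve) with the observations matrix, $\mathbf{X}$, and its tight lower bound (dashed curve) as a function of $\beta$.

The first and last terms in the r.h.s. of (2.15) are independent of $\boldsymbol{\theta}^{(x)}$ and therefore $\mathbf{B}$-independent, while its middle term is $\boldsymbol{\theta}^{(x)}$-dependent and therefore $\mathbf{B}$-dependent. Thus, by normalizing the middle term of (2.15) by a factor of $-1/T$, where $T$ is the number of observation vectors, estimation of $\mathbf{B}$ can be performed in the following manner:

$$\hat{\mathbf{B}} = \arg\min_{\mathbf{B}} \min_{\boldsymbol{\theta}^{(s)}} Q\big(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)}\big), \tag{2.18}$$

where the objective function,

$$Q\big(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)}\big) = -\frac{1}{T} E_{\mathbf{Y}|\mathbf{X}; \hat{\boldsymbol{\theta}}_L^{(x)}}\big[\log f\big(\mathbf{X}, \mathbf{Y}; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)})\big)\big], \tag{2.19}$$

is the normalized conditional expectation of $\log f(\mathbf{X}, \mathbf{Y}; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}))$.

In the following, a strict analytical expression of $Q(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)})$ is derived. According to (2.5), (2.6) and (2.9), the joint PDF of $\mathbf{x}_t$ and $\mathbf{y}_t$ is given by

$$f\big(\mathbf{x}_t, \mathbf{y}_t; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)})\big) = \prod_{m=1}^{M} \big[w_m\, \Phi(\mathbf{x}_t; \boldsymbol{\eta}_m, \mathbf{R}_m)\big]^{y_{t,m}}. \tag{2.20}$$

Assuming that $\mathbf{x}_1, \ldots, \mathbf{x}_T$ and $\mathbf{y}_1, \ldots, \mathbf{y}_T$ are temporally independent, the joint log-likelihood of $\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)})$ with $\mathbf{X}$ and $\mathbf{Y}$ is given by

$$\log f\big(\mathbf{X}, \mathbf{Y}; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)})\big) = \sum_{t=1}^{T} \sum_{m=1}^{M} y_{t,m} \log\big(w_m\, \Phi(\mathbf{x}_t; \boldsymbol{\eta}_m, \mathbf{R}_m)\big). \tag{2.21}$$

The conditional expectation of (2.21) is

$$E_{\mathbf{Y}|\mathbf{X}; \hat{\boldsymbol{\theta}}_L^{(x)}}\big[\log f\big(\mathbf{X}, \mathbf{Y}; \rho(\mathbf{B}, \boldsymbol{\theta}^{(s)})\big)\big] = \sum_{t=1}^{T} \sum_{m=1}^{M} \gamma_{t,m} \log\big(w_m\, \Phi(\mathbf{x}_t; \boldsymbol{\eta}_m, \mathbf{R}_m)\big), \tag{2.22}$$

where

$$\gamma_{t,m} = E_{\mathbf{Y}|\mathbf{X}; \hat{\boldsymbol{\theta}}_L^{(x)}}[y_{t,m}]. \tag{2.23}$$

Since $y_{t,m}$ can have only discrete values of 0 and 1, $\gamma_{t,m}$ can be calculated in the following manner:

$$\gamma_{t,m} = 0 \cdot P\big(y_{t,m} = 0 \mid \mathbf{x}_t; \hat{\boldsymbol{\theta}}_L^{(x)}\big) + 1 \cdot P\big(y_{t,m} = 1 \mid \mathbf{x}_t; \hat{\boldsymbol{\theta}}_L^{(x)}\big). \tag{2.24}$$

Therefore, applying the Bayes theorem, $\gamma_{t,m}$ is given by

$$\gamma_{t,m} = P\big(y_{t,m} = 1 \mid \mathbf{x}_t; \hat{\boldsymbol{\theta}}_L^{(x)}\big) = \frac{P(y_{t,m} = 1)\, f\big(\mathbf{x}_t \mid y_{t,m} = 1; \hat{\boldsymbol{\theta}}_L^{(x)}\big)}{f\big(\mathbf{x}_t; \hat{\boldsymbol{\theta}}_L^{(x)}\big)} = \frac{\hat{w}_m\, \Phi(\mathbf{x}_t; \hat{\boldsymbol{\eta}}_m, \hat{\mathbf{R}}_m)}{\sum_{m'=1}^{M} \hat{w}_{m'}\, \Phi(\mathbf{x}_t; \hat{\boldsymbol{\eta}}_{m'}, \hat{\mathbf{R}}_{m'})}. \tag{2.25}$$

Using (2.12) and (2.22), (2.19) can be rewritten as

$$Q\big(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)}\big) = -\frac{1}{T} \sum_{t=1}^{T} \sum_{m=1}^{M} \gamma_{t,m} \log\big(w_m\, \Phi(\mathbf{x}_t; \mathbf{B}^{-1}\boldsymbol{\mu}_m, \mathbf{B}^{-1}\mathbf{C}_m\mathbf{B}^{-H})\big). \tag{2.26}$$

According to (2.26), one can notice that in order to estimate $\mathbf{B}$, a structure on $\boldsymbol{\eta}_m = \mathbf{B}^{-1}\boldsymbol{\mu}_m$ and $\mathbf{R}_m = \mathbf{B}^{-1}\mathbf{C}_m\mathbf{B}^{-H}$ is imposed. Since $\{\hat{\boldsymbol{\eta}}_m\}_{m=1}^{M}$ and $\{\hat{\mathbf{R}}_m\}_{m=1}^{M}$ in $\hat{\boldsymbol{\theta}}_L^{(x)}$ are estimated without this constraint, there are in general no $\mathbf{B}$, $\{\boldsymbol{\mu}_m, \mathbf{C}_m\}_{m=1}^{M}$ such that $\hat{\boldsymbol{\eta}}_m = \mathbf{B}^{-1}\boldsymbol{\mu}_m$ and $\hat{\mathbf{R}}_m = \mathbf{B}^{-1}\mathbf{C}_m\mathbf{B}^{-H}$. Therefore, the minimum of $Q(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)})$ w.r.t. $\mathbf{B}$, $\{\boldsymbol{\mu}_m, \mathbf{C}_m\}_{m=1}^{M}$ is not strictly attained. It is noted that in the asymptotic case, where $\hat{\boldsymbol{\theta}}_L^{(x)} = \boldsymbol{\theta}^{(x)}$, this minimum is attained in a strict manner.

Since $Q(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)})$ is minimized w.r.t. the mixing proportions for $\{w_m = \hat{w}_m\}_{m=1}^{M}$, it is obvious that

$$\min_{\{w_m\}_{m=1}^{M}} Q\big(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)}\big) = -\frac{1}{T} \sum_{t=1}^{T} \sum_{m=1}^{M} \gamma_{t,m} \log\big(\hat{w}_m\, \Phi(\mathbf{x}_t; \mathbf{B}^{-1}\boldsymbol{\mu}_m, \mathbf{B}^{-1}\mathbf{C}_m\mathbf{B}^{-H})\big). \tag{2.27}$$

In Appendix B, it is shown that

$$Q\big(\rho(\mathbf{B}, \boldsymbol{\theta}^{(s)}), \hat{\boldsymbol{\theta}}_L^{(x)}\big) = \sum_{m=1}^{M} \hat{w}_m \Big[ KL_{nor}\big(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H \,\|\, \mathbf{C}_m\big) + (\boldsymbol{\mu}_m - \mathbf{B}\hat{\boldsymbol{\eta}}_m)^H \mathbf{C}_m^{-1} (\boldsymbol{\mu}_m - \mathbf{B}\hat{\boldsymbol{\eta}}_m) \Big] + \mathrm{const}. \tag{2.28}$$

The term $KL_{nor}(\boldsymbol{\Sigma}_1 \| \boldsymbol{\Sigma}_2)$ is the Kullback-Leibler divergence [0] of $N^c(\mathbf{0}, \boldsymbol{\Sigma}_1)$ from $N^c(\mathbf{0}, \boldsymbol{\Sigma}_2)$, where $N^c(\cdot, \cdot)$ denotes the proper complex Gaussian PDF. The minimum of (2.28) w.r.t. $\boldsymbol{\mu}_m$ is obtained by setting $\boldsymbol{\mu}_m = \mathbf{B}\hat{\boldsymbol{\eta}}_m$. Therefore, (2.28) can be reduced to the following form:

$$Q\big(\mathbf{B}, \{\mathbf{C}_m\}_{m=1}^{M}, \hat{\boldsymbol{\theta}}_L^{(x)}\big) = \sum_{m=1}^{M} \hat{w}_m\, KL_{nor}\big(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H \,\|\, \mathbf{C}_m\big), \tag{2.29}$$

where $\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H$ is positive semi-definite and $\mathbf{C}_m$ is diagonal. The Pythagorean property of the Kullback-Leibler divergence [], [5] implies that (2.29) can be decomposed in the following manner:

$$Q\big(\mathbf{B}, \{\mathbf{C}_m\}_{m=1}^{M}, \hat{\boldsymbol{\theta}}_L^{(x)}\big) = \sum_{m=1}^{M} \hat{w}_m \Big[ KL_{nor}\big(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H \,\|\, \mathrm{DIAG}(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H)\big) + KL_{nor}\big(\mathrm{DIAG}(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H) \,\|\, \mathbf{C}_m\big) \Big], \tag{2.30}$$

where $\mathrm{DIAG}(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H)$ denotes a diagonal matrix with the same diagonal elements as $\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H$. Thus, the objective function (2.30) is minimized for a fixed value of $\mathbf{B}$ when $\mathbf{C}_m = \mathrm{DIAG}(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H)$, and the attained minimum is

$$Q\big(\mathbf{B}, \hat{\boldsymbol{\theta}}_L^{(x)}\big) = \sum_{m=1}^{M} \hat{w}_m\, KL_{nor}\big(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H \,\|\, \mathrm{DIAG}(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H)\big). \tag{2.31}$$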
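The Pythagorean decomposition used in (2.30) can be verified numerically. The sketch below assumes the common zero-mean Gaussian form $KL(\boldsymbol{\Sigma}_1 \| \boldsymbol{\Sigma}_2) = \mathrm{tr}(\boldsymbol{\Sigma}_2^{-1}\boldsymbol{\Sigma}_1) - \log\det(\boldsymbol{\Sigma}_2^{-1}\boldsymbol{\Sigma}_1) - K$, which may differ from the normalization used in this work by a constant factor, and it is restricted to real 2x2 matrices with a diagonal second argument. The matrices `Sigma` and `C` are hypothetical.

```python
from math import log

def kl_norm(S1, S2_diag):
    """tr(S2^-1 S1) - log det(S2^-1 S1) - K for a real symmetric 2x2 matrix S1
    and a diagonal matrix S2 given by its two diagonal entries (K = 2)."""
    (a, b), (_, d) = S1
    c1, c2 = S2_diag
    det_s1 = a * d - b * b
    return a / c1 + d / c2 - log(det_s1 / (c1 * c2)) - 2.0

# Hypothetical positive-definite matrix (stands in for B R_m B^H) and diagonal C_m.
Sigma = [[4.0, 1.5], [1.5, 3.0]]
C = (2.0, 5.0)

diag_Sigma = (Sigma[0][0], Sigma[1][1])

# Left-hand side: divergence from Sigma to the diagonal matrix C.
lhs = kl_norm(Sigma, C)

# First term of the decomposition: deviation of Sigma from its own diagonal.
off_diagonality = kl_norm(Sigma, diag_Sigma)

# Second term: mismatch between the diagonal of Sigma and C.
diag_mismatch = kl_norm([[diag_Sigma[0], 0.0], [0.0, diag_Sigma[1]]], C)
```

Only the second term depends on `C`, and it vanishes when `C` equals the diagonal of `Sigma`, which is why the minimum over $\mathbf{C}_m$ leaves the pure off-diagonality term of (2.31).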

Therefore, we conclude that under the GMM assumption, the objective function $Q$ measures the deviation of $\{\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H\}_{m=1}^{M}$ from diagonality. The same result was obtained in [] for the case of a block-Gaussian model. According to this model, the observation signals are partitioned into consecutive quasi-stationary segments, where the relative proportion and covariance matrix of the $m$th quasi-stationary segment are denoted by $\hat{w}_m$ and $\hat{\mathbf{R}}_m$, respectively.

In Appendix C, it is shown that the objective function (2.31) can be rewritten as follows:

$$Q\big(\mathbf{B}, \hat{\boldsymbol{\theta}}_L^{(x)}\big) = \sum_{m=1}^{M} \hat{w}_m \Big[ \log\det\big(\mathrm{DIAG}(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H)\big) - \log\det\big(\mathbf{B}\hat{\mathbf{R}}_m\mathbf{B}^H\big) \Big], \tag{2.32}$$

where $\det(\cdot)$ denotes the determinant operator. In the following, minimization of $Q$ via joint diagonalization with nonorthogonal and orthogonal transformations is described.

2.2.2 ESTIMATION OF THE SEPARATION MATRIX VIA JOINT DIAGONALIZATION WITH NONORTHOGONAL TRANSFORMATIONS

The minimum of $Q(\mathbf{B}, \hat{\boldsymbol{\theta}}_L^{(x)})$ is attained for a matrix $\mathbf{B}$ which jointly diagonalizes the estimated GMM covariance matrices. In this work, two nonorthogonal approximate joint diagonalization algorithms which minimize $Q(\mathbf{B}, \hat{\boldsymbol{\theta}}_L^{(x)})$ w.r.t. $\mathbf{B}$ are evaluated. The first algorithm, offered by Pham [], [5], is similar to the Jacobi method [] and applies successive transformations on each pair of distinct rows of $\mathbf{B}$, with the exception that the rows of $\mathbf{B}$ are not constrained to be orthogonal. BSS using Pham's algorithm is denoted as the GMMPHAM algorithm. The second algorithm is a novel technique for approximate joint diagonalization, offered in this work and denoted as the SVDJD algorithm. According to this method, the columns of $\mathbf{B}$ are estimated separately using iterative singular value decompositions of a weighted sum of the matrices to be diagonalized. A full description of this algorithm is given in Chapter 3. BSS using the SVDJD method is denoted as the GMMSVDJD algorithm.
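To illustrate how the criterion (2.32) singles out a joint diagonalizer, the following sketch evaluates it over a grid of rotation angles for a real 2x2 case. This is a brute-force sweep for illustration only; it is neither Pham's algorithm nor the SVDJD algorithm, and the weights and diagonal covariances are hypothetical. The covariances are built as $\mathbf{R}_m = \mathbf{A}\mathbf{C}_m\mathbf{A}^T$ with a 45 degree rotation `A`, so the exact diagonalizer in the sweep range is the rotation by 45 degrees.

```python
from math import cos, sin, log, radians

def matmul(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(M):
    return [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]

def rotation(deg):
    a = radians(deg)
    return [[cos(a), -sin(a)], [sin(a), cos(a)]]

def jd_objective(B, weights, covs):
    """Sum over m of w_m [log det DIAG(B R_m B^T) - log det(B R_m B^T)], as in (2.32)."""
    total = 0.0
    for w, R in zip(weights, covs):
        M = matmul(matmul(B, R), transpose(B))
        det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
        total += w * (log(M[0][0] * M[1][1]) - log(det))
    return total

# Hypothetical diagonal source covariances C_m and equal weights.
weights = [0.25, 0.25, 0.25, 0.25]
C_list = [[[10.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 10.0]],
          [[5.0, 0.0], [0.0, 2.0]], [[3.0, 0.0], [0.0, 8.0]]]

# Sensor-domain covariances R_m = A C_m A^T for a 45 degree rotation A.
A = rotation(45.0)
covs = [matmul(matmul(A, C), transpose(A)) for C in C_list]

# Sweep B(beta) = rotation(beta)^T and keep the angle with the smallest objective.
betas = list(range(0, 91, 5))
values = [jd_objective(transpose(rotation(b)), weights, covs) for b in betas]
best_beta = betas[values.index(min(values))]
```

By Hadamard's inequality each bracketed term is nonnegative and equals zero only when the transformed matrix is diagonal, so the objective acts as a weighted measure of off-diagonality across the whole set.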
In summary, solution of the BSS problem via nonorthogonal joint diagonalization comprises the following steps:

1) Estimate the distribution parameters of the observation signals via the EM algorithm for GMM parameter estimation []. If the GMM order is unknown, it may be determined using BIC [7], MDL [8] or AIC [9].
2) Estimate $B$ by applying nonorthogonal joint diagonalization algorithms on the estimated GMM covariance matrices.

The following example illustrates the step-by-step implementation of the algorithm. Twenty-five hundred samples of two source signals were synthesized by the following GMM PDF:

$$f_s(s_t;\theta^{(s)}) = \sum_{m=1}^{4} w_m\,\Phi(s_t;\mu_m,C_m),$$

where the univariate GMM order of each source was 2. The values of the mixing proportions, mean vectors and covariance matrices were: w = 0.06, w = 0.4, w 3 = 0.4, , w = = [ ] T T = [ ] = [ ] T T µ,µ,µ,µ = [ ], = diag ( 5, 5 C = diag ( 0,,C 3 = ( 3, and C 4 = diag ( 3, 5,5 3 5, 5 4 5,5 C 0,,, respectively. The scatter plot of the independent source signals is depicted in Fig. 3.a. The source signals were mixed by 5 3 A = 7, according to (.).

Following the first step of the algorithm, the distribution parameters of the mixed source signals were estimated by applying the greedy EM algorithm for GMM parameter estimation [], where the GMM order was set to 4. The estimated mean vectors and covariance matrices of the mixed sources are depicted in Fig. 3.b on top of the scatter plot of the observations. Following the second step of the algorithm, the separation matrix was estimated by applying the joint diagonalization algorithm, proposed by Pham [], [5], on the estimated GMM covariance matrices. The resulting estimated separation matrix was $\hat{B}$ = . The source signals were estimated according to (.) and their scatter plot is depicted in Fig. 3.c. One can observe that, due to the scaling and permutation ambiguities of the BSS problem, the estimated source signals are scaled and permuted.
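The scaling and permutation ambiguities noted in the example mean that a separation matrix is correct whenever the product $\hat{B}A$ has a single dominant entry in every row and column. A minimal sketch of such a check follows; the helper name `is_scaled_permutation`, the tolerance and the matrices are hypothetical and are not the values used in the thesis example.

```python
import numpy as np

def is_scaled_permutation(G, tol=0.1):
    """True if every row and every column of G has exactly one dominant entry."""
    P = np.abs(G)
    P = P / P.max(axis=1, keepdims=True)      # normalise each row by its largest entry
    dominant = P > tol
    return bool(np.all(dominant.sum(axis=0) == 1) and np.all(dominant.sum(axis=1) == 1))

A = np.array([[2.0, 1.0], [1.0, 3.0]])              # hypothetical mixing matrix
perm_scale = np.array([[0.0, -0.5], [4.0, 0.0]])    # permutation with arbitrary scaling
B_hat = perm_scale @ np.linalg.inv(A)               # still a valid separation matrix

separated = is_scaled_permutation(B_hat @ A)        # sources recovered up to order and scale
unseparated = is_scaled_permutation(np.eye(2) @ A)  # identity leaves the mixture intact
```

Here `separated` is true although `B_hat` differs from the inverse of `A`, which is the behaviour visible in Fig. 3.c.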

Fig. 3. (a) Scatter plot of the source signals. (b) Scatter plot of the mixed sources. The ellipses represent the estimated covariance matrices. (c) Scatter plot of the estimated source signals.


2.2.3 ESTIMATION OF THE SEPARATION MATRIX VIA JOINT DIAGONALIZATION WITH ORTHOGONAL TRANSFORMATIONS

Under the assumption of white source signals (i.e. $\mathrm{cov}(s_t) = C = I$, $t = 1,\ldots,T$), this method implies prewhitening of the observation signals, so their generative model, described in Subsection .., is extended. According to this extended model, $x_t$ is prewhitened by a spatial whitening matrix, $W$, and $z_t = Wx_t$ is formed. Thus, under the assumption of an equal number of sensors and sources ($L = K$), it is implied by (.) and (.) that

$$z_t = Wx_t = WAs_t = \tilde{A}s_t. \tag{2.33}$$

Therefore, according to (.) and (2.33),

$$\hat{s}_t = Bx_t = BW^{-1}z_t = \tilde{B}z_t. \tag{2.34}$$

Under the conditions mentioned above, it is shown that $\tilde{B}$ is a rotation matrix.

Claim .: Let $\mathrm{cov}(s_t) = C = I$, $t = 1,\ldots,T$. Then prewhitening of the observation signals implies that the separation matrix $\tilde{B}$ is a rotation matrix.
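The claim can be checked numerically before it is proved. The sketch below is an illustration under assumed values (a random real $3 \times 3$ mixing matrix, and an eigendecomposition-based whitening matrix); it confirms that after whitening, the effective mixing matrix and the corresponding separation matrix are unitary (orthogonal in the real case).

```python
import numpy as np

rng = np.random.default_rng(1)
K = 3
A = rng.standard_normal((K, K))        # hypothetical square mixing matrix

# With cov(s_t) = I, the observation covariance is psi = A A^H.
psi = A @ A.conj().T

# Whitening matrix from the eigendecomposition psi = U diag(d) U^H:
# W = diag(d)^(-1/2) U^H, so that W psi W^H = I.
d, U = np.linalg.eigh(psi)
W = np.diag(d ** -0.5) @ U.conj().T

A_tilde = W @ A                        # mixing matrix seen by the whitened data
B_tilde = np.linalg.inv(A_tilde)       # corresponding separation matrix

unitary_mixing = np.allclose(A_tilde @ A_tilde.conj().T, np.eye(K), atol=1e-6)
unitary_separation = np.allclose(B_tilde, A_tilde.conj().T, atol=1e-6)
```

Because the separation matrix is unitary after whitening, the search can be restricted to orthogonal joint diagonalizers, which is what this subsection exploits.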

Proof .: The singular value decomposition of $A$ is given by

$$A = \underbrace{U}_{\text{orthonormal}}\;\underbrace{D}_{\text{diagonal}}\;\underbrace{V^{H}}_{\text{orthonormal}}. \tag{2.35}$$

According to (.), the covariance matrix of an observation vector at time instance $t$ is

$$\mathrm{cov}(x_t) = ACA^{H} = \psi. \tag{2.36}$$

Substituting (2.35) into (2.36) implies that $\psi = UD^{2}U^{H}$, so the whitening matrix of $x_t$ is given by

$$W = D^{-1}U^{H}. \tag{2.37}$$

Thus, it is implied by (2.33), (2.35) and (2.37) that

$$z_t = WAs_t = V^{H}s_t. \tag{2.38}$$

Therefore, according to (2.34) and (2.38),

$$\tilde{B} = V, \tag{2.39}$$

where $V$ is a rotation matrix.

It is implied by (2.33) that by replacing $A$ with $\tilde{A} = WA$, the generation process of the prewhitened observation signals can be embedded in the generative model described in Subsection ... Hence, it can be shown in the same manner described in Subsection .. that the joint PDF of the prewhitened observation signals is also GMM with nondiagonal covariance matrices, as expressed below.

$$f_z(z_t;\theta^{(z)}) = \sum_{m=1}^{M} w_m\,\Phi(z_t;\eta_m,R_m). \tag{2.40}$$

The vector of unknown distribution parameters of the prewhitened observation signals is denoted by

$$\theta^{(z)} = \{w_m,\eta_m,R_m\}_{m=1}^{M}, \tag{2.41}$$

where $\tilde{A}\mu_m = \eta_m$ and $\tilde{A}C_m\tilde{A}^{H} = R_m$. Hence, $Q(\tilde{B},\hat{\theta}^{(z)}_{ML})$ is derived in the same manner as $Q(B,\hat{\theta}^{(x)}_{ML})$ in (2.32), such that

$$Q(\tilde{B},\hat{\theta}^{(z)}_{ML}) = \sum_{m=1}^{M}\hat{w}_m\left[\log\det\mathrm{DIAG}(\tilde{B}\hat{R}_m\tilde{B}^{H}) - \log\det(\tilde{B}\hat{R}_m\tilde{B}^{H})\right]. \tag{2.42}$$

The mixing proportions and covariance matrices of the prewhitened observation signals, estimated in the final step of the EM algorithm for GMM parameter estimation [], are denoted by $\{\hat{w}_m\}_{m=1}^{M}$ and $\{\hat{R}_m\}_{m=1}^{M}$, respectively. The minimum of $Q(\tilde{B},\hat{\theta}^{(z)}_{ML})$ is attained for a unitary matrix $\tilde{B}$ which jointly diagonalizes the estimated GMM covariance matrices of the prewhitened sensor signals. An approximate joint diagonalization algorithm, proposed by Flury and Gautschi [3], which minimizes (2.42) w.r.t. $\tilde{B}$, is applied in order to estimate the separation matrix. Similarly to the Jacobi method [], this algorithm applies successive orthogonal rotations on each pair of distinct columns of $\tilde{B}$, where $\tilde{B}$ is constrained to be unitary. BSS using the FG method is denoted as the GFG algorithm.

In summary, solution of the BSS problem via orthogonal joint diagonalization comprises the following steps:

1) Estimate the observations' spatial whitening matrix $\hat{W}$.
2) Prewhiten the observation signals according to $z_t = \hat{W}x_t$.
3) Estimate the distribution parameters of the prewhitened observation signals via the EM algorithm for GMM parameter estimation []. If the GMM order is unknown, it may be determined using BIC [7], MDL [8] or AIC [9].
4) Estimate $\tilde{B}$ by applying the joint diagonalization algorithm, proposed by Flury and Gautschi [3], on the estimated GMM covariance matrices.
5) Estimate $B$ according to (2.34) by applying $\hat{B} = \hat{\tilde{B}}\hat{W}$.

2.3 MORE SENSORS THAN SOURCES

In this section, the case of a nonsquare mixing matrix $A$, i.e. the case in which the number of rows, denoted by $L$, is greater than the number of columns, denoted by $K$, is considered. The need for dimension reduction of the observation signals arises from the fact that the rank of the $L \times L$ covariance matrix of the observation signals, denoted by $R$, is $K$. The dimension reduction procedure is performed in the following steps:

1) Apply SVD of $R$, i.e. $R = U_R S_R U_R^{H}$, where $U_R$ is orthonormal and $S_R$ is diagonal.
2) Delete the columns of $U_R$ which correspond to zero diagonal elements of $S_R$, and create $\tilde{U}_R$ ($L \times K$).
3) Apply the linear transformation $\tilde{x}_t = \tilde{U}_R^{H}x_t$, $t = 1,\ldots,T$.

After applying these steps, the rank of $\tilde{R} = \mathrm{cov}(\tilde{x}_t)$ is full ($\mathrm{rank}(\tilde{R}) = K$). Due to dimension reduction, the generative model of the observation signals, described in Subsection .., is extended. According to this extended model, $x_t$ is left-multiplied by an orthonormal $K \times L$ decorrelation matrix, $\tilde{U}_R^{H}$, such that

$$\tilde{x}_t = \tilde{U}_R^{H}x_t = \tilde{U}_R^{H}As_t = \tilde{A}s_t, \tag{2.43}$$

where $\tilde{A}$ is a $K \times K$ matrix. Therefore, according to (.) and (2.43),

$$\hat{s}_t = \tilde{A}^{-1}\tilde{x}_t = \tilde{B}\tilde{x}_t. \tag{2.44}$$

It is implied by (2.43) that by replacing $A$ with $\tilde{A} = \tilde{U}_R^{H}A$, the generation process of $\tilde{x}_t$ can be embedded in the generative model described in Subsection ... Hence, in the same manner described in Section .., it can be shown that the PDF of $\tilde{x}_t$ is GMM with nondiagonal covariance matrices, as expressed by

$$f_{\tilde{x}}(\tilde{x}_t;\theta^{(\tilde{x})}) = \sum_{m=1}^{M} w_m\,\Phi(\tilde{x}_t;\eta_m,R_m). \tag{2.45}$$

The vector of unknown distribution parameters of the observation signals after dimension reduction is denoted by $\theta^{(\tilde{x})} = \{w_m,\eta_m,R_m\}_{m=1}^{M}$, where $\tilde{A}\mu_m = \eta_m$ and $\tilde{A}C_m\tilde{A}^{H} = R_m$. Hence, $Q(\tilde{B})$ is derived in the same manner as $Q(B)$, such that

$$Q(\tilde{B},\hat{\theta}^{(\tilde{x})}_{ML}) = \sum_{m=1}^{M}\hat{w}_m\left[\log\det\mathrm{DIAG}(\tilde{B}\hat{R}_m\tilde{B}^{H}) - \log\det(\tilde{B}\hat{R}_m\tilde{B}^{H})\right]. \tag{2.46}$$

The estimated mixing proportions and covariance matrices of the GMM of $\tilde{x}_t$ are denoted by $\{\hat{w}_m\}_{m=1}^{M}$ and $\{\hat{R}_m\}_{m=1}^{M}$, respectively. Hence, estimation of $\tilde{B}$ can be performed by utilizing the GJD or GFG algorithms, described above. According to (2.44), $\hat{s}_t = \tilde{B}\tilde{x}_t$. Since $\tilde{x}_t = \tilde{U}_R^{H}x_t$, then $\hat{s}_t = \tilde{B}\tilde{U}_R^{H}x_t = Bx_t$. Therefore, estimation of $B$ can be performed in the following manner:

$$\hat{B} = \hat{\tilde{B}}\tilde{U}_R^{H}. \tag{2.47}$$

2.4 SIMULATIONS

In this section, the separation performances of the GSVDJD, GPA, GFG, NIFA [5], JADE [3] and FastICA [6] algorithms are evaluated and compared by means of the interference-to-signal ratio (ISR). Calculation of the ISR is detailed in Appendix D. This section is organized as follows. In Subsection 2.4.1, the expected separation performances of the tested algorithms are evaluated using synthetic data versus skewness level of the sources, rotation angle of a unitary mixing matrix, source statistical distribution class, sample size, number of sources, and SNR level of the sensors. In each test, the expected separation performances are measured via the averaged ISR, obtained by averaging ISR values corresponding to the same set of mixtures. In Subsection 2.4.2, the performances of the compared algorithms in separating mixtures of real speech signals are evaluated.

The compared algorithms were operated under the following overall settings:

1) GMM parameter estimation in the GSVDJD, GPA and GFG algorithms was performed via the greedy EM algorithm for GMM parameter estimation []. In the greedy approach, the high dependence of the EM algorithm on initialization is overcome by optimal insertion of mixture components one after another. The number of EM iterations was set to 00;
2) Separation performance of the NIFA algorithm [5] was evaluated with 00 EM iterations. In each maximization step, 00 iterations with a learning rate factor (LRF) were used for updating the source distribution parameters and the separation matrix coefficients;
3) A kurtosis-based contrast function was used in the FastICA algorithm [6].

2.4.1 SYNTHETIC DATA

2.4.1.1 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF SKEWNESS LEVEL

The following trial compared the expected separation performances of the tested algorithms as a function of the skewness levels of the source signals. Two source signals were synthesized by the following GMM PDF:

$$f_s(s_t;\theta^{(s)}) = \sum_{m=1}^{3} w_m\,\Phi(s_t;\mu_m,C_m),$$

where the univariate GMM orders of the first and second sources were

1 and 3, respectively. The values of the mixing proportions, mean vectors and covariance matrices were: w = + p, w = w3 = p (,, µ = [ 5] T, µ [ 0,0 ] T, = [ ] 0, = µ 3 0, 5 T, C = 5,, diag ( = diag ( 5, 7 and C 3 = diag 5,0, respectively. According to the GMM parameters, one can notice that the first source is Gaussian and the second source is non-Gaussian. The skewness [3] level was controlled by adjusting the value of $p \in [0, 0.5]$ according to the following formula:

( p n 3 3 ( p ( ( 900 p 9 p 56 ( p ( p ( p ( p p 80 p 6795 p+ 35 γ = =, 3 3 ( p ( + (2.48)

where $m_n(p)$ denotes the $n$-th moment of the PDF of the non-Gaussian source signal, as a function of the adjustment parameter $p$. For each skewness level, 000 sets of source signals, containing T=000 samples each, were synthesized and mixed by a random mixing matrix, with elements drawn from the real standard normal distribution. A scatter plot of an arbitrary realization of mixed sources with skewness level of -0.9 is depicted in Fig. 4.a.

The compared algorithms were operated under the following settings:

1) The GMM order in the GJD and GFG algorithms was set to 3;
2) The PDF of each unobserved source signal was modeled in the NIFA algorithm by univariate GMMs of order 1 and 3.

Fig. 4.b depicts the averaged ISR of each algorithm versus the skewness level. It can be seen that the performances of the JADE and FastICA algorithms are skewness-dependent, due to the fact that the third-order cumulant (i.e. skewness) is not considered by these methods. In contrast, the GSVDJD, GPA, GFG and NIFA algorithms, which apply flexible source distribution modeling, are skewness-independent.
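The skewness of a univariate Gaussian mixture follows in closed form from its weights, means and variances, which is what makes it controllable through a single parameter such as $p$. The sketch below uses the standard definition (third central moment over the second central moment raised to 3/2); the function name `gmm_skewness` and the example parameters are hypothetical and are not the values of this trial.

```python
import numpy as np

def gmm_skewness(weights, means, variances):
    """Skewness of a univariate Gaussian mixture from its central moments."""
    w = np.asarray(weights, dtype=float)
    mu = np.asarray(means, dtype=float)
    var = np.asarray(variances, dtype=float)
    mean = np.sum(w * mu)
    d = mu - mean                              # component offsets from the overall mean
    m2 = np.sum(w * (var + d ** 2))            # second central moment
    m3 = np.sum(w * (3.0 * d * var + d ** 3))  # third central moment
    return float(m3 / m2 ** 1.5)

symmetric = gmm_skewness([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])   # symmetric mixture
right_skewed = gmm_skewness([0.8, 0.2], [0.0, 3.0], [1.0, 1.0]) # heavier right side
```

A symmetric mixture gives zero skewness, while shifting probability mass to one side gives a nonzero value, which is the effect the trial varies.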

Fig. 4. (a) Scatter plot of an arbitrary realization of mixed sources with skewness level of -0.9. The ellipses represent the estimated covariance matrices. (b) The averaged ISR of the JADE, FastICA, GPA, GSVDJD, GFG and NIFA algorithms versus skewness level.

2.4.1.2 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF ROTATION ANGLE

The following trial compared the expected separation performances of the tested algorithms as a function of the rotation angle of an orthonormal mixing matrix. Two source signals were synthesized by the same GMM used in the first trial, where the value of $p$ was set to $1/6$. The compared algorithms were operated under the same settings as in the first trial. For each rotation angle $\varphi \in [0^{\circ}, 180^{\circ}]$, 000 sets of source signals, containing T=000 samples each, were synthesized and mixed by the rotation matrix

$$A = \begin{bmatrix}\cos\varphi & -\sin\varphi\\ \sin\varphi & \cos\varphi\end{bmatrix}.$$

Fig. 5 depicts the behavior of the averaged ISR of each algorithm versus the rotation angle. In contrast to the JADE, FastICA, GSVDJD, GPA and GFG algorithms, the performance of the NIFA algorithm is rotation-dependent, due to its inability to automatically adapt the GMM order of each source to the rotation angle.

Fig. 5. The averaged ISR of the tested algorithms as a function of rotation angle.

2.4.1.3 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF STATISTICAL DISTRIBUTION CLASS

The following trial compared the expected separation performances of the tested algorithms as a function of the statistical distribution class. The source densities were synthesized by a generalized Gaussian [], which has the form $f(s) \propto \exp\left(-0.5\,|s|^{2/(1+\beta)}\right)$. By varying the shape parameter, $\beta$, a wide class of unimodal PDFs can be characterized, including uniform, Gaussian, Laplacian, and other sub- and super-Gaussian densities. For example, the uniform, normal and Laplacian distributions are derived by choosing $\beta \to -1$, $\beta = 0$ and $\beta = 1$, respectively.

The compared algorithms were operated under the following settings:

1) The GMM order in the GJD and GFG algorithms was determined according to BIC [7], where the maximal order allowed was set to 9;
2) The PDF of each unobserved source signal was modeled in the NIFA algorithm by univariate GMMs of order 3;
3) For $\beta > 0$ (i.e. super-Gaussian densities), the mean vectors estimated by the GSVDJD, GPA, GFG and NIFA algorithms were constrained to zero.

For each $\beta \in \{10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, -0.99\}$, 00 sets of unit-variance two-dimensional source signals, containing T=5000 samples each, were synthesized. The sources were mixed by a random mixing matrix, with elements drawn from the real standard normal distribution. Fig. 6.a depicts the behavior of the

averaged ISR of the tested algorithms versus $\beta$. Fig. 6.b depicts the averaged number of Gaussians determined by the GSVDJD, GPA and GFG algorithms according to the BIC, as a function of $\beta$. One can observe that the number of Gaussians increases as $\beta$ increases from 0 to 10. For $\beta = 0$, the performances of all the compared algorithms are poor, due to the fact that the sources are Gaussian. As $\beta$ increases from 1 to 10, the separation performances of the GSVDJD, GPA, GFG, JADE and FastICA algorithms improve, while the GJD and GFG algorithms outperform the JADE and FastICA algorithms. Regarding the NIFA algorithm, one can observe that its performance deteriorates for $\beta > 3$ and $\beta > 6$ for LRF=0.05 and LRF=0.005, respectively. The reason is that when $\beta$ increases, the tails of the source distributions become heavier and more Gaussians (as observed in Fig. 6.b) are required for modeling the probability density of the data. Since the number of Gaussians may be different in each direction, and the NIFA algorithm is unable to adapt the number of Gaussians in each direction, a modeling mismatch is caused and the performance deteriorates.

Fig. 6. (a) The averaged ISR of the tested algorithms versus the generalized Gaussian shape parameter, $\beta$. (b) The averaged GMM order, determined by the GJD and GFG algorithms according to the BIC, versus the generalized Gaussian shape parameter, $\beta$.

2.4.1.4 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF SAMPLE SIZE

The following trial compared the expected separation performances and the averaged running time of the tested algorithms as a function of the sample size, T. For each T = 50,00,500,000,3000,5000, 00 sets of two-dimensional unit-variance, independent source signals were synthesized from the generalized Gaussian

density. The first and second sources were synthesized with shape parameters of β = and β = 0, respectively. The sources were mixed by a random mixing matrix, with elements drawn from the real standard normal distribution. Fig. 7.a depicts the behavior of the averaged ISR of the tested algorithms versus T. Fig. 7.b depicts the averaged number of Gaussians determined by the GSVDJD, GPA and GFG algorithms according to the BIC, as a function of T. Fig. 7.c depicts the averaged running time of the tested algorithms versus the sample size. The computer used for the simulations was an IBM R-5 laptop computer with an Intel Centrino processor.

According to Fig. 7.a, one can observe that as T increases from 50 to 5000, the separation performances of the tested algorithms improve, where the GSVDJD, GPA and GFG algorithms perform better than the JADE, FastICA and NIFA algorithms. According to Fig. 7.b, one can observe that the number of Gaussians decreases as T decreases from 5000 to 50. This property makes the proposed methods applicable to small sample sizes. According to Fig. 7.c, one can observe that the running time of the NIFA, GSVDJD, GPA and GFG algorithms is much higher than that of the JADE and FastICA algorithms. This drawback stems from the fact that the NIFA, GSVDJD, GPA and GFG methods assume a much richer source distribution model. However, the averaged running time of the NIFA, GSVDJD, GPA and GFG algorithms does not increase dramatically with the sample size.
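Unit-variance generalized Gaussian sources of the kind used in these trials can be drawn with a gamma transform. The sketch below assumes the density $f(s) \propto \exp(-0.5\,|s|^{2/(1+\beta)})$; the function names, seed and sample sizes are hypothetical, and this is an illustrative sampler rather than the generator used in the thesis.

```python
import numpy as np

def sample_generalized_gaussian(beta, size, rng):
    """Draw unit-variance samples with density proportional to exp(-0.5 |s|^c), c = 2/(1+beta)."""
    c = 2.0 / (1.0 + beta)
    # If u = 0.5 |s|^c, then u follows a Gamma(1/c, 1) distribution.
    u = rng.gamma(shape=1.0 / c, scale=1.0, size=size)
    magnitude = (2.0 * u) ** (1.0 / c)
    signs = rng.choice([-1.0, 1.0], size=size)
    s = signs * magnitude
    return s / s.std()                         # normalise to unit variance

def excess_kurtosis(s):
    s = s - s.mean()
    return float(np.mean(s ** 4) / np.mean(s ** 2) ** 2 - 3.0)

rng = np.random.default_rng(0)
gauss = sample_generalized_gaussian(0.0, 200_000, rng)          # beta = 0: Gaussian
laplace = sample_generalized_gaussian(1.0, 200_000, rng)        # beta = 1: Laplacian
near_uniform = sample_generalized_gaussian(-0.99, 200_000, rng) # beta near -1: close to uniform
```

The excess kurtosis is close to zero for $\beta = 0$, positive for $\beta > 0$ (super-Gaussian) and negative as $\beta$ approaches $-1$ (sub-Gaussian), matching the distribution classes described for this family.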

Fig. 7. (a) The averaged ISR of the tested algorithms versus the sample size. (b) The averaged GMM order, determined by the GJD and GFG algorithms according to the BIC, versus the sample size. (c) The averaged running time of the tested algorithms as a function of the sample size.

2.4.1.5 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF DIMENSION

The following trial compared the separation performances and the averaged running time of the tested algorithms versus the number of sources. Let K denote the number of sources. For each $K \in \{2, 4, 6, 8, 10\}$, 00 sets of K unit-variance source signals, containing T=5000 samples each, were synthesized from the generalized Gaussian density with shape parameter β = 0. The source signals in each set were mixed by a random $K \times K$ mixing matrix, with elements drawn from the real standard normal distribution.

The compared algorithms were operated under the following settings:

1) The GMM order in the GSVDJD, GPA and GFG algorithms was determined according to BIC [7]. The maximal GMM order for two-dimensional mixtures was set to 9. For mixtures with dimension greater than two, the maximal GMM order was set to 30. It is noted that by fixing the maximal GMM order, a model with statistical dependence between the sources might be imposed, due to the fact that some of the combinations between the univariate Gaussians are discarded. This fact might theoretically influence the results (in extreme cases);
2) The mean vectors estimated by the GSVDJD, GPA, GFG and NIFA algorithms were constrained to zero;
3) The PDF of each unobserved source signal was modeled in the NIFA algorithm by a univariate GMM of order 3.

Fig. 8.a depicts the averaged ISR of each algorithm as a function of K. One can observe that the separation performances of the GJD and GSVD algorithms are better than those of the JADE, FastICA and NIFA algorithms. Fig. 8.b depicts the averaged number of Gaussians, determined by the GSVDJD, GPA and GFG algorithms according to BIC. One can observe that the number of Gaussians increases as the number of sources increases from 2 to 4, and decreases as the number of sources increases from 4 to 10. The reason for the decrease stems from the central limit theorem, according to which the distribution of the sensors becomes more Gaussian when the number of sources increases. Fig. 8.c depicts the averaged running time of the tested algorithms as a function of K. According to this figure, one can observe that the running time of the NIFA, GSVDJD, GPA and GFG algorithms is much higher than that of the JADE and FastICA algorithms. However, the averaged running time of the NIFA, GSVDJD, GPA and GFG algorithms does not increase dramatically with the number of sources.

Fig. 8. (a) The averaged ISR of the tested algorithms as a function of the number of sources. (b) The averaged GMM order, determined by the GJD and GFG algorithms according to the BIC, as a function of the number of source signals. (c) The averaged running time of the tested algorithms as a function of the number of sources.

2.4.1.6 EXPECTED SEPARATION PERFORMANCES AS A FUNCTION OF SNR

The following trial compared the separation performances of the tested algorithms in the presence of additive white Gaussian noise. For each SNR, 00 sets of two-dimensional unit-variance independent source signals, containing T=5000 samples each, were synthesized from the generalized Gaussian density. The first and second sources were synthesized with shape parameter β = and β = 0, respectively. For each set, the observation signals were derived according to the following linear noisy mixing model:

$$x_t = As_t + \sigma n_t, \quad t = 1,\ldots,T, \tag{2.49}$$

where $A$ is a random mixing matrix, with elements drawn from the real standard normal distribution, and $n_t$ denotes an isotropic zero-mean Gaussian noise with identity covariance matrix. The value of $\sigma$ controlled the SNR levels according to the following formula:

$$\sigma^{2} = \frac{1}{K}\,\mathrm{tr}(AA^{T})\,10^{-SNR/10}. \tag{2.50}$$

The compared algorithms were operated under the following settings:

1) The GMM order in the GSVDJD, GPA and GFG algorithms was determined according to the BIC [7], where the maximal GMM order was set to 9;
2) The mean vectors estimated by the GJD, GFG and NIFA algorithms were constrained to zero;
3) The PDF of each unobserved source signal was modeled in the NIFA algorithm by a univariate GMM of order 3.

Fig. 9 depicts the behavior of the averaged ISR of the tested algorithms versus SNR. One can observe that the GJD and GFG algorithms outperform the JADE, FastICA and NIFA algorithms for high SNRs. The performance of the NIFA algorithm for LRF=0.05 and LRF=0.005 deteriorates as the SNR increases from 0 dB and from 30 dB, respectively. The reason is the inability of the NIFA algorithm to adapt the number of Gaussians in each direction, which may be different. This causes a modeling mismatch, which degrades performance. For low SNRs the modeling mismatch is obscured by the presence of noise and therefore is not dominant.

Fig. 9. The averaged ISR of the tested algorithms versus SNR.
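The noise level for a target SNR can be computed directly from the mixing matrix. The following sketch assumes unit-variance sources and the relation $\sigma^2 = \mathrm{tr}(AA^T)\,10^{-SNR/10}/K$ as reconstructed in (2.50); the function name `noise_std_for_snr`, the seed and the Laplacian stand-in sources are assumptions made for illustration only.

```python
import numpy as np

def noise_std_for_snr(A, snr_db):
    """Noise standard deviation sigma for a target SNR in dB, assuming unit-variance sources."""
    K = A.shape[1]
    sigma2 = np.trace(A @ A.T) / K * 10.0 ** (-snr_db / 10.0)
    return float(np.sqrt(sigma2))

rng = np.random.default_rng(0)
K, T = 2, 5000
A = rng.standard_normal((K, K))                 # random mixing matrix
s = rng.laplace(size=(K, T))                    # stand-in non-Gaussian sources (assumption)
s /= s.std(axis=1, keepdims=True)               # unit variance, as in the trial

sigma = noise_std_for_snr(A, snr_db=20.0)
x = A @ s + sigma * rng.standard_normal((K, T)) # noisy mixing model x_t = A s_t + sigma n_t

# SNR implied by the chosen sigma, computed back from the same relation.
implied_snr_db = 10.0 * np.log10(np.trace(A @ A.T) / (K * sigma ** 2))
```

Sweeping `snr_db` over a grid and regenerating `x` for each value reproduces the structure of the trial, with one mixture set per SNR level.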

2.4.2 REAL DATA

The following trials compare the performances of the tested algorithms, by means of ISR, in separating different mixture combinations of three 0 seconds long speech signals, sampled at 6 kHz and normalized to unit variance. The source signals are denoted by $S_1$, $S_2$ and $S_3$.

The compared algorithms were operated under the following settings:

1) The GMM order in the GSVDJD, GPA and GFG algorithms was determined according to BIC [7], where the maximal GMM orders allowed for 2 and 3 dimensions were 9 and 30, respectively;
2) The mean vectors estimated by the GSVDJD, GPA, GFG and NIFA algorithms were constrained to zero;
3) The PDF of each unobserved source signal was modeled in the NIFA algorithm by a univariate GMM of order 3.

In the first trial, $S_1$ and $S_2$ were mixed by the mixing matrix 5 3 A = 7 8. The scatter plot of the mixed sources is depicted in Fig. 10. The optimal GMM order, determined by the GSVDJD, GPA and GFG algorithms, was . In the second trial, $S_1$, $S_2$ and $S_3$ were mixed by the mixing matrix A = . The optimal GMM orders, determined by the GSVDJD, GPA and GFG algorithms, were 7, 7 and 6, respectively. In the third trial, the case of more sensors than sources was tested by mixing $S_1$, $S_2$ and $S_3$ with the mixing matrix A = . The optimal GMM orders, determined by the GSVDJD, GPA and GFG algorithms, were 7, 7 and 6, respectively. Separation performances of the compared algorithms are depicted in Fig. 11. One can observe that the best separation performance was achieved by the GSVDJD algorithm.

Fig. 10. Scatter plot of the mixture of $S_1$ and $S_2$. The ellipses represent the estimated covariance matrices, which assemble the GMM of the observation signals.

Fig. 11. Separation performances of the tested algorithms in separating (a) a two-dimensional mixture of two speech signals, (b) a three-dimensional mixture of three speech signals, (c) an eight-dimensional mixture of three speech signals.

