REPORT DOCUMENTATION PAGE


Form Approved OMB No. 0704-0188

Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden to Washington Headquarters Service, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS.

1. REPORT DATE (DD-MM-YYYY): MARCH 4
2. REPORT TYPE: Conference Paper (PREPRINT)
3. DATES COVERED (From - To):
4. TITLE AND SUBTITLE: INTEGRATED FEATURE NORMALIZATION AND ENHANCEMENT FOR ROBUST SPEAKER RECOGNITION USING ACOUSTIC FACTOR ANALYSIS (PREPRINT)
5a. CONTRACT NUMBER: FA875-9-C-67
5b. GRANT NUMBER:
5c. PROGRAM ELEMENT NUMBER: 35885G
5d. PROJECT NUMBER: 388
5e. TASK NUMBER: BA
5f. WORK UNIT NUMBER: AE
6. AUTHOR(S): Taufiq Hasan and John H. L. Hansen
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES): Research Associates for Defense Conversion, Inc., Hillside Terrace, Marcy NY; Subcontractor: University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX 75080
8. PERFORMING ORGANIZATION REPORT NUMBER:
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES): Air Force Research Laboratory/Information Directorate, Rome Research Site/RIGC, 525 Brooks Road, Rome NY
10. SPONSOR/MONITOR'S ACRONYM(S):
11. SPONSORING/MONITORING AGENCY REPORT NUMBER: AFRL-RI-RS-TP--4
12. DISTRIBUTION AVAILABILITY STATEMENT: Approved for public release; distribution unlimited. PA# 88ABW--89, Cleared Date: 9 March
13. SUPPLEMENTARY NOTES: This paper was accepted for publication in the Proceedings of the Interspeech Conference, Portland, Oregon, 9-13 Sept. 2012. This work was funded in whole or in part by Department of the Air Force contract number FA875-9-C-67. The U.S. Government has for itself and others acting on its behalf an unlimited, paid-up, nonexclusive, irrevocable worldwide license to use, modify, reproduce, release, perform, display, or disclose the work by or on behalf of the Government. All other rights are reserved by the copyright owner.
14. ABSTRACT: State-of-the-art factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace, which does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In this study, motivated by the low-rank covariance structure of cepstral features, we propose a factor analysis model in the acoustic feature space instead of the super-vector domain and derive a mixture-dependent feature transformation. We demonstrate that the proposed Acoustic Factor Analysis (AFA) transformation performs feature dimensionality reduction, de-correlation, variance normalization and enhancement at the same time. The transform applies a square-root Wiener gain on the acoustic feature eigenvector directions, and is similar to the signal subspace based speech enhancement schemes. We also propose several methods of adaptively selecting the AFA parameter for each mixture. The proposed feature transformation is applied using a probabilistic mixture alignment, and is integrated with a conventional i-vector system. Experimental results on the telephone trials of the NIST SRE 2010 demonstrate the effectiveness of the proposed scheme.
15. SUBJECT TERMS: Channel Normalization, Factor Analysis, Tactical SIGINT Technology, Gaussian Modeling
16. SECURITY CLASSIFICATION OF: a. REPORT, b. ABSTRACT, c. THIS PAGE
17. LIMITATION OF ABSTRACT
18. NUMBER OF PAGES: 5
19a. NAME OF RESPONSIBLE PERSON: JOHN G. PARKER
19b. TELEPHONE NUMBER (Include area code):

Standard Form 298 (Rev. 8-98), Prescribed by ANSI Std. Z39.18

Integrated Feature Normalization and Enhancement for Robust Speaker Recognition using Acoustic Factor Analysis

Taufiq Hasan and John H. L. Hansen
Center for Robust Speech Systems (CRSS), Erik Jonsson School of Engineering, University of Texas at Dallas, Richardson, Texas, U.S.A.
{Taufiq.Hasan,

Abstract

State-of-the-art factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace, which does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In this study, motivated by the low-rank covariance structure of cepstral features, we propose a factor analysis model in the acoustic feature space instead of the super-vector domain and derive a mixture-dependent feature transformation. We demonstrate that the proposed Acoustic Factor Analysis (AFA) transformation performs feature dimensionality reduction, de-correlation, variance normalization and enhancement at the same time. The transform applies a square-root Wiener gain on the acoustic feature eigenvector directions, and is similar to the signal subspace based speech enhancement schemes. We also propose several methods of adaptively selecting the AFA parameter for each mixture. The proposed feature transform is applied using a probabilistic mixture alignment, and is integrated with a conventional i-vector system. Experimental results on the telephone trials of the NIST SRE 2010 demonstrate the effectiveness of the proposed scheme.

1. Introduction

Factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance dependent adapted GMM [1] mean super-vectors can be constrained to lie in a lower dimensional subspace [2-4]. The assumption of lower dimensional speaker and channel dependent subspaces inspired various channel compensation schemes such as Eigenvoice [2], Eigenchannel [3] and Joint Factor Analysis (JFA) [3]. With the introduction of i-vectors, which are the latent factors of the so-called total variability space [4], the research trend shifted towards directly applying compensation techniques on these lower dimensional utterance level features, enabling the development of fully Bayesian techniques [5, 6]. While super-vector domain factor analysis techniques and their derivatives are effective, they do not consider the fact that acoustic feature vectors can also be constrained in a lower dimensional subspace of the feature space. This is clear, since a low-rank modeling of the super-covariance matrix obtained from different utterance GMMs is not equivalent to a low-rank assumption of the acoustic feature covariance matrix. Lower dimensional representation of the speech short-time spectrum is a well established phenomenon which motivated a family of speech enhancement methods known as the signal subspace approach [7]. This phenomenon is also found to be valid for popular acoustic features, such as Mel-frequency Cepstral Coefficients (MFCC) [8], even though these features are processed by the Discrete Cosine Transform (DCT) for de-correlation.

This project was funded by AFRL under contract FA875-9-C-67 (Approved for public release, distribution unlimited: 88ABW--8), and partially by the University of Texas at Dallas from the Distinguished University Chair in Telecommunications Engineering held by J. H. L. Hansen.
To illustrate that acoustic feature covariance matrices have close-to-zero eigenvalues, and can be assumed low-rank, we train a 1024-mixture full-covariance GMM Universal Background Model (UBM) using 60-dimensional MFCC features on a large development data set.¹ For a typical mixture of this UBM, the covariance matrix is plotted as an intensity image in Fig. 1(a), revealing that the non-diagonal components are indeed significant. Fig. 1(b) shows the sorted eigenvalues of three different mixtures, showing how the energy is compacted in the first few coefficients, while the later ones are close to zero. Also, it is known that the first few eigen-directions of the feature covariance matrix are relatively more speaker dependent [8], further justifying the low-rank assumption of features for a speaker recognition task. Inspired by the above-mentioned observations, in this study we propose an acoustic factor analysis [9] scheme for speaker recognition and develop a mixture-dependent feature dimensionality reduction transform. The proposed transformation performs dimensionality reduction, de-correlation, feature variance normalization, and enhancement, at once. Also, instead of hard feature alignment to a specific mixture, applying the transformation, and then retraining the UBM, we use a probabilistic frame alignment and transform the UBM parameters within the system. Integrating the proposed method within a standard i-vector system provides significant gains in system performance.

2. Acoustic Factor Analysis

In this section, we describe the proposed factor analysis model of acoustic features, and discuss its formulation, its mixture-wise application for dimensionality reduction, and its advantages and properties.

2.1. Formulation

Let X = {x_n | n = 1, ..., N} be the collection of all acoustic feature vectors from the development set. Using a factor analysis model, a d × 1 feature vector x ∈ X can be represented by

    x = W y + µ + c.    (1)

Here, W is a d × q low-rank factor loading matrix that represents q < d bases spanning the subspace with important variability in the feature space, and µ is the d × 1 mean vector of x. We denote the latent variable vector y ~ N(0, I), which is of dimension q × 1, as the acoustic factors. We assume the remaining noise component c ~ N(0, σ²I) to be isotropic, and the model is thus equivalent to Probabilistic Principal Component Analysis (PPCA) [10]. The advantage of this model is that the acoustic factors y explain the correlation between the feature coefficients of x, which we believe are more speaker dependent [8], while the noise component c incorporates the residual variance of the data.

¹ Details on feature extraction and UBM data are given in Sec. 4.1.
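To make (1) concrete, the minimal numpy sketch below (illustrative dimensions and random parameters, not the values used in this work) draws features from a single AFA mixture and checks that their covariance has the low-rank-plus-isotropic form W W^T + σ²I that appears again in (3) below.

import numpy as np

# Illustrative sizes only: d-dimensional features, q acoustic factors.
d, q, N = 60, 48, 100000
rng = np.random.default_rng(0)

W = 0.3 * rng.standard_normal((d, q))      # factor loading matrix (random stand-in)
mu = rng.standard_normal(d)                # mixture mean
sigma2 = 0.1                               # isotropic noise variance

y = rng.standard_normal((N, q))                       # acoustic factors, y ~ N(0, I)
c = np.sqrt(sigma2) * rng.standard_normal((N, d))     # noise, c ~ N(0, sigma2 * I)
x = y @ W.T + mu + c                                  # Eq. (1): x = W y + mu + c

emp_cov = np.cov(x, rowvar=False)                     # empirical covariance of the samples
model_cov = W @ W.T + sigma2 * np.eye(d)              # covariance implied by the model
print(np.abs(emp_cov - model_cov).max())              # small for large N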

Figure 1: (a) An intensity image plot showing a typical covariance matrix of a full-covariance UBM (darker indicates lower values) trained on 60-dimensional MFCC features (20 static + Δ + ΔΔ). (b) Sorted eigenvalues of three covariance matrices from the UBM showing different energy compaction in various mixtures. Energy compaction is the percentage of top eigenvectors that account for 95% of total energy. Eigenvalues of a typical, a low compact and a high compact mixture are shown. (c) Distribution of top UBM mixture posterior probability values for development features, revealing that frame alignment is not always definitive.

It should be emphasized that even though we denote the term c as noise, when x represents cepstral features, c actually represents convolutional channel distortion. Using a mixture of these models [10], we have

    p(x) = Σ_g w_g p(x|g),    (2)

where

    p(x|g) = N(µ_g, σ_g² I + W_g W_g^T).    (3)

Here, µ_g, w_g, W_g and σ_g² denote the mean vector, mixture weight, factor loadings, and noise variance of the g-th AFA mixture.

2.2. Mixture dependent transformation

One advantage of using a mixture of PPCA models for acoustic factor analysis is that its parameters can be conveniently extracted from a full-covariance GMM-UBM trained using the Expectation-Maximization (EM) algorithm [1]. The procedure is given below.

2.2.1. Universal Background Model

The UBM model Λ_0, trained on the dataset X, is given by

    p(x|Λ_0) = Σ_{g=1}^{M} w_g N(µ_g, Σ_g),    (4)

where w_g are the mixture weights, M is the total number of mixtures, µ_g are the mean vectors and Σ_g are the full covariance matrices. Here, µ_g and w_g are the same as in (2) and (3).

2.2.2. Noise subspace selection

The AFA parameter q in (1) defines the number of principal axes to retain, assuming the lower (d − q) directions span the noise-only subspace [7]. The maximum likelihood estimate of the noise variance σ_g² for the g-th mixture is given by

    σ_g² = (1/(d − q)) Σ_{i=q+1}^{d} λ_{g,i},    (5)

where λ_{g,q+1}, ..., λ_{g,d} are the (d − q) smallest eigenvalues of Σ_g. We note that q can be different for each mixture, and thus in the later sections we denote it by q(g).

2.2.3. Compute the factor loading matrix

The maximum likelihood estimate of the factor loading matrix W_g of the g-th mixture of the AFA model in (1) is given by [10]

    W_g = U_q (Λ_q − σ_g² I)^{1/2} R,    (6)

where U_q is a d × q matrix whose columns are the q leading eigenvectors of Σ_g, Λ_q is a diagonal matrix containing the corresponding q eigenvalues, and R is a q × q arbitrary orthogonal rotation matrix. In this work, we set R = I.

2.2.4. The AFA transformation

The posterior mean of the acoustic factors y_n can be used as the transformed and dimensionality reduced version of x_n for the g-th component of the AFA model. This can be shown to be [10]:

    E{y_n | x_n, g} = y_{n|g} = A_g^T (x_n − µ_g) ≜ z_{n,g},    (7)

where

    A_g = W_g M_g^{-T},    (8)
    M_g = σ_g² I + W_g^T W_g.    (9)

The matrix A_g is termed the g-th AFA transform. Here, the original feature vectors x_n are replaced by the mixture dependent transformed vectors z_{n,g}. It can be easily shown that z_{n,g} is normally distributed with zero mean and diagonal covariance matrix Σ_{z_g} = I − σ_g² Λ_q^{-1}, demonstrating that A_g performs mean normalization and de-correlation.
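For concreteness, the per-mixture computation of Secs. 2.2.2-2.2.4 is sketched below in numpy. This is a minimal sketch, not the system implementation: the covariance Sigma_g is a random stand-in for an actual UBM mixture, the function and variable names are illustrative, and R = I as in the text.

import numpy as np

def afa_transform(Sigma_g, q):
    """AFA quantities for one mixture from its full covariance, cf. Eqs. (5)-(9)."""
    lam, U = np.linalg.eigh(Sigma_g)
    lam, U = lam[::-1], U[:, ::-1]                        # eigenvalues/vectors, descending
    sigma2 = lam[q:].mean()                               # noise variance, Eq. (5)
    W = U[:, :q] @ np.diag(np.sqrt(lam[:q] - sigma2))     # factor loadings, Eq. (6), R = I
    M = sigma2 * np.eye(q) + W.T @ W                      # Eq. (9); equals diag(lam[:q]) here
    A = W @ np.linalg.inv(M)                              # AFA transform, Eq. (8) (M is symmetric)
    return A, sigma2

def afa_features(X, mu_g, A_g):
    """Mixture-dependent transformed features z_{n,g} = A_g^T (x_n - mu_g), Eq. (7)."""
    return (X - mu_g) @ A_g

# Toy example: d = 60 features, q = 48 retained acoustic factors.
rng = np.random.default_rng(1)
B = rng.standard_normal((60, 500))
Sigma_g = B @ B.T / 500.0                                 # random stand-in covariance
mu_g = rng.standard_normal(60)
A_g, sigma2_g = afa_transform(Sigma_g, q=48)
Z = afa_features(rng.standard_normal((10, 60)) + mu_g, mu_g, A_g)
print(Z.shape, round(sigma2_g, 3))

In the actual system the retained dimension q is chosen per mixture as described in Sec. 2.5, and the transform is applied with the probabilistic alignment discussed next.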
Conventionally, a feature vector x_n is aligned with the mixture that yields the highest posterior probability p(g|x_n, Λ_0), and the corresponding transformation is applied for dimensionality reduction [10]. However, as we observe from the distribution of max_g p(g|x_n, Λ_0) for our development data in Fig. 1(c), features are aligned with multiple Gaussians in most cases, providing values of max_g p(g|x_n, Λ_0) below 0.5. Thus, we propose to apply the AFA transform using a probabilistic alignment and transform the UBM instead of retraining it. The new UBM is given by

    p(z|Λ̂_0) = Σ_{g=1}^{M} w_g N(0, Σ_{z_g}).    (10)

2.3. Feature enhancement in AFA

Expansion of the transformation matrix A_g in (8) unveils the built-in enhancement operation it performs. Substituting the expressions of W_g and M_g from (6) and (9) in (8), we have:

    A_g^T = Λ_q^{-1} (Λ_q − σ_g² I)^{1/2} U_q^T = Λ_q^{-1/2} G_g U_q^T,    (11)

where we utilized the fact that U_q^T U_q = I, and introduced a diagonal gain matrix given by:

    G_g = Λ_q^{-1/2} (Λ_q − σ_g² I)^{1/2}.    (12)
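As a numerical check of (11)-(12), the short sketch below (the same kind of illustrative random covariance as before; names are not from the paper) verifies that A_g^T factors into the eigenvector projection U_q^T, the diagonal gain G_g, and the variance normalization Λ_q^{-1/2}.

import numpy as np

rng = np.random.default_rng(2)
C = rng.standard_normal((60, 500))
Sigma_g = C @ C.T / 500.0                                  # random stand-in covariance
d, q = 60, 48

lam, U = np.linalg.eigh(Sigma_g)
lam, U = lam[::-1], U[:, ::-1]                             # descending eigenvalues/vectors
sigma2 = lam[q:].mean()                                    # Eq. (5)
U_q = U[:, :q]

W = U_q @ np.diag(np.sqrt(lam[:q] - sigma2))               # Eq. (6), R = I
A_T = np.linalg.inv(sigma2 * np.eye(q) + W.T @ W) @ W.T    # A_g^T from Eqs. (8)-(9)

# Diagonal gain on each retained eigen-direction, Eq. (12):
# G_ii = sqrt((lam_i - sigma2)/lam_i) = sqrt(xi/(xi + 1)) with xi = (lam_i - sigma2)/sigma2.
G = np.diag(np.sqrt((lam[:q] - sigma2) / lam[:q]))
A_T_factored = np.diag(lam[:q] ** -0.5) @ G @ U_q.T        # Eq. (11)

print(np.allclose(A_T, A_T_factored))                      # True

The identity G_ii = (ξ_i/(ξ_i + 1))^{1/2}, with ξ_i = (λ_{g,i} − σ_g²)/σ_g², is the square-root Wiener gain discussed below.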

Keeping aside the term Λ_q^{-1/2} in (11), we observe that the operation in (7) first computes the inner product of the mean normalized acoustic feature with the q principal eigenvectors of Σ_g, and then, for each i-th eigenvector direction, applies the gain function specified by the i-th diagonal of G_g in (12), which is actually a square-root Wiener gain function [11]. Defining, in classic speech enhancement terminology, the a priori SNR ξ as [7] ξ = (λ_{g,i} − σ_g²)/σ_g², and using this to express the gain equations, we plot the gain functions against ξ in Fig. 2. Thus, in addition to de-correlation and dimensionality reduction, the transform A_g also performs feature enhancement, assuming a noise variance σ_g². It may be noted that conventional factor analysis techniques in the super-vector space are also known to be representable using similar Wiener-like gain functions, as discussed in [12].

Figure 2: Input SNR ξ [dB] vs. Wiener gains. The Wiener gain ξ/(ξ+1) and the square-root Wiener gain (ξ/(ξ+1))^{1/2} are shown with a solid (-) and dashed (- -) line, respectively.

2.4. Feature variance normalization in AFA

The term Λ_q^{-1/2} in (11) normalizes the variance of the acoustic feature stream in the i-th eigen-direction, since λ_{g,i} is the expected feature variance along this direction [13]. The AFA transformation performs this normalization in each mixture, in addition to the enhancement mentioned in the previous section. This process is interestingly similar to the cepstral variance normalization frequently performed in the front-end, except for its domain of operation.

2.5. Adaptive AFA dimension selection

Since the distribution of eigenvalues is different in each mixture, a unique value of q(g) should be suitable in each case. This is illustrated in Fig. 1(b), where the eigenvalues of three different mixture covariance matrices are shown. Typically we see an energy compaction of 70%, that is, the first 70% of the eigenvalues account for 95% of the total energy. But in other cases, the energy compaction can be very low or high, demanding that the AFA retained dimension be low or high, respectively. Motivated by this observation, we develop a simple method of selecting the AFA dimension q(g). We first set the energy compaction ratio E close to 0.9-0.97, then compute the sorted eigenvalues λ_{g,i} of Σ_g for each mixture. The optimal dimension retaining a fraction E of the energy is then calculated as:

    q_E(g) = min q  s.t.  ( Σ_{i=1}^{q} λ_{g,i} ) / ( Σ_{i=1}^{d} λ_{g,i} ) > E.    (13)

This method is denoted by AFA-Var-En. As an alternate method, we use the effective rank estimation algorithm in [14], generally used for noise estimation in matrices, to select the AFA dimension. A threshold δ ∈ [0, 1] is set and the AFA dimension is obtained by:

    q_δ(g) = min q  s.t.  ( Σ_{i=1}^{q} λ_{g,i} / Σ_{i=1}^{d} λ_{g,i} )^{1/2} > δ.    (14)

We denote this method as AFA-Var-Rk. When the AFA dimension is fixed for all mixtures, we denote the system by AFA-Fix.
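A minimal sketch of the two adaptive selection rules is given below: AFA-Var-En follows (13) and AFA-Var-Rk follows (14) as reconstructed above. The thresholds mirror the ranges quoted in this section, and the eigenvalue spectrum is a synthetic placeholder rather than one taken from a UBM.

import numpy as np

def afa_dim_energy(lam, E=0.95):
    """AFA-Var-En, Eq. (13): smallest q whose leading eigenvalues hold a fraction E of the energy."""
    lam = np.sort(lam)[::-1]
    ratio = np.cumsum(lam) / np.sum(lam)
    return int(np.argmax(ratio > E)) + 1

def afa_dim_rank(lam, delta=0.995):
    """AFA-Var-Rk, Eq. (14): effective-rank style criterion with a square root on the energy ratio [14]."""
    lam = np.sort(lam)[::-1]
    ratio = np.sqrt(np.cumsum(lam) / np.sum(lam))
    return int(np.argmax(ratio > delta)) + 1

# Synthetic, decaying eigenvalue spectrum for one 60-dimensional mixture.
lam_g = np.exp(-np.arange(60) / 10.0)
print(afa_dim_energy(lam_g, E=0.95), afa_dim_rank(lam_g, delta=0.995))

For AFA-Fix, the same constant q(g) is simply used for every mixture.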
3. AFA integrated i-vector system

In this section, we describe how the proposed method can be incorporated into the current state-of-the-art i-vector system framework [4]. First, a full covariance UBM model Λ_0, given by (4), is trained on the development data vectors. Next, the AFA dimension q(g) for each mixture is set, and the noise variance σ_g² is computed using (5). The factor loading matrix W_g and transformation matrix A_g are then computed using (6) and (8), respectively. For each development utterance s, the zero order statistics are given by

    N_s(g) = Σ_{n∈s} γ_g(n),  where  γ_g(n) = p(g|x_n, Λ_0),    (15)

following the standard procedure [1, 4]. However, for AFA, the first order statistics F̂_s(g) are extracted using the transformed features in the corresponding mixtures instead of the original features:

    F̂_s(g) = Σ_{n∈s} γ_g(n) z_{n,g} = A_g^T Σ_{n∈s} γ_g(n) (x_n − µ_g).    (16)

For estimating the total variability (TV) matrix, the standard procedure is followed [4] using the new UBM Λ̂_0 given in (10), and the statistics F̂_s(g) and N_s(g). It should be noted that in this case the super-vector dimension reduces to K = Σ_{g=1}^{M} q(g) from Md, and the TV matrix size becomes K × R. We define the super-vector compression ratio α = K/(Md), measuring the overall AFA compaction.

4. Experiments and Results

We perform our experiments on the male trials of the NIST SRE 2010 telephone train/test condition (condition 5, normal vocal effort).

4.1. System Description

For voice activity detection, a phoneme recognizer [15] combined with an energy based scheme is used. 60-dimensional feature vectors (19 MFCC + Energy + Δ + ΔΔ) are extracted using a 25 ms window with a 10 ms shift, and Gaussianized using a 3-s sliding window. Gender dependent full and diagonal covariance UBMs with 1024 mixtures are trained on utterances selected from Switchboard II Phase 2 and 3, Switchboard Cellular Part 1 and 2, and the NIST 2004, 2005 and 2006 SRE enrollment data. For the TV matrix training, the same dataset is utilized. 400-dimensional i-vectors are extracted, whitened and then length normalized [5]. For session variability compensation and scoring, we use a Gaussian Probabilistic Linear Discriminant Analysis (PLDA) scheme with a full-covariance noise model [5].

4.2. Results using AFA transformation

We compare several AFA systems with our diagonal and full-covariance UBM based i-vector systems, denoted by baseline dia-cov and baseline full-cov, respectively. The following AFA systems are built: AFA-Fix with q(g) = 4, 48 and 54; AFA-Var-En with E = 0.9, 0.95 and 0.97; and AFA-Var-Rk with δ = 0.99, 0.995 and 0.997. In this experiment, we fix the PLDA eigenvoice size N_EV to a common value for all systems. The results are shown in Table 1. From the results, we observe that the AFA-Var-En system in general outperforms the basic AFA-Fix, even with a similar compression ratio α. The best results are obtained for AFA-Var-En (E = 0.97) with an EER of .76%, a 5.37% relative improvement over Baseline full-cov. The AFA-Var-Rk (δ = 0.997) method provided the best DCF_old. These results show that the proposed AFA transformation is able to successfully reduce some nuisance directions in the feature space, producing i-vectors with better speaker discriminating ability. We also observe that AFA i-vector systems are computationally less expensive compared to the baseline system, roughly by a factor of α.³

³ This is expected since the computational complexity of an i-vector system is proportional to the super-vector size [16].

Table 1: Comparison between baseline i-vector and AFA systems with respect to %EER, DCF_old and DCF_new for a fixed N_EV. Percent relative improvement (%r) and super-vector compression ratio (α) are also shown.
System / α / %EER/%r / DCF_old/%r / DCF_new/%r
Baseline full-cov: q = /.4.3/.57.44/3.46
AFA-Fix: q = 48.8./.5./ /.; q = /5.96./ /3.9
AFA-Var-En: E = /-.56.3/ /6.7; E = /../.96.4/.55; E = /5.37./9.4.4/.8
AFA-Var-Rk: δ = /-.3.3/.7.4/8.35; δ = /6.9./5.6.4/7.39; δ = /.95./.56.38/6.49

Table 2: Linear score fusion of baseline and AFA systems.
Individual system performances — System / N_EV / %EER / DCF_old / DCF_new: (i) Baseline full-cov; (ii) Baseline dia-cov; (iii) AFA-Var-En (0.97); (iv) AFA-Fix (48).
Fusion system performances: Fusion of (i) & (iii); Fusion of (i) & (iv); Fusion of (i)-(iii).

4.3. Fusion of multiple systems

We pick four of our systems for fusion: (i) Baseline full-cov, (ii) Baseline dia-cov, (iii) AFA-Var-En (E = 0.97), and (iv) AFA-Fix (q = 48). The PLDA N_EV parameter was set to a common value for systems (i)-(iii) and a different value for system (iv). Simple linear fusion was used with mean and variance normalization of scores to (0, 1) for calibration. From the results presented in Table 2, the fusion performance of (i) and (iii) clearly reveals that the AFA and baseline full-cov systems carry complementary information, as the EER and DCF values improve. The best result is achieved by fusing systems (i), (ii) and (iii), to obtain EER = .87%, DCF_old = .993, and the lowest DCF_new. A performance comparison of systems (i), (iii) and the fusion is shown in Fig. 3 using Detection Error Trade-off (DET) curves. Here, again we observe the superiority of the proposed AFA-Var-En system compared to the baseline, especially in the low false alarm region, whereas the fusion system shows better performance over the full DET curve range.

Figure 3: DET curves showing the individual subsystems AFA-Var-En (E = 0.97) and Baseline full-cov, and the Fusion of (i)-(iii) shown in Table 2.

5. Conclusions

In this study, we have proposed a factor analysis model for acoustic features to compensate for transmission channel mismatch in speaker recognition. Using the model, we have developed a mixture dependent feature transform that performs dimensionality reduction, de-correlation, variance normalization and enhancement, at once. Instead of a separate front-end processing, the proposed transform has been integrated within an i-vector speaker recognition framework using a probabilistic feature alignment technique. Experimental results have demonstrated the superiority of the proposed scheme compared to the baseline i-vector system.

6. References

[1] D. Reynolds, T. Quatieri, and R. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Process., vol. 10, no. 1-3, pp. 19-41, 2000.
[2] P. Kenny, G. Boulianne, and P. Dumouchel, "Eigenvoice modeling with sparse training data," IEEE Trans. Speech and Audio Process., vol. 13, no. 3, pp. 345-354, May 2005.
[3] P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, "Joint factor analysis versus eigenchannels in speaker recognition," IEEE Trans. Audio, Speech, and Lang. Process., vol. 15, no. 4, pp. 1435-1447, May 2007.
[4] N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Trans. Audio, Speech, and Lang. Process., vol. 19, no. 4, pp. 788-798, May 2011.
[5] D. Garcia-Romero and C. Y.
Espy-Wilson, "Analysis of i-vector length normalization in speaker recognition systems," in Proc. Interspeech, Florence, Italy, 2011.
[6] P. Matejka, O. Glembek, F. Castaldo, M. Alam, O. Plchot, P. Kenny, L. Burget, and J. Cernocky, "Full-covariance UBM and heavy-tailed PLDA in i-vector speaker verification," in Proc. ICASSP, 2011.
[7] Y. Ephraim and H. Van Trees, "A signal subspace approach for speech enhancement," IEEE Trans. Speech and Audio Process., vol. 3, no. 4, pp. 251-266, Jul. 1995.
[8] B. Zhou and J. H. L. Hansen, "Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation," IEEE Trans. Speech and Audio Process., vol. 13, no. 4, July 2005.
[9] T. Hasan and J. H. L. Hansen, "Factor analysis of acoustic features using a mixture of probabilistic principal component analyzers for robust speaker verification," in Proc. Odyssey, Singapore, June 2012.
[10] M. Tipping and C. Bishop, "Mixtures of probabilistic principal component analyzers," Neural Computation, vol. 11, no. 2, pp. 443-482, 1999.
[11] S. Vaseghi, Advanced Signal Processing and Digital Noise Reduction. Wiley, 1996.
[12] A. McCree, D. Sturim, and D. Reynolds, "A new perspective on GMM subspace compensation based on PPCA and Wiener filtering," in Proc. Interspeech, Florence, Italy, 2011.
[13] T. Anderson, "Asymptotic theory for principal component analysis," The Annals of Mathematical Statistics, vol. 34, no. 1, pp. 122-148, 1963.
[14] J. Cadzow, "SVD representation of unitarily invariant matrices," IEEE Trans. Acoustics, Speech and Signal Process., vol. 32, no. 3, Jun. 1984.
[15] P. Schwarz, P. Matejka, and J. Cernocky, "Hierarchical structures of neural networks for phoneme recognition," in Proc. ICASSP, vol. 1, May 2006.
[16] O. Glembek, L. Burget, P. Matejka, M. Karafiat, and P. Kenny, "Simplification and optimization of i-vector extraction," in Proc. ICASSP, 2011.
