REPORT DOCUMENTATION PAGE
Form Approved OMB No. 0704-0188

Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching data sources, gathering and maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden, to Washington Headquarters Service, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS.

1. REPORT DATE (DD-MM-YYYY): March 2012
2. REPORT TYPE: Conference Paper (Preprint)
3. DATES COVERED (From - To): -
4. TITLE AND SUBTITLE: Integrated Feature Normalization and Enhancement for Robust Speaker Recognition Using Acoustic Factor Analysis (Preprint)
5a. CONTRACT NUMBER: FA8750-09-C-0067
5b. GRANT NUMBER: -
5c. PROGRAM ELEMENT NUMBER: 35885G
5d. PROJECT NUMBER: 388
5e. TASK NUMBER: BA
5f. WORK UNIT NUMBER: AE
6. AUTHOR(S): Taufiq Hasan and John H. L. Hansen
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES): Research Associates for Defense Conversion, Inc., Hillside Terrace, Marcy, NY. Subcontractor: University of Texas at Dallas, 800 W. Campbell Rd., Richardson, TX 75080
8. PERFORMING ORGANIZATION REPORT NUMBER: -
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES): Air Force Research Laboratory/Information Directorate, Rome Research Site/RIGC, 525 Brooks Road, Rome, NY
10. SPONSOR/MONITOR'S ACRONYM(S): -
11. SPONSORING/MONITORING AGENCY REPORT NUMBER: AFRL-RI-RS-TP--4
12. DISTRIBUTION AVAILABILITY STATEMENT: Approved for public release; distribution unlimited. PA# 88ABW--89, Cleared Date: 9 March
13. SUPPLEMENTARY NOTES: This paper was accepted for publication in the Proceedings of the Interspeech Conference, Portland, Oregon, 9-13 September 2012. This work was funded in whole or in part by Department of the Air Force contract number FA8750-09-C-0067. The U.S. Government has for itself and others acting on its behalf an unlimited, paid-up, nonexclusive, irrevocable worldwide license to use, modify, reproduce, release, perform, display, or disclose the work by or on behalf of the Government. All other rights are reserved by the copyright owner.
14. ABSTRACT: State-of-the-art factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance-dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace, which does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In this study, motivated by the low-rank covariance structure of cepstral features, we propose a factor analysis model in the acoustic feature space instead of the super-vector domain and derive a mixture-dependent feature transformation. We demonstrate that the proposed Acoustic Factor Analysis (AFA) transformation performs feature dimensionality reduction, de-correlation, variance normalization, and enhancement at the same time. The transform applies a square-root Wiener gain along the acoustic feature eigenvector directions, and is similar to signal subspace based speech enhancement schemes. We also propose several methods of adaptively selecting the AFA parameter for each mixture. The proposed feature transformation is applied using a probabilistic mixture alignment, and is integrated with a conventional i-vector system. Experimental results on the telephone trials of the NIST SRE 2010 demonstrate the effectiveness of the proposed scheme.
15. SUBJECT TERMS: Channel Normalization, Factor Analysis, Tactical SIGINT Technology, Gaussian Modeling
16. SECURITY CLASSIFICATION OF: a. REPORT: U; b. ABSTRACT: U; c. THIS PAGE: U
17. LIMITATION OF ABSTRACT: UU
18. NUMBER OF PAGES: 5
19a. NAME OF RESPONSIBLE PERSON: John G. Parker
19b. TELEPHONE NUMBER (Include area code): -

Standard Form 298 (Rev. 8-98), prescribed by ANSI Std. Z39.18
Integrated Feature Normalization and Enhancement for Robust Speaker Recognition Using Acoustic Factor Analysis

Taufiq Hasan and John H. L. Hansen

Center for Robust Speech Systems (CRSS), Erik Jonsson School of Engineering, University of Texas at Dallas, Richardson, Texas, U.S.A.
{Taufiq.Hasan, ...}

Abstract

State-of-the-art factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance-dependent Gaussian Mixture Model (GMM) mean super-vectors can be constrained to lie in a lower dimensional subspace, which does not consider the fact that conventional acoustic features may also be constrained in a similar way in the feature space. In this study, motivated by the low-rank covariance structure of cepstral features, we propose a factor analysis model in the acoustic feature space instead of the super-vector domain and derive a mixture-dependent feature transformation. We demonstrate that the proposed Acoustic Factor Analysis (AFA) transformation performs feature dimensionality reduction, de-correlation, variance normalization, and enhancement at the same time. The transform applies a square-root Wiener gain along the acoustic feature eigenvector directions, and is similar to signal subspace based speech enhancement schemes. We also propose several methods of adaptively selecting the AFA parameter for each mixture. The proposed feature transform is applied using a probabilistic mixture alignment, and is integrated with a conventional i-vector system. Experimental results on the telephone trials of the NIST SRE 2010 demonstrate the effectiveness of the proposed scheme.

1. Introduction

Factor analysis based channel compensation methods for speaker recognition are based on the assumption that speaker/utterance-dependent adapted GMM [1] mean super-vectors can be constrained to lie in a lower dimensional subspace [1-4].
The assumption of lower dimensional speaker- and channel-dependent subspaces inspired various channel compensation schemes such as Eigenvoice [2], Eigenchannel [3], and Joint Factor Analysis (JFA) [3]. With the introduction of i-vectors, which are the latent factors of the so-called total variability space [4], the research trend shifted towards directly applying compensation techniques to these lower dimensional utterance-level features, enabling the development of fully Bayesian techniques [5, 6]. While super-vector domain factor analysis techniques and their derivatives are effective, they do not consider the fact that acoustic feature vectors can also be constrained to a lower dimensional subspace of the feature space. This is clear, since a low-rank model of the super-covariance matrix obtained from different utterance GMMs is not equivalent to a low-rank assumption on the acoustic feature covariance matrix. A lower dimensional representation of the speech short-time spectrum is a well-established phenomenon that motivated the family of speech enhancement methods known as the signal subspace approach [7]. This phenomenon is also found to hold for popular acoustic features such as Mel-frequency cepstral coefficients (MFCC) [8], even though these features are processed by the Discrete Cosine Transform (DCT) for de-correlation.

To illustrate that acoustic feature covariance matrices have close-to-zero eigenvalues and can be assumed low-rank, we train a 1024-mixture full-covariance GMM Universal Background Model (UBM) on 60-dimensional MFCC features from a large development data set. For a typical mixture of this UBM, the covariance matrix is plotted as an intensity image in Fig. 1(a), revealing that the off-diagonal components are indeed significant. Fig. 1(b) shows the sorted eigenvalues of three different mixtures, illustrating how the energy is compacted in the first few coefficients while the later ones are close to zero. Also, it is known that the first few eigen-directions of the feature covariance matrix are relatively more speaker dependent [8], further justifying the low-rank assumption of features for a speaker recognition task.

Inspired by the above observations, in this study we propose an acoustic factor analysis [9] scheme for speaker recognition and develop a mixture-dependent feature dimensionality reduction transform. The proposed transformation performs dimensionality reduction, de-correlation, feature variance normalization, and enhancement at once. Also, instead of hard-aligning each feature to a specific mixture, applying the transformation, and then retraining the UBM, we use a probabilistic frame alignment and transform the UBM parameters within the system. Integrating the proposed method within a standard i-vector system provides a significant gain in system performance.

(This project was funded by AFRL under contract FA8750-09-C-0067 (approved for public release, distribution unlimited: 88ABW--8), and partially by the University of Texas at Dallas from the Distinguished University Chair in Telecommunications Engineering held by J. H. L. Hansen.)

2. Acoustic Factor Analysis

In this section, we describe the proposed factor analysis model of acoustic features and discuss its formulation, its mixture-wise application for dimensionality reduction, and its advantages and properties.

2.1. Formulation

Let X = {x_n | n = 1, ..., N} be the collection of all acoustic feature vectors from the development set. Using a factor analysis model, a d x 1 feature vector x in X can be represented by

    x = W y + mu + c.    (1)

Here, W is a d x q low-rank factor loading matrix that represents q < d bases spanning the subspace with important variability in the feature space, and mu is the d x 1 mean vector of x. We denote the latent variable vector y ~ N(0, I), of dimension q, as the acoustic factors. We assume the remaining noise component c ~ N(0, sigma^2 I) to be isotropic, and the model is thus equivalent to Probabilistic Principal Component Analysis (PPCA) [10].

The advantage of this model is that the acoustic factors y explain the correlation between the coefficients of x, which we believe is more speaker dependent [8], while the noise component c incorporates the residual variance of the data. It should be emphasized that even though we refer to c as noise, when x represents cepstral features, c actually represents convolutional channel distortion. (Details on feature extraction and UBM data are given in Sec. 4.1.)

Figure 1: (a) Intensity image of a typical covariance matrix of a full-covariance UBM (darker indicates lower values) trained on 60-dimensional MFCC features (static + delta + delta-delta). (b) Sorted eigenvalues of three covariance matrices from the UBM, showing different energy compaction across mixtures; energy compaction is the percentage of top eigenvectors that accounts for 95% of total energy. Eigenvalues of a typical, a low-compaction, and a high-compaction mixture are shown. (c) Distribution of the top UBM mixture posterior probability values for development features, revealing that frame alignment is not always definitive.

Using a mixture of these models [10], we have

    p(x) = sum_g w_g p(x|g),    (2)
    p(x|g) = N(x; mu_g, sigma_g^2 I + W_g W_g^T).    (3)

Here, mu_g, w_g, W_g, and sigma_g^2 denote the mean vector, mixture weight, factor loading matrix, and noise variance of the g-th AFA mixture.

2.2. Mixture dependent transformation

One advantage of using a mixture of PPCA models for acoustic factor analysis is that its parameters can be conveniently extracted from a full-covariance GMM-UBM trained with the Expectation-Maximization (EM) algorithm [10]. The procedure is given below.

2.2.1. Universal Background Model

The UBM Lambda_0, trained on the dataset X, is given by

    p(x | Lambda_0) = sum_{g=1}^{M} w_g N(x; mu_g, Sigma_g)    (4)

where w_g are the mixture weights, M is the total number of mixtures, mu_g are the mean vectors, and Sigma_g are the full covariance matrices. Here, mu_g and w_g are the same as in (2) and (3).

2.2.2. Noise subspace selection

The AFA parameter q in (1) defines the number of principal axes to retain, under the assumption that the lower (d - q) directions span the noise-only subspace [7].
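As a concrete sketch of the mixture model in (2)-(4), the following toy NumPy example builds a small full-covariance GMM and computes the mixture posteriors p(g | x_n) used later for probabilistic frame alignment. This is an illustration under assumed toy parameters, not the authors' implementation; all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d, M = 4, 3  # toy sizes; the paper uses d = 60 features and M = 1024 mixtures

# Toy full-covariance GMM-UBM parameters (eq. 4): weights w_g, means mu_g, covariances Sigma_g.
w = np.array([0.5, 0.3, 0.2])
mu = rng.normal(size=(M, d))
B = rng.normal(size=(M, d, d))
Sigma = np.einsum('gij,gkj->gik', B, B) + 0.1 * np.eye(d)  # SPD covariance per mixture

def log_gauss(x, mu_g, Sigma_g):
    """Log-density of N(x; mu_g, Sigma_g) with a full covariance matrix."""
    dim = len(mu_g)
    diff = x - mu_g
    _, logdet = np.linalg.slogdet(Sigma_g)
    maha = diff @ np.linalg.solve(Sigma_g, diff)
    return -0.5 * (dim * np.log(2 * np.pi) + logdet + maha)

def posteriors(x):
    """Soft alignment gamma_g = p(g | x, Lambda_0) over all mixtures."""
    log_p = np.log(w) + np.array([log_gauss(x, mu[g], Sigma[g]) for g in range(M)])
    log_p -= log_p.max()          # subtract max for numerical stability
    p = np.exp(log_p)
    return p / p.sum()

x = rng.normal(size=d)
gamma = posteriors(x)
print(gamma, gamma.sum())  # posteriors are nonnegative and sum to 1
```

As Fig. 1(c) suggests, the largest entry of such a posterior vector is often well short of 1, which is the motivation for soft rather than hard alignment.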
The maximum likelihood estimate of the noise variance sigma_g^2 for the g-th mixture is given by

    sigma_g^2 = 1/(d - q) * sum_{i=q+1}^{d} lambda_{g,i}    (5)

where lambda_{g,q+1} >= ... >= lambda_{g,d} are the (d - q) smallest eigenvalues of Sigma_g. We note that q can be different for each mixture, and thus in later sections we denote it by q(g).

2.2.3. Computing the factor loading matrix

The maximum likelihood estimate of the factor loading matrix W_g of the g-th mixture of the AFA model in (2) is given by [10]

    W_g = U_q (Lambda_q - sigma_g^2 I)^{1/2} R    (6)

where U_q is a d x q matrix whose columns are the q leading eigenvectors of Sigma_g, Lambda_q is a diagonal matrix containing the corresponding q eigenvalues, and R is an arbitrary q x q orthogonal rotation matrix. In this work, we set R = I.

2.2.4. The AFA transformation

The posterior mean of the acoustic factors y_n can be used as the transformed, dimensionality-reduced version of x_n for the g-th component of the AFA model. This can be shown to be [10]

    E{y_n | x_n, g} = <y_n | x_n, g> = A_g^T (x_n - mu_g) = z_{n,g}    (7)

where

    A_g = W_g M_g^{-1}    (8)
    M_g = sigma_g^2 I + W_g^T W_g.    (9)

The matrix A_g is termed the g-th AFA transform. Here, the original feature vectors x_n are replaced by the mixture-dependent transformed vectors z_{n,g}. It can easily be shown that z_{n,g} is normally distributed with zero mean and diagonal covariance matrix Sigma_{z_g} = I - sigma_g^2 Lambda_q^{-1}, demonstrating that A_g performs mean normalization and de-correlation.

Conventionally, a feature vector x_n is aligned with the mixture that yields the highest posterior probability p(g | x_n, Lambda_0), and the corresponding transformation is applied for dimensionality reduction [10]. However, as we observe from the distribution of max_g p(g | x_n, Lambda_0) for our development data in Fig. 1(c), features are effectively aligned with multiple Gaussians in most cases, the top posterior frequently falling near or below 0.5. Thus, we propose to apply the AFA transform using a probabilistic alignment and to transform the UBM instead of retraining it. The new UBM is given by

    p(z | Lambda_0_hat) = sum_{g=1}^{M} w_g N(z; 0, Sigma_{z_g}).    (10)
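The per-mixture parameter extraction in (5)-(9) can be sketched in a few lines of NumPy. This is a minimal illustration on a synthetic covariance matrix, not the authors' code; it also numerically checks the stated covariance of the transformed features, Sigma_z = I - sigma^2 Lambda_q^{-1}.

```python
import numpy as np

rng = np.random.default_rng(1)
d, q = 6, 3  # toy dimensions; the paper uses d = 60 and a per-mixture q(g)

# A synthetic full covariance Sigma_g for one UBM mixture.
B = rng.normal(size=(d, d))
Sigma = B @ B.T + 0.5 * np.eye(d)

# Eigen-decomposition with eigenvalues sorted in descending order.
lam, U = np.linalg.eigh(Sigma)
lam, U = lam[::-1], U[:, ::-1]

sigma2 = lam[q:].mean()                        # eq. (5): noise variance from trailing eigenvalues
Uq, Lam_q = U[:, :q], np.diag(lam[:q])
W = Uq @ np.sqrt(Lam_q - sigma2 * np.eye(q))   # eq. (6) with R = I
M = sigma2 * np.eye(q) + W.T @ W               # eq. (9); equals Lam_q here since U_q^T U_q = I
A = W @ np.linalg.inv(M)                       # eq. (8): the AFA transform

# Covariance of z = A^T (x - mu) when cov(x) = Sigma, per eq. (7):
Sigma_z = A.T @ Sigma @ A
target = np.eye(q) - sigma2 * np.linalg.inv(Lam_q)
print(np.allclose(Sigma_z, target))  # True: diagonal, near-identity covariance
```

The check confirms the claim above: the transformed features are de-correlated, with per-direction variance 1 - sigma^2/lambda_i.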
2.3. Feature enhancement in AFA

Expanding the transformation matrix A_g in (8) unveils the built-in enhancement operation it performs. Substituting the expressions for W_g and M_g from (6) and (9) into (8), we have

    A_g^T = Lambda_q^{-1} (Lambda_q - sigma_g^2 I)^{1/2} U_q^T = Lambda_q^{-1/2} G_g U_q^T    (11)

where we have utilized the fact that U_q^T U_q = I and introduced a diagonal gain matrix given by

    G_g = Lambda_q^{-1/2} (Lambda_q - sigma_g^2 I)^{1/2}.    (12)
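The diagonal gain in (12) is exactly a square-root Wiener gain, which the following toy NumPy check illustrates (the eigenvalues and noise variance are illustrative values, not from the paper):

```python
import numpy as np

# Toy eigenvalues and noise variance for one mixture (illustrative only).
lam = np.array([5.0, 3.0, 1.5])   # leading eigenvalues lambda_{g,i}
sigma2 = 1.0                      # noise variance sigma_g^2 from eq. (5)

# Diagonal of the gain matrix in eq. (12): G = Lam_q^{-1/2} (Lam_q - sigma2 I)^{1/2}.
G = np.sqrt((lam - sigma2) / lam)

# A priori SNR per direction, xi_i = (lambda_i - sigma2) / sigma2, and the
# square-root Wiener gain (xi / (xi + 1))^{1/2} it corresponds to.
xi = (lam - sigma2) / sigma2
sqrt_wiener = np.sqrt(xi / (xi + 1.0))

print(np.allclose(G, sqrt_wiener))  # True: eq. (12) is a square-root Wiener gain
```

The identity holds because xi + 1 = lambda_i / sigma^2, so xi/(xi + 1) = (lambda_i - sigma^2)/lambda_i.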
Figure 2: Input SNR xi [dB] vs. Wiener gains. The Wiener gain xi/(xi + 1) and the square-root Wiener gain (xi/(xi + 1))^{1/2} are shown with a solid (-) and a dashed (- -) line, respectively.

Keeping aside the term Lambda_q^{-1/2} in (11), we observe that the operation in (7) first computes the inner product of the mean-normalized acoustic feature with the q principal eigenvectors of Sigma_g, and then, for each i-th eigenvector direction, applies the gain specified by the i-th diagonal entry of G_g in (12), which is in fact a square-root Wiener gain function [11]. Defining, in classic speech enhancement terminology, the a priori SNR as [7] xi = (lambda_{g,i} - sigma_g^2)/sigma_g^2, and using it to express the gain equations, we plot the gain functions against xi in Fig. 2. Thus, in addition to de-correlation and dimensionality reduction, the transform A_g also performs feature enhancement, assuming a noise variance sigma_g^2. It may be noted that conventional factor analysis techniques in the super-vector space are also known to be representable using similar Wiener-like gain functions, as discussed in [12].

2.4. Feature variance normalization in AFA

The term Lambda_q^{-1/2} in (11) normalizes the variance of the acoustic feature stream in the i-th eigen-direction, since lambda_{g,i} is the expected feature variance along this direction [13]. The AFA transformation performs this normalization in each mixture, in addition to the enhancement described in the previous section. This process is interestingly similar to the cepstral variance normalization frequently performed in the front-end, except for its domain of operation.

2.5. Adaptive AFA dimension selection

Since the distribution of eigenvalues is different in each mixture, a distinct value of q(g) is suitable in each case. This is illustrated in Fig. 1(b), where the eigenvalues of three different mixture covariance matrices are shown. Typically we see an energy compaction of about 70%, that is, the first 70% of the eigenvalues account for 95% of the total energy; but in other cases the energy compaction can be very low or very high, demanding that the retained AFA dimension be correspondingly low or high. Motivated by this observation, we develop a simple method of selecting the AFA dimension q(g). We first set the energy compaction ratio E close to 0.9-0.97, then compute the sorted eigenvalues lambda_{g,i} of Sigma_g for each mixture. The optimal dimension retaining a fraction E of the energy is then calculated as

    q_E(g) = min q  s.t.  (sum_{i=1}^{q} lambda_{g,i}) / (sum_{i=1}^{d} lambda_{g,i}) > E.    (13)

This method is denoted AFA-Var-En. As an alternative, we use the effective rank estimation algorithm of [14], generally used for noise estimation in matrices, to select the AFA dimension. A threshold delta in [0, 1] is set and the AFA dimension is obtained as

    q_delta(g) = min q  s.t.  ( (sum_{i=1}^{q} lambda_{g,i}^2) / (sum_{i=1}^{d} lambda_{g,i}^2) )^{1/2} > delta.    (14)

We denote this method AFA-Var-Rk. When the AFA dimension is fixed for all mixtures, we denote the system AFA-Fix.

3. AFA integrated i-vector system

In this section, we describe how the proposed method can be incorporated into the current state-of-the-art i-vector framework [4]. First, a full-covariance UBM Lambda_0, given by (4), is trained on the development data vectors. Next, the AFA dimension q(g) for each mixture is set, and the noise variance sigma_g^2 is computed using (5). The factor loading matrix W_g and the transformation matrix A_g are then computed using (6) and (8), respectively. For each development utterance s, the zero-order statistics are given by

    N_s(g) = sum_{n in s} gamma_g(n)  where  gamma_g(n) = p(g | x_n, Lambda_0),    (15)

following the standard procedure [2, 4]. However, for AFA, the first-order statistics F_s(g) are extracted using the transformed features in the corresponding mixtures instead of the original features:

    F_s(g) = sum_{n in s} gamma_g(n) z_{n,g} = A_g^T sum_{n in s} gamma_g(n) (x_n - mu_g).    (16)

For estimating the total variability (TV) matrix, the standard procedure [4] is followed using the new UBM Lambda_0_hat given in (10) and the statistics F_s(g) and N_s(g). It should be noted that in this case the super-vector dimension reduces from Md to K = sum_{g=1}^{M} q(g), and the TV matrix size becomes K x R. We define the super-vector compression ratio alpha = K/(Md), measuring the overall AFA compaction.

4. Experiments and Results

We perform our experiments on the male trials of the NIST SRE 2010 telephone train/test condition (condition 5, normal vocal effort).

4.1. System Description

For voice activity detection, a phoneme recognizer [15] combined with an energy-based scheme is used. 60-dimensional feature vectors (19 MFCC + energy + delta + delta-delta) are extracted using a 25-ms window with a 10-ms shift and Gaussianized using a 3-s sliding window. Gender-dependent full- and diagonal-covariance UBMs with 1024 mixtures are trained on utterances selected from Switchboard II Phases 2 and 3, Switchboard Cellular Parts 1 and 2, and the NIST 2004, 2005, and 2006 SRE enrollment data. The same dataset is utilized for TV matrix training. 400-dimensional i-vectors are extracted, whitened, and then length normalized [5]. For session variability compensation and scoring, we use a Gaussian Probabilistic Linear Discriminant Analysis (PLDA) scheme with a full-covariance noise model [5].

4.2. Results using AFA transformation

We compare several AFA systems with our diagonal- and full-covariance UBM based i-vector systems, denoted baseline diag-cov and baseline full-cov, respectively. The following AFA systems are built: AFA-Fix with q(g) = 40, 48, and 54; AFA-Var-En with E = 0.9, 0.95, and 0.97; and AFA-Var-Rk with delta = 0.99, 0.995, and 0.997. In this experiment, the PLDA eigenvoice size N_EV is fixed across systems. The results are shown in Table 1. From the results, we observe that the AFA-Var-En system in general outperforms the basic AFA-Fix, even at a similar compression ratio alpha. The best results are obtained for AFA-Var-En (E = 0.97), a 5.37% relative EER improvement over baseline full-cov, while the AFA-Var-Rk (delta = 0.997) method provides the best DCF_old. These results show that the proposed AFA transformation successfully suppresses nuisance directions in the feature space, producing i-vectors with better speaker-discriminating ability.
We also observe that the AFA i-vector systems are computationally less expensive than the baseline system, roughly by a factor of alpha.
This is expected, since the computational complexity of an i-vector system is proportional to the super-vector size [16].

Table 1: Comparison between baseline i-vector and AFA systems with respect to %EER, DCF_old, and DCF_new. Percent relative improvement (%r) and super-vector compression ratio (alpha) are also shown. Columns: System | alpha | %EER/%r | DCF_old/%r | DCF_new/%r. Rows: Baseline full-cov; AFA-Fix (q = 40, 48, 54); AFA-Var-En (E = 0.9, 0.95, 0.97); AFA-Var-Rk (delta = 0.99, 0.995, 0.997).

Table 2: Linear score fusion of baseline and AFA systems, reporting %EER, DCF_old, and DCF_new for the individual systems (i) baseline full-cov, (ii) baseline diag-cov, (iii) AFA-Var-En (E = 0.97), and (iv) AFA-Fix (q = 48), and for the fusions of (i) & (iii), (i) & (iv), and (i)-(iii).

4.3. Fusion of multiple systems

We pick four of our systems for fusion: (i) baseline full-cov, (ii) baseline diag-cov, (iii) AFA-Var-En (E = 0.97), and (iv) AFA-Fix (q = 48). The same PLDA N_EV parameter was used for systems (i)-(iii), with a larger value for system (iv). Simple linear fusion was used, with mean and variance normalization of scores to (0, 1) for calibration. From the results presented in Table 2, the fusion performance of (i) and (iii) clearly reveals that the AFA and baseline full-cov systems carry complementary information, as both the EER and DCF values improve. The best result is achieved by fusing systems (i), (ii), and (iii). A performance comparison of systems (i), (iii), and the fusion is shown in Fig. 3 using Detection Error Trade-off (DET) curves. Here again we observe the superiority of the proposed AFA-Var-En system over the baseline, especially in the low false-alarm region, whereas the fusion system performs better over the full range of the DET curve.
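The calibration-and-fusion recipe of Sec. 4.3 can be sketched as follows. This is a hedged illustration: the paper only states that simple linear fusion with mean/variance score normalization was used, so the equal weights, the `zscore` helper, and all score values below are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy detection scores from two systems over the same trial list (illustrative only).
n_trials = 1000
s1 = 3.0 * rng.normal(size=n_trials) + 1.0   # e.g., baseline full-cov scores
s2 = 0.5 * rng.normal(size=n_trials) - 4.0   # e.g., AFA-Var-En scores

def zscore(s):
    """Mean/variance normalization of scores to (0, 1), as used for calibration."""
    return (s - s.mean()) / s.std()

def linear_fusion(score_vectors, weights=None):
    """Simple linear fusion of per-system score vectors after normalization."""
    normalized = [zscore(s) for s in score_vectors]
    if weights is None:  # equal weights assumed; the paper does not specify them
        weights = np.full(len(normalized), 1.0 / len(normalized))
    return sum(w * s for w, s in zip(weights, normalized))

fused = linear_fusion([s1, s2])
print(fused.shape, fused.mean())
```

Normalizing each system's scores first puts them on a common scale, so neither system's raw dynamic range dominates the fused score.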
5. Conclusions

In this study, we have proposed a factor analysis model for acoustic features to compensate for transmission channel mismatch in speaker recognition. Using the model, we have developed a mixture-dependent feature transform that performs dimensionality reduction, de-correlation, variance normalization, and enhancement at once. Instead of operating as a separate front-end processing step, the proposed transform has been integrated within an i-vector speaker recognition framework using a probabilistic feature alignment technique. Experimental results have demonstrated the superiority of the proposed scheme over the baseline i-vector system.

Figure 3: DET curves (miss probability vs. false alarm probability) for the individual subsystems AFA-Var-En (E = 0.97) and baseline full-cov, and for the fusion of systems (i)-(iii) shown in Table 2.

6. References

[1] D. Reynolds, T. Quatieri, and R. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Processing, vol. 10, no. 1-3, pp. 19-41, 2000.
[2] P. Kenny, G. Boulianne, and P. Dumouchel, "Eigenvoice modeling with sparse training data," IEEE Trans. Speech and Audio Processing, vol. 13, no. 3, May 2005.
[3] P. Kenny, G. Boulianne, P. Ouellet, and P. Dumouchel, "Joint factor analysis versus eigenchannels in speaker recognition," IEEE Trans. Audio, Speech, and Language Processing, vol. 15, no. 4, May 2007.
[4] N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Trans. Audio, Speech, and Language Processing, vol. 19, no. 4, pp. 788-798, May 2011.
[5] D. Garcia-Romero and C. Y. Espy-Wilson, "Analysis of i-vector length normalization in speaker recognition systems," in Proc. Interspeech, Florence, Italy, Aug. 2011.
[6] P. Matejka, O. Glembek, F. Castaldo, M. J. Alam, O. Plchot, P. Kenny, L. Burget, and J. Cernocky, "Full-covariance UBM and heavy-tailed PLDA in i-vector speaker verification," in Proc. ICASSP, Prague, Czech Republic, May 2011.
[7] Y. Ephraim and H. L. Van Trees, "A signal subspace approach for speech enhancement," IEEE Trans. Speech and Audio Processing, vol. 3, no. 4, pp. 251-266, Jul. 1995.
[8] B. Zhou and J. H. L. Hansen, "Rapid discriminative acoustic model based on eigenspace mapping for fast speaker adaptation," IEEE Trans. Speech and Audio Processing, vol. 13, no. 4, July 2005.
[9] T. Hasan and J. H. L. Hansen, "Factor analysis of acoustic features using a mixture of probabilistic principal component analyzers for robust speaker verification," in Proc. Odyssey, Singapore, June 2012.
[10] M. Tipping and C. Bishop, "Mixtures of probabilistic principal component analyzers," Neural Computation, vol. 11, no. 2, pp. 443-482, 1999.
[11] S. Vaseghi, Advanced Signal Processing and Digital Noise Reduction. Wiley, 1996.
[12] A. McCree, D. Sturim, and D. Reynolds, "A new perspective on GMM subspace compensation based on PPCA and Wiener filtering," in Proc. Interspeech, Florence, Italy, Aug. 2011.
[13] T. W. Anderson, "Asymptotic theory for principal component analysis," The Annals of Mathematical Statistics, vol. 34, no. 1, pp. 122-148, 1963.
[14] J. Cadzow, "SVD representation of unitarily invariant matrices," IEEE Trans. Acoustics, Speech and Signal Processing, vol. 32, no. 3, Jun. 1984.
[15] P. Schwarz, P. Matejka, and J. Cernocky, "Hierarchical structures of neural networks for phoneme recognition," in Proc. ICASSP, vol. 1, May 2006.
[16] O. Glembek, L. Burget, P. Matejka, M. Karafiat, and P. Kenny, "Simplification and optimization of i-vector extraction," in Proc. ICASSP, Prague, Czech Republic, May 2011.
Support Vector Machines using GMM Supervectors for Speaker Verification
1 Support Vector Machines using GMM Supervectors for Speaker Verification W. M. Campbell, D. E. Sturim, D. A. Reynolds MIT Lincoln Laboratory 244 Wood Street Lexington, MA 02420 Corresponding author e-mail:
More informationFront-End Factor Analysis For Speaker Verification
IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING Front-End Factor Analysis For Speaker Verification Najim Dehak, Patrick Kenny, Réda Dehak, Pierre Dumouchel, and Pierre Ouellet, Abstract This
More informationUsually the estimation of the partition function is intractable and it becomes exponentially hard when the complexity of the model increases. However,
Odyssey 2012 The Speaker and Language Recognition Workshop 25-28 June 2012, Singapore First attempt of Boltzmann Machines for Speaker Verification Mohammed Senoussaoui 1,2, Najim Dehak 3, Patrick Kenny
More informationImproving the Effectiveness of Speaker Verification Domain Adaptation With Inadequate In-Domain Data
Distribution A: Public Release Improving the Effectiveness of Speaker Verification Domain Adaptation With Inadequate In-Domain Data Bengt J. Borgström Elliot Singer Douglas Reynolds and Omid Sadjadi 2
More informationMulticlass Discriminative Training of i-vector Language Recognition
Odyssey 214: The Speaker and Language Recognition Workshop 16-19 June 214, Joensuu, Finland Multiclass Discriminative Training of i-vector Language Recognition Alan McCree Human Language Technology Center
More informationModified-prior PLDA and Score Calibration for Duration Mismatch Compensation in Speaker Recognition System
INERSPEECH 2015 Modified-prior PLDA and Score Calibration for Duration Mismatch Compensation in Speaker Recognition System QingYang Hong 1, Lin Li 1, Ming Li 2, Ling Huang 1, Lihong Wan 1, Jun Zhang 1
More informationSpeaker recognition by means of Deep Belief Networks
Speaker recognition by means of Deep Belief Networks Vasileios Vasilakakis, Sandro Cumani, Pietro Laface, Politecnico di Torino, Italy {first.lastname}@polito.it 1. Abstract Most state of the art speaker
More informationSpeaker Verification Using Accumulative Vectors with Support Vector Machines
Speaker Verification Using Accumulative Vectors with Support Vector Machines Manuel Aguado Martínez, Gabriel Hernández-Sierra, and José Ramón Calvo de Lara Advanced Technologies Application Center, Havana,
More informationNovel Quality Metric for Duration Variability Compensation in Speaker Verification using i-vectors
Published in Ninth International Conference on Advances in Pattern Recognition (ICAPR-2017), Bangalore, India Novel Quality Metric for Duration Variability Compensation in Speaker Verification using i-vectors
More informationA Small Footprint i-vector Extractor
A Small Footprint i-vector Extractor Patrick Kenny Odyssey Speaker and Language Recognition Workshop June 25, 2012 1 / 25 Patrick Kenny A Small Footprint i-vector Extractor Outline Introduction Review
More informationSCORE CALIBRATING FOR SPEAKER RECOGNITION BASED ON SUPPORT VECTOR MACHINES AND GAUSSIAN MIXTURE MODELS
SCORE CALIBRATING FOR SPEAKER RECOGNITION BASED ON SUPPORT VECTOR MACHINES AND GAUSSIAN MIXTURE MODELS Marcel Katz, Martin Schafföner, Sven E. Krüger, Andreas Wendemuth IESK-Cognitive Systems University
More informationAn Integration of Random Subspace Sampling and Fishervoice for Speaker Verification
Odyssey 2014: The Speaker and Language Recognition Workshop 16-19 June 2014, Joensuu, Finland An Integration of Random Subspace Sampling and Fishervoice for Speaker Verification Jinghua Zhong 1, Weiwu
More informationGain Compensation for Fast I-Vector Extraction over Short Duration
INTERSPEECH 27 August 2 24, 27, Stockholm, Sweden Gain Compensation for Fast I-Vector Extraction over Short Duration Kong Aik Lee and Haizhou Li 2 Institute for Infocomm Research I 2 R), A STAR, Singapore
More informationSession Variability Compensation in Automatic Speaker Recognition
Session Variability Compensation in Automatic Speaker Recognition Javier González Domínguez VII Jornadas MAVIR Universidad Autónoma de Madrid November 2012 Outline 1. The Inter-session Variability Problem
More informationUnifying Probabilistic Linear Discriminant Analysis Variants in Biometric Authentication
Unifying Probabilistic Linear Discriminant Analysis Variants in Biometric Authentication Aleksandr Sizov 1, Kong Aik Lee, Tomi Kinnunen 1 1 School of Computing, University of Eastern Finland, Finland Institute
More informationMaximum Likelihood and Maximum A Posteriori Adaptation for Distributed Speaker Recognition Systems
Maximum Likelihood and Maximum A Posteriori Adaptation for Distributed Speaker Recognition Systems Chin-Hung Sit 1, Man-Wai Mak 1, and Sun-Yuan Kung 2 1 Center for Multimedia Signal Processing Dept. of
More informationAn Invariance Property of the Generalized Likelihood Ratio Test
352 IEEE SIGNAL PROCESSING LETTERS, VOL. 10, NO. 12, DECEMBER 2003 An Invariance Property of the Generalized Likelihood Ratio Test Steven M. Kay, Fellow, IEEE, and Joseph R. Gabriel, Member, IEEE Abstract
More informationHow to Deal with Multiple-Targets in Speaker Identification Systems?
How to Deal with Multiple-Targets in Speaker Identification Systems? Yaniv Zigel and Moshe Wasserblat ICE Systems Ltd., Audio Analysis Group, P.O.B. 690 Ra anana 4307, Israel yanivz@nice.com Abstract In
More informationMinimax i-vector extractor for short duration speaker verification
Minimax i-vector extractor for short duration speaker verification Ville Hautamäki 1,2, You-Chi Cheng 2, Padmanabhan Rajan 1, Chin-Hui Lee 2 1 School of Computing, University of Eastern Finl, Finl 2 ECE,
More informationJoint Factor Analysis for Speaker Verification
Joint Factor Analysis for Speaker Verification Mengke HU ASPITRG Group, ECE Department Drexel University mengke.hu@gmail.com October 12, 2012 1/37 Outline 1 Speaker Verification Baseline System Session
More informationSPEECH recognition systems based on hidden Markov
IEEE SIGNAL PROCESSING LETTERS, VOL. X, NO. X, 2014 1 Probabilistic Linear Discriminant Analysis for Acoustic Modelling Liang Lu, Member, IEEE and Steve Renals, Fellow, IEEE Abstract In this letter, we
EFFECTIVE ACOUSTIC MODELING FOR ROBUST SPEAKER RECOGNITION by Taufiq Hasan Al Banna APPROVED BY SUPERVISORY COMMITTEE: Dr. John H. L. Hansen, Chair Dr. Carlos Busso Dr. Hlaing Minn Dr. P. K. Rajasekaran
Use of Wijsman's Theorem for the Ratio of Maximal Invariant Densities in Signal Detection Applications Joseph R. Gabriel Naval Undersea Warfare Center Newport, RI 02841 Steven M. Kay University of Rhode
Uncertainty Modeling without Subspace Methods for Text-Dependent Speaker Recognition Patrick Kenny, Themos Stafylakis, Md. Jahangir Alam and Marcel Kockmann Odyssey Speaker and Language Recognition Workshop
REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 The public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,
Report Documentation Page Form Approved OMB No. 0704-0188 Public reporting burden for the collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,
A Bound on Mean-Square Estimation Error Accounting for System Model Mismatch Wen Xu RD Instruments phone: 858-689-8682 email: wxu@rdinstruments.com Christ D. Richmond MIT Lincoln Laboratory email: christ@ll.mit.edu
Harmonic Structure Transform for Speaker Recognition Kornel Laskowski & Qin Jin Carnegie Mellon University, Pittsburgh PA, USA KTH Speech Music & Hearing, Stockholm, Sweden 29 August, 2011 Laskowski &
International Journal of Engineering Science Invention Volume 1 Issue 1 December. 2012 PP.18-23 Estimation of Relative Operating Characteristics of Text Independent Speaker Verification Palivela Hema 1,
AL/EQ-TP-1993-0307 REGENERATION OF SPENT ADSORBENTS USING ADVANCED OXIDATION (PREPRINT) John T. Mourand, John C. Crittenden, David W. Hand, David L. Perram, Sawang Notthakun Department of Chemical Engineering
AFRL-DE-PS-JA-2007-1004 AFRL-DE-PS-JA-2007-1004 Noise Reduction in support-constrained multi-frame blind-deconvolution restorations as a function of the number of data frames and the support constraint
Odyssey 2018 The Speaker and Language Recognition Workshop 26-29 June 2018, Les Sables d Olonne, France Domain-invariant I-vector Feature Extraction for PLDA Speaker Verification Md Hafizur Rahman 1, Ivan
INTERSPEECH 2014 Using Deep Belief Networks for Vector-Based Speaker Recognition W. M. Campbell MIT Lincoln Laboratory, Lexington, MA, USA wcampbell@ll.mit.edu Abstract Deep belief networks (DBNs) have
Mixtures of Gaussians with Sparse Regression Matrices Constantinos Boulis, Jeffrey Bilmes {boulis,bilmes}@ee.washington.edu Dept of EE, University of Washington Seattle WA, 98195-2500 UW Electrical Engineering
Project # SERDP-MR-1572 Camp Butner UXO Data Inversion and Classification Using Advanced EMI Models Fridon Shubitidze, Sky Research/Dartmouth College Co-Authors: Irma Shamatava, Sky Research, Inc Alex
A United States Contribution to the International Hydrological Decade HEC-IHD-0600 Hydrologic Engineering Methods For Water Resources Development Volume 6 Water Surface Profiles July 1975 Approved for
Mixtures of Gaussians with Sparse Structure Costas Boulis 1 Abstract When fitting a mixture of Gaussians to training data there are usually two choices for the type of Gaussians used. Either diagonal or
Optimizing Robotic Team Performance with Probabilistic Model Checking Software Engineering Institute Carnegie Mellon University Pittsburgh, PA 15213 Sagar Chaki, Joseph Giampapa, David Kyle (presenting),
Analysis of mutual duration and noise effects in speaker recognition: benefits of condition-matched cohort selection in score normalization Andreas Nautsch, Rahim Saeidi, Christian Rathgeb, and Christoph
REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,
Shallow Water Fluctuations and Communications H.C. Song Marine Physical Laboratory Scripps Institution of oceanography La Jolla, CA 92093-0238 phone: (858) 534-0954 fax: (858) 534-7641 email: hcsong@mpl.ucsd.edu
Stat260: Bayesian Modeling and Inference Lecture Date: March 10, 2010 Bayes Factors, g-priors, and Model Selection for Regression Lecturer: Michael I. Jordan Scribe: Tamara Broderick The reading for this lecture
Odyssey 2014: The Speaker and Language Recognition Workshop 16-19 June 2014, Joensuu, Finland Towards Duration Invariance of i-vector-based Adaptive Score Normalization Andreas Nautsch*, Christian Rathgeb,
Diagonal Representation of Certain Matrices Mark Tygert Research Report YALEU/DCS/RR-33 December 2, 2004 Abstract An explicit expression is provided for the characteristic polynomial of a matrix M of the
ISCA Archive http://www.isca-speech.org/archive ODYSSEY04 - The Speaker and Language Recognition Workshop Toledo, Spain May 3 - June 3, 2004 Analysis of Multitarget Detection for Speaker and Language Recognition*
INTERSPEECH 2017 August 20-24, 2017, Stockholm, Sweden A Generative Model for Score Normalization in Speaker Recognition Albert Swart and Niko Brümmer Nuance Communications, Inc. (South Africa) albert.swart@nuance.com,
MTIE AND TDEV ANALYSIS OF UNEVENLY SPACED TIME SERIES DATA AND ITS APPLICATION TO TELECOMMUNICATIONS SYNCHRONIZATION MEASUREMENTS Mingfu Li, Hsin-Min Peng, and Chia-Shu Liao Telecommunication Labs., Chunghwa
Attribution Concepts for Sub-meter Resolution Ground Physics Models 76 th MORS Symposium US Coast Guard Academy Approved for public release distribution. 2 Report Documentation Page Form Approved OMB No.
NUMERICAL SOLUTIONS FOR OPTIMAL CONTROL PROBLEMS UNDER SPDE CONSTRAINTS AFOSR grant number: FA9550-06-1-0234 Yanzhao Cao Department of Mathematics Florida A & M University Abstract The primary source of
Around the Speaker De-Identification (Speaker diarization for de-identification ++) Itshak Lapidot Moez Ajili Jean-Francois Bonastre The 2 Parts HDM based diarization System The homogeneity measure 2 Outline
System Reliability Simulation and Optimization by Component Reliability Allocation Zissimos P. Mourelatos Professor and Head Mechanical Engineering Department Oakland University Rochester MI 48309 Report
AFRL-OSR-VA-TR-2014-0234 STATISTICAL MACHINE LEARNING FOR STRUCTURED AND HIGH DIMENSIONAL DATA Larry Wasserman CARNEGIE MELLON UNIVERSITY 0 Final Report DISTRIBUTION A: Distribution approved for public
IMPROVED MATRIX PENCIL METHODS Biao Lu, Don Wei, Brian L. Evans, and Alan C. Bovik Dept. of Electrical and Computer Engineering The University of Texas at Austin, Austin, TX 7872-84 {blu,bevans,bovik}@ece.utexas.edu
RC24953 (W1003-002) March 1, 2010 Other IBM Research Report Training Universal Background Models for Speaker Recognition Mohamed Kamal Omar, Jason Pelecanos IBM Research Division Thomas J. Watson Research
Inhibition of blood cholinesterase activity is a poor predictor of acetylcholinesterase inhibition in brain regions of guinea pigs exposed to repeated doses of low levels of soman. Sally M. Anderson Report
Development and Application of Acoustic Metamaterials with Locally Resonant Microstructures AFOSR grant #FA9550-10-1-0061 Program manager: Dr. Les Lee PI: C.T. Sun Purdue University West Lafayette, Indiana
REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 The public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,
Systems and Computers in Japan, Vol. 37, No. 1, 2006 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J87-D-II, No. 6, June 2004, pp. 1199-1207 Adjustment of Sampling Locations in Rail-Geometry
INTERSPEECH 2017 August 20 24, 2017, Stockholm, Sweden Unsupervised Discriminative Training of PLDA for Domain Adaptation in Speaker Verification Qiongqiong Wang, Takafumi Koshinaka Data Science Research
Low-dimensional speech representation based on Factor Analysis and its applications! Najim Dehak and Stephen Shum! Spoken Language System Group! MIT Computer Science and Artificial Intelligence Laboratory!
REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,
REAL-TIME TIME-FREQUENCY BASED BLIND SOURCE SEPARATION Scott Rickard, Radu Balan, Justinian Rosca Siemens Corporate Research Princeton, NJ {scott.rickard,radu.balan,justinian.rosca}@scr.siemens.com ABSTRACT
4 A Position error covariance for surface feature points For each surface feature point j, we first compute the normal û_j by using 9 of the neighboring points to fit a plane. Then we select m_u^j as a cross product between n_u^j and û_j to create an orthonormal basis: m_u^j = n_u^j × û_j (4). In order to create a 3D error ellipsoid
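The snippet above completes an orthonormal basis at a surface point by crossing the fitted normal with an independent unit direction. A minimal self-contained sketch of that step (the names `n`, `u_hat`, and `m` are illustrative, not from the excerpt):

```python
def cross(a, b):
    # Cross product of two 3-vectors.
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = sum(x * x for x in v) ** 0.5
    return [x / n for x in v]

# Given a unit normal n and an independent unit direction u_hat,
# m = n x u_hat is orthogonal to both, completing an orthonormal basis.
n = normalize([0.0, 0.0, 1.0])
u_hat = normalize([1.0, 0.0, 0.0])
m = cross(n, u_hat)
```

By construction `m` is perpendicular to both inputs, so `(u_hat, m, n)` forms a right-handed orthonormal frame when the inputs are orthonormal.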
US Army Corps of Engineers Hydrologic Engineering Center Comparative Analysis of Flood Routing Methods September 1980 Approved for Public Release. Distribution Unlimited. RD-24 REPORT DOCUMENTATION PAGE
Application of a GA/Bayesian Filter-Wrapper Feature Selection Method to Classification of Clinical Depression from Speech Data Juan Torres 1, Ashraf Saad 2, Elliot Moore 1 1 School of Electrical and Computer
Vol. 35, No. 5 ACTA AUTOMATICA SINICA May, 2009 Studies on Model Distance Normalization Approach in Text-independent Speaker Verification DONG Yuan LU Liang ZHAO Xian-Yu ZHAO Jian Abstract Model distance
Bayesian Analysis of Speaker Diarization with Eigenvoice Priors Patrick Kenny Centre de recherche informatique de Montréal Patrick.Kenny@crim.ca A year in the lab can save you a day in the library. Panu
Boundary Elements and Other Mesh Reduction Methods XXXV 13 Wire antenna model of the vertical grounding electrode D. Poljak & S. Sesnic University of Split, FESB, Split, Croatia Abstract A straight wire antenna
DISTRIBUTION STATEMENT A. Approved for public release; distribution is unlimited. Closed-form and Numerical Reverberation and Propagation: Inclusion of Convergence Effects Chris Harrison Centre for Marine
i-vector and GMM-UBM Bie Fanhu CSLT, RIIT, THU 2013-11-18 Framework 1. GMM-UBM Features are extracted frame by frame, so the number of feature vectors is not fixed. Gaussian mixtures are used to fit all the features. The mixtures
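The GMM-UBM framework summarized above scores a variable-length sequence of per-frame features against a Gaussian mixture. A hedged sketch of the average per-frame log-likelihood computation, assuming a diagonal-covariance mixture (function names are mine, not from the slides):

```python
import math

def logsumexp(xs):
    # Numerically stable log(sum(exp(x))) over mixture components.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def diag_gauss_logpdf(x, mean, var):
    # log N(x; mean, diag(var)) for a single feature frame.
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
               for xi, mu, v in zip(x, mean, var))

def gmm_avg_loglik(frames, weights, means, vars_):
    # Average per-frame log-likelihood under the GMM (e.g. the UBM);
    # works for any number of frames, since features are per-frame.
    total = 0.0
    for x in frames:
        total += logsumexp([math.log(w) + diag_gauss_logpdf(x, mu, v)
                            for w, mu, v in zip(weights, means, vars_)])
    return total / len(frames)
```

A verification score would then compare this quantity under a speaker-adapted model against the same quantity under the UBM.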
MODEL SELECTION CRITERIA FOR ACOUSTIC SEGMENTATION Mauro Cettolo and Marcello Federico ITC-irst - Centro per la Ricerca Scientifica e Tecnologica I-385 Povo, Trento, Italy ABSTRACT Robust acoustic
Appears in Neural Information Processing Systems, Vancouver, Canada, 2003. A Generative Model Based Kernel for SVM Classification in Multimedia Applications Pedro J. Moreno Purdy P. Ho Hewlett-Packard
INTERSPEECH 203 Automatic Regularization of Cross-entropy Cost for Speaker Recognition Fusion Ville Hautamäki, Kong Aik Lee 2, David van Leeuwen 3, Rahim Saeidi 3, Anthony Larcher 2, Tomi Kinnunen, Taufiq
Theory and Practice of Data Assimilation in Ocean Modeling Robert N. Miller College of Oceanic and Atmospheric Sciences Oregon State University Oceanography Admin. Bldg. 104 Corvallis, OR 97331-5503 Phone:
A wavelet-based generalization of the multifractal formalism from scalar to vector valued d- dimensional random fields : from theoretical concepts to experimental applications P. Kestener and A. Arneodo
Predictive Model for Archaeological Resources Marine Corps Base Quantico, Virginia John Haynes Jesse Bellavance Report Documentation Page Form Approved OMB No. 0704-0188 Public reporting burden for the
Robust Speaker Identification by Smarajit Bose Interdisciplinary Statistical Research Unit Indian Statistical Institute, Kolkata Joint work with Amita Pal and Ayanendranath Basu Overview } } } } } } }
Mixture Distributions for Modeling Lead Time Demand in Coordinated Supply Chains Barry Cobb Virginia Military Institute Alan Johnson Air Force Institute of Technology AFCEA Acquisition Research Symposium
Fleet Maintenance Simulation With Insufficient Data Zissimos P. Mourelatos Mechanical Engineering Department Oakland University mourelat@oakland.edu Ground Robotics Reliability Center (GRRC) Seminar 17
DISTRIBUTION STATEMENT A: Approved for public release; distribution is unlimited. SW06 Shallow Water Acoustics Experiment Data Analysis James F. Lynch MS #12, Woods Hole Oceanographic Institution Woods
Feature-space Speaker Adaptation for Probabilistic Linear Discriminant Analysis Acoustic Models Liang Lu, Steve Renals Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK {liang.lu,
Automatic Speech Recognition (CS753) Lecture 12: Acoustic Feature Extraction for ASR Instructor: Preethi Jyothi Feb 13, 2017 Speech Signal Analysis Generate discrete samples A frame Need to focus on short
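The lecture snippet above notes that acoustic feature extraction begins by sampling the signal and focusing on short frames. A minimal illustration of that framing step (the function name and parameters are my own, not from the lecture):

```python
def frame_signal(samples, frame_len, hop):
    """Split discrete samples into overlapping short-time frames.

    Frames that would run past the end of the signal are dropped,
    so every returned frame has exactly frame_len samples.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]
```

In practice frame lengths of about 20-25 ms with a 10 ms hop are typical, and each frame is then windowed before spectral analysis.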
Environmental Sound Classification in Realistic Situations K. Haddad, W. Song Brüel & Kjær Sound and Vibration Measurement A/S, Skodsborgvej 307, 2850 Nærum, Denmark. X. Valero La Salle, Universitat Ramon
Monaural speech separation using source-adapted models Ron Weiss, Dan Ellis {ronw,dpwe}@ee.columbia.edu LabROSA Department of Electrical Engineering Columbia University 2007 IEEE Workshop on Applications
Robust Sound Event Detection in Continuous Audio Environments Haomin Zhang 1, Ian McLoughlin 2,1, Yan Song 1 1 National Engineering Laboratory of Speech and Language Information Processing The University
Wavelet Spectral Finite Elements for Wave Propagation in Composite Plates Award no: AOARD-0904022 Submitted to Dr Kumar JATA Program Manager, Thermal Sciences Directorate of Aerospace, Chemistry and Materials
On Applying Point-Interval Logic to Criminal Forensics (Student Paper) Mashhood Ishaque Abbas K. Zaidi Alexander H. Levis Command and Control Research and Technology Symposium 1 Report Documentation Page
REPORT DOCUMENTATION PAGE Form Approved OMB No. 0704-0188 Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,
CRS Report for Congress Received through the CRS Web Order Code RS21396 Updated May 26, 2006 Summary Iraq: Map Sources Hannah Fischer Information Research Specialist Knowledge Services Group This report
DISTRIBUTION STATEMENT A: Approved for public release; distribution is unlimited. Analysis of South China Sea Shelf and Basin Acoustic Transmission Data Ching-Sang Chiu Department of Oceanography Naval
Thermo-Kinetic Model of Burning for Polymeric Materials Stanislav I. Stoliarov a, Sean Crowley b, Richard Lyon b a University of Maryland, Fire Protection Engineering, College Park, MD 20742 b FAA W. J.
Extension of the BLT Equation to Incorporate Electromagnetic Field Propagation Fredrick M. Tesche Chalmers M. Butler Holcombe Department of Electrical and Computer Engineering 336 Fluor Daniel EIB Clemson
2003 Conference on Information Sciences and Systems, The Johns Hopkins University, March 12-14, 2003 Efficient method for obtaining parameters of stable pulse in grating compensated dispersion-managed communication
Distributed Binary Quantizers for Communication Constrained Large-scale Sensor Networks Ying Lin and Biao Chen Dept. of EECS Syracuse University Syracuse, NY 13244, U.S.A. ylin20 {bichen}@ecs.syr.edu Peter
VLBA IMAGING OF SOURCES AT 24 AND 43 GHZ D.A. BOBOLTZ 1, A.L. FEY 1, P. CHARLOT 2,3 & THE K-Q VLBI SURVEY COLLABORATION 1 U.S. Naval Observatory 3450 Massachusetts Ave., NW, Washington, DC, 20392-5420,
Nonlinear Model Reduction of Differential Algebraic Equation (DAE) Systems Chuili Sun and Juergen Hahn Department of Chemical Engineering Texas A&M University College Station, TX 77843-3 hahn@tamu.edu Prepared for
Scattering of Internal Gravity Waves at Finite Topography Peter Muller University of Hawaii Department of Oceanography 1000 Pope Road, MSB 429 Honolulu, HI 96822 phone: (808)956-8081 fax: (808)956-9164
AFRL-MN-EG-TP-2006-7403 INFRARED SPECTROSCOPY OF HYDROGEN CYANIDE IN SOLID PARAHYDROGEN (BRIEFING CHARTS) C. Michael Lindsay, National Research Council, Post Doctoral Research Associate Mario E. Fajardo
Naval Research Laboratory Washington, DC 20375-5320 NRL/MR/5708--15-9629 Parametric Models of NIR Transmission and Reflectivity Spectra for Dyed Fabrics D. Aiken S. Ramsey T. Mayo Signature Technology
GMM-based classification from noisy features Alexey Ozerov, Mathieu Lagrange and Emmanuel Vincent INRIA, Centre de Rennes - Bretagne Atlantique STMS Lab IRCAM - CNRS - UPMC alexey.ozerov@inria.fr, mathieu.lagrange@ircam.fr,