A Priori SNR Estimation Using Weibull Mixture Model
|
|
- Edwina Lawson
- 6 years ago
- Views:
Transcription
1 A Priori SNR Estimation Using Weibull Mixture Model 12. ITG Fachtagung Sprachkommunikation Aleksej Chinaev, Jens Heitkaemper, Reinhold Haeb-Umbach Department of Communications Engineering Paderborn University 7. Oktober 2016 Computer Science, Electrical Engineering and Mathematics Communications Engineering Prof. Dr.-Ing. Reinhold Häb-Umbach
2 Table of contents 1 Problem formulation and motivation 2 A priori SNR estimation based on Weibull mixture model 3 Experimental evaluation 4 Conclusions and outlook A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 1 / 10
3 Problem formulation and motivation Single-channel clean speech s(t) contaminated by an additive noise n(t): y(t) = s(t)+n(t) STFT - Y(k, l) = S(k, l)+n(k, l) Noise PSD tracker ˆλN(k, l) noise power spectral density (PSD) k - frequency bin l - frame index Y(k, l) 2 Y(k, l) 2 A priori SNR estimator ˆξ(k, l) Gain G(k, l) Ŝ(k, l) ŝ(t) ISTFT function A priori SNR ξ(k, l) = λ S(k,l) λ N a key component in enhancement system (k,l) λ S (k, l) = E [ S(k, l) 2] - clean speech PSD, λ N (k, l) = E [ N(k, l) 2] - noise PSD Motivated by a generalized spectral subtraction (GSS) denoising Y(k, l) α for α R >0 not restricted to (α = 1) or (α = 2) with assumption Y(k, l) α = S(k, l) α + N(k, l) α A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 1 / 10
4 Table of contents 1 Problem formulation and motivation 2 A priori SNR estimation based on Weibull mixture model 3 Experimental evaluation 4 Conclusions and outlook A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 1 / 10
5 Normalized α-order magnitude (NAOM) domain A priori SNR estimator Y(k, l) 2 ˆλN(k, l) Estimate PSα(k) and go into NAOM domain Yα(k, l) λnα(k, l) Estimate parameter of WMM psα(s) λm(k, l) πm(k, l) Estimate clean speech NAOMs Ŝα(k, l) Calculate ˆξ(k, l) a priori SNR Normalize Y(k, l) α to a root of an averaged power P Sα (k) of S(k, l) α Y α(k, l) = Y(k, l) α = Sα(k, l)+nα(k, l) with PSα (k) P Statistical models independent of speaker loudness S α (k) = 1 L Normalized energy of clean speech NAOMs E[S 2 α (k)] = 1 L S(k, l) 2α S α(k, l) & N α(k, l) realizations of random variables S α(k) & N α(k) l=1 Estimate S α(k, l) from Y α(k, l) given models for S α(k)& N α(k) A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 2 / 10
6 Modeling of noise NAOM coefficients N α (k, l) N(k, l) N c(n; 0, λ N (k, l)) N α(k, l) Weibull distributed p Nα(k,l)(n) = Weib(n; λ Nα (k, l), α) Shape parameter α R >0 Scale parameter λ Nα (k, l) = λ N(k, l) R α >0 PSα (k) Model N α(k) with Weibull PDF p Nα(k)(n) = Weib(n; λ Nα (k), α) with λ Nα (k) = 1 L λ Nα (k, l) L l=1 NAOM coefficients of white noise signal and estimated p Nα(k)(n) Weib(n; 1, α) p Nα(n) Weibull PDF for λ = 1 and different α n Histogram and Weibull PDF for α = Noise NAOMs Weibull PDF n A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 3 / 10
7 Modeling of NAOM coefficients of clean speech S α (k, l) S(k, l) N c(n; 0, λ S (k, l)) Bimodal Weibull mixture model (WMM) to model S α(k) 2 p Sα(k)(s) = π m(k) Weib(s; λ m(k), β) m=1 10 Histogram and estimated WMM Clean speech NAOMs Bimodal WMM m = 1 component m = 2 component m = 1 : silence m = 2 : activity π m(k) [0, 1]: weights λ m(k): scale parameters β: shape parameter p Sα (s) 1 β α : additional degree of freedom in the model Clean speech NAOMs & estimated WMM (α = 0.7; β = 2.5) s A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 4 / 10
8 Estimation of WMM parameters and clean speech NAOMs A priori SNR estimator Y(k, l) 2 ˆλN(k, l) Estimate PSα(k) and go into NAOM domain Yα(k, l) λnα(k, l) Estimate parameter of WMM psα(s) λm(k, l) πm(k, l) Estimate clean speech NAOMs Ŝα(k, l) Calculate ˆξ(k, l) a priori SNR Set λ 1(k) acc. to ξ min usually used in a priori SNR estimation [Cappe 94] Expectation Maximization algorithm to estimate λ 2(k), π m(k) After EM, weights π m(k) are corrected with the constraint E[S 2 α(k)] = 1 A priori SNR estimator Y(k, l) 2 ˆλN(k, l) Estimate PSα(k) and go into NAOM domain Yα(k, l) λnα(k, l) Estimate parameter of WMM psα(s) λm(k, l) πm(k, l) Estimate clean speech NAOMs Ŝα(k, l) Calculate ˆξ(k, l) a priori SNR Maximum a posteriori (MAP) estimation: Ŝα MAP (k, l) = argmax p Sα(k) Y α(k,l)(s y) s Y α(k, l) is a realisation of random variable Y α(k) = S α(k) + N α(k) Approximative computationally efficient solution for β = α = 1 A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 5 / 10
9 Calculation of a priori SNR and causal implementation A priori SNR estimator Y(k, l) 2 ˆλN(k, l) Estimate PSα(k) and go into NAOM domain Yα(k, l) λnα(k, l) Estimate parameter of WMM psα(s) λm(k, l) πm(k, l) Estimate clean speech NAOMs Ŝα(k, l) Calculate ˆξ(k, l) a priori SNR Go back into domain of power spectral density by calculating [Ŝα(k, l) P ] 2 α Sα(k) ˆξ(k, l) = max, ξ min λ N (k, l) Causal implementation of WMM-based a priori SNR estimators Calculate P Sα (k) and λ Nα (k) in a causal way Causal EM for λ 2(k) and π 2(k) with one EM-iteration per time frame Note, parameters α and β have to be set appropriately optimization A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 6 / 10
10 Table of contents 1 Problem formulation and motivation 2 A priori SNR estimation based on Weibull mixture model 3 Experimental evaluation 4 Conclusions and outlook A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 6 / 10
11 Experimental evaluation Data and setup Clean speech: Wall Street Journal database 16 khz (male and female) 7 different noise types of Noisex92 database: white, pink, f16, hfchannel, factory-1, factory-2, babble Input global SNR from 5 db up to 25 db in 5 db steps Spectral speech enhancement framework Noise PSD tracking using Minimum statistics approach [Martin 01] A priori SNR estimation with ξ min = 18 db [Cappe 94] Proposed WMM-based approach with Wiener filter Reference approach: Decision Directed [Ephraim 84] A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 7 / 10
12 Optimization of α and β Speech quality maximization in terms of wide-band mean opinion score listening quality objective (MOS-LQO) with MOS-LQO = max( MOS-LQO WMM MOS-LQO DD, 0) Averaging over genders, noise types and input global SNR values (α opt, β opt) = (0.64, 2.7) MOS-LQO β α A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 8 / 10
13 Final experimental results Clean speech: WSJ database signals other than used for optimization Estimation error Itakura-Saito distance (ISD) and estimator s variance logarithmic error variance (LEV): the smaller the better Resulting ISD, LEV and MOS-LQO values averaged over noise types SNR, db AVG ISD LEV MOS-LQO DD WMM DD WMM DD WMM A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 9 / 10
14 Conclusions and outlook Conclusions Novel causal a priori SNR estimator based on a bimodal Weibull mixture model for the normalized α-order spectral magnitudes (NAOMs) Optimization of the proposed approach by maximization of speech quality Power exponent α opt = 0.64 smaller than 1 (spectral magnitudes) Shape factor β opt = 2.7 a heavier tailed Weibull distribution Compared to the wide-spread Decision Directed approach: Reduced error and variance of the WMM-based a priori SNR estimator Improvement of speech quality of the enhanced signals Higher computational effort Outlook Reduction of computational effort fixed speaker-independent models Development of model-based spectral enhancement using generalized (arbitrary) power exponent in the spirit of generalized spectral subtraction A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 10 / 10
15 Thank you for your attention! Questions? Paderborn University Department of Communications Engineering Web: nt.upb.de Computer Science, Electrical Engineering and Mathematics Communications Engineering Prof. Dr.-Ing. Reinhold Häb-Umbach
16 Resulting WMM parameter and audio samples log(λ) π k Figure : Resulting WMM parameter over frequency bins λ mean λ mean 1 (k) 2 (k) π mean π mean 1 (k) 2 (k) Exemplarily speech samples: Noisy DD WMM A. Chinaev, J. Heitkaemper, R. Haeb-Umbach 10 / 10
Noise-Presence-Probability-Based Noise PSD Estimation by Using DNNs
Noise-Presence-Probability-Based Noise PSD Estimation by Using DNNs 12. ITG Fachtagung Sprachkommunikation Aleksej Chinaev, Jahn Heymann, Lukas Drude, Reinhold Haeb-Umbach Department of Communications
More informationImproved noise power spectral density tracking by a MAP-based postprocessor
Improved noise power spectral density tracking by a MAP-based postprocessor Aleksej Chinaev, Alexander Krueger, Dang Hai Tran Vu, Reinhold Haeb-Umbach University of Paderborn, Germany March 8th, 01 Computer
More informationA Priori SNR Estimation Using a Generalized Decision Directed Approach
A Priori SNR Estimation Using a Generalized Decision Directed Approach Aleksej Chinaev, Reinhold Haeb-Umbach Department of Communications Engineering, Paderborn University, 3398 Paderborn, Germany {chinaev,haeb}@nt.uni-paderborn.de
More informationA Priori SNR Estimation Using Weibull Mixture Model. Abstract. 2 MAP estimation of the a priori SNR based on Weibull mixture models.
A Priori SNR Estimation Using Weibull ixture odel Aleksej Chinaev, Jens Heitkaemper, Reinhold Haeb-Umbach Department of Communications Engineering, Paderborn University, 33100 Paderborn, Germany Email:
More informationOptimal Speech Enhancement Under Signal Presence Uncertainty Using Log-Spectral Amplitude Estimator
1 Optimal Speech Enhancement Under Signal Presence Uncertainty Using Log-Spectral Amplitude Estimator Israel Cohen Lamar Signal Processing Ltd. P.O.Box 573, Yokneam Ilit 20692, Israel E-mail: icohen@lamar.co.il
More information2D Spectrogram Filter for Single Channel Speech Enhancement
Proceedings of the 7th WSEAS International Conference on Signal, Speech and Image Processing, Beijing, China, September 15-17, 007 89 D Spectrogram Filter for Single Channel Speech Enhancement HUIJUN DING,
More informationModifying Voice Activity Detection in Low SNR by correction factors
Modifying Voice Activity Detection in Low SNR by correction factors H. Farsi, M. A. Mozaffarian, H.Rahmani Department of Electrical Engineering University of Birjand P.O. Box: +98-9775-376 IRAN hfarsi@birjand.ac.ir
More informationNew Statistical Model for the Enhancement of Noisy Speech
New Statistical Model for the Enhancement of Noisy Speech Electrical Engineering Department Technion - Israel Institute of Technology February 22, 27 Outline Problem Formulation and Motivation 1 Problem
More informationNOISE ROBUST RELATIVE TRANSFER FUNCTION ESTIMATION. M. Schwab, P. Noll, and T. Sikora. Technical University Berlin, Germany Communication System Group
NOISE ROBUST RELATIVE TRANSFER FUNCTION ESTIMATION M. Schwab, P. Noll, and T. Sikora Technical University Berlin, Germany Communication System Group Einsteinufer 17, 1557 Berlin (Germany) {schwab noll
More informationDigital Signal Processing
Digital Signal Processing 0 (010) 157 1578 Contents lists available at ScienceDirect Digital Signal Processing www.elsevier.com/locate/dsp Improved minima controlled recursive averaging technique using
More informationNon-Stationary Noise Power Spectral Density Estimation Based on Regional Statistics
Non-Stationary Noise Power Spectral Density Estimation Based on Regional Statistics Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud To cite this version: Xiaofei Li, Laurent Girin, Sharon Gannot,
More informationAdapting Wavenet for Speech Enhancement DARIO RETHAGE JULY 12, 2017
Adapting Wavenet for Speech Enhancement DARIO RETHAGE JULY 12, 2017 I am v Master Student v 6 months @ Music Technology Group, Universitat Pompeu Fabra v Deep learning for acoustic source separation v
More information"Robust Automatic Speech Recognition through on-line Semi Blind Source Extraction"
"Robust Automatic Speech Recognition through on-line Semi Blind Source Extraction" Francesco Nesta, Marco Matassoni {nesta, matassoni}@fbk.eu Fondazione Bruno Kessler-Irst, Trento (ITALY) For contacts:
More informationSource localization and separation for binaural hearing aids
Source localization and separation for binaural hearing aids Mehdi Zohourian, Gerald Enzner, Rainer Martin Listen Workshop, July 218 Institute of Communication Acoustics Outline 1 Introduction 2 Binaural
More informationA POSTERIORI SPEECH PRESENCE PROBABILITY ESTIMATION BASED ON AVERAGED OBSERVATIONS AND A SUPER-GAUSSIAN SPEECH MODEL
A POSTERIORI SPEECH PRESENCE PROBABILITY ESTIMATION BASED ON AVERAGED OBSERVATIONS AND A SUPER-GAUSSIAN SPEECH MODEL Balázs Fodor Institute for Communications Technology Technische Universität Braunschweig
More informationAnalysis of audio intercepts: Can we identify and locate the speaker?
Motivation Analysis of audio intercepts: Can we identify and locate the speaker? K V Vijay Girish, PhD Student Research Advisor: Prof A G Ramakrishnan Research Collaborator: Dr T V Ananthapadmanabha Medical
More informationEnhancement of Noisy Speech. State-of-the-Art and Perspectives
Enhancement of Noisy Speech State-of-the-Art and Perspectives Rainer Martin Institute of Communications Technology (IFN) Technical University of Braunschweig July, 2003 Applications of Noise Reduction
More informationNoise Reduction. Two Stage Mel-Warped Weiner Filter Approach
Noise Reduction Two Stage Mel-Warped Weiner Filter Approach Intellectual Property Advanced front-end feature extraction algorithm ETSI ES 202 050 V1.1.3 (2003-11) European Telecommunications Standards
More informationBIAS CORRECTION METHODS FOR ADAPTIVE RECURSIVE SMOOTHING WITH APPLICATIONS IN NOISE PSD ESTIMATION. Robert Rehr, Timo Gerkmann
BIAS CORRECTION METHODS FOR ADAPTIVE RECURSIVE SMOOTHING WITH APPLICATIONS IN NOISE PSD ESTIMATION Robert Rehr, Timo Gerkmann Speech Signal Processing Group, Department of Medical Physics and Acoustics
More informationA Second-Order-Statistics-based Solution for Online Multichannel Noise Tracking and Reduction
A Second-Order-Statistics-based Solution for Online Multichannel Noise Tracking and Reduction Mehrez Souden, Jingdong Chen, Jacob Benesty, and Sofiène Affes Abstract We propose a second-order-statistics-based
More informationSINGLE-CHANNEL SPEECH PRESENCE PROBABILITY ESTIMATION USING INTER-FRAME AND INTER-BAND CORRELATIONS
204 IEEE International Conference on Acoustic, Speech and Signal Processing (ICASSP) SINGLE-CHANNEL SPEECH PRESENCE PROBABILITY ESTIMATION USING INTER-FRAME AND INTER-BAND CORRELATIONS Hajar Momeni,2,,
More informationA SPEECH PRESENCE PROBABILITY ESTIMATOR BASED ON FIXED PRIORS AND A HEAVY-TAILED SPEECH MODEL
A SPEECH PRESENCE PROBABILITY ESTIMATOR BASED ON FIXED PRIORS AND A HEAVY-TAILED SPEECH MODEL Balázs Fodor Institute for Communications Technology Technische Universität Braunschweig 386 Braunschweig,
More informationSPEECH enhancement algorithms are often used in communication
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 25, NO. 2, FEBRUARY 2017 397 An Analysis of Adaptive Recursive Smoothing with Applications to Noise PSD Estimation Robert Rehr, Student
More informationSpectral Domain Speech Enhancement using HMM State-Dependent Super-Gaussian Priors
IEEE SIGNAL PROCESSING LETTERS 1 Spectral Domain Speech Enhancement using HMM State-Dependent Super-Gaussian Priors Nasser Mohammadiha, Student Member, IEEE, Rainer Martin, Fellow, IEEE, and Arne Leijon,
More informationMinimum Mean-Square Error Estimation of Mel-Frequency Cepstral Features A Theoretically Consistent Approach
Minimum Mean-Square Error Estimation of Mel-Frequency Cepstral Features A Theoretically Consistent Approach Jesper Jensen Abstract In this work we consider the problem of feature enhancement for noise-robust
More informationA SUBSPACE METHOD FOR SPEECH ENHANCEMENT IN THE MODULATION DOMAIN. Yu Wang and Mike Brookes
A SUBSPACE METHOD FOR SPEECH ENHANCEMENT IN THE MODULATION DOMAIN Yu ang and Mike Brookes Department of Electrical and Electronic Engineering, Exhibition Road, Imperial College London, UK Email: {yw09,
More informationSystem Identification and Adaptive Filtering in the Short-Time Fourier Transform Domain
System Identification and Adaptive Filtering in the Short-Time Fourier Transform Domain Electrical Engineering Department Technion - Israel Institute of Technology Supervised by: Prof. Israel Cohen Outline
More informationSupervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING Supervised and Unsupervised Speech Enhancement Using Nonnegative Matrix Factorization Nasser Mohammadiha*, Student Member, IEEE, Paris Smaragdis,
More informationA Generalized Subspace Approach for Enhancing Speech Corrupted by Colored Noise
334 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL 11, NO 4, JULY 2003 A Generalized Subspace Approach for Enhancing Speech Corrupted by Colored Noise Yi Hu, Student Member, IEEE, and Philipos C
More informationMANY digital speech communication applications, e.g.,
406 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 2, FEBRUARY 2007 An MMSE Estimator for Speech Enhancement Under a Combined Stochastic Deterministic Speech Model Richard C.
More informationImproved Speech Presence Probabilities Using HMM-Based Inference, with Applications to Speech Enhancement and ASR
Improved Speech Presence Probabilities Using HMM-Based Inference, with Applications to Speech Enhancement and ASR Bengt J. Borgström, Student Member, IEEE, and Abeer Alwan, IEEE Fellow Abstract This paper
More informationA Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement
A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement Simon Leglaive 1 Laurent Girin 1,2 Radu Horaud 1 1: Inria Grenoble Rhône-Alpes 2: Univ. Grenoble Alpes, Grenoble INP,
More informationModeling speech signals in the time frequency domain using GARCH
Signal Processing () 53 59 Fast communication Modeling speech signals in the time frequency domain using GARCH Israel Cohen Department of Electrical Engineering, Technion Israel Institute of Technology,
More informationAcoustic MIMO Signal Processing
Yiteng Huang Jacob Benesty Jingdong Chen Acoustic MIMO Signal Processing With 71 Figures Ö Springer Contents 1 Introduction 1 1.1 Acoustic MIMO Signal Processing 1 1.2 Organization of the Book 4 Part I
More informationSPEECH ENHANCEMENT USING PCA AND VARIANCE OF THE RECONSTRUCTION ERROR IN DISTRIBUTED SPEECH RECOGNITION
SPEECH ENHANCEMENT USING PCA AND VARIANCE OF THE RECONSTRUCTION ERROR IN DISTRIBUTED SPEECH RECOGNITION Amin Haji Abolhassani 1, Sid-Ahmed Selouani 2, Douglas O Shaughnessy 1 1 INRS-Energie-Matériaux-Télécommunications,
More informationCEPSTRAL analysis has been widely used in signal processing
162 IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, VOL. 7, NO. 2, MARCH 1999 On Second-Order Statistics and Linear Estimation of Cepstral Coefficients Yariv Ephraim, Fellow, IEEE, and Mazin Rahim, Senior
More informationExemplar-based voice conversion using non-negative spectrogram deconvolution
Exemplar-based voice conversion using non-negative spectrogram deconvolution Zhizheng Wu 1, Tuomas Virtanen 2, Tomi Kinnunen 3, Eng Siong Chng 1, Haizhou Li 1,4 1 Nanyang Technological University, Singapore
More informationGaussian Mixture Model Uncertainty Learning (GMMUL) Version 1.0 User Guide
Gaussian Mixture Model Uncertainty Learning (GMMUL) Version 1. User Guide Alexey Ozerov 1, Mathieu Lagrange and Emmanuel Vincent 1 1 INRIA, Centre de Rennes - Bretagne Atlantique Campus de Beaulieu, 3
More informationConvention Paper Presented at the 128th Convention 2010 May London, UK
Audio Engineering Society Convention Paper Presented at the 128th Convention 2010 May 22 25 London, UK 8130 The papers at this Convention have been selected on the basis of a submitted abstract and extended
More informationESTIMATION OF RELATIVE TRANSFER FUNCTION IN THE PRESENCE OF STATIONARY NOISE BASED ON SEGMENTAL POWER SPECTRAL DENSITY MATRIX SUBTRACTION
ESTIMATION OF RELATIVE TRANSFER FUNCTION IN THE PRESENCE OF STATIONARY NOISE BASED ON SEGMENTAL POWER SPECTRAL DENSITY MATRIX SUBTRACTION Xiaofei Li 1, Laurent Girin 1,, Radu Horaud 1 1 INRIA Grenoble
More informationSPEECH enhancement has been studied extensively as a
JOURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2017 1 Phase-Aware Speech Enhancement Based on Deep Neural Networks Naijun Zheng and Xiao-Lei Zhang Abstract Short-time frequency transform STFT)
More informationSNR Features for Automatic Speech Recognition
SNR Features for Automatic Speech Recognition Philip N. Garner Idiap Research Institute Martigny, Switzerland pgarner@idiap.ch Abstract When combined with cepstral normalisation techniques, the features
More informationALGONQUIN - Learning dynamic noise models from noisy speech for robust speech recognition
ALGONQUIN - Learning dynamic noise models from noisy speech for robust speech recognition Brendan J. Freyl, Trausti T. Kristjanssonl, Li Deng 2, Alex Acero 2 1 Probabilistic and Statistical Inference Group,
More information3. ESTIMATION OF SIGNALS USING A LEAST SQUARES TECHNIQUE
3. ESTIMATION OF SIGNALS USING A LEAST SQUARES TECHNIQUE 3.0 INTRODUCTION The purpose of this chapter is to introduce estimators shortly. More elaborated courses on System Identification, which are given
More informationJoint Filtering and Factorization for Recovering Latent Structure from Noisy Speech Data
Joint Filtering and Factorization for Recovering Latent Structure from Noisy Speech Data Colin Vaz, Vikram Ramanarayanan, and Shrikanth Narayanan Ming Hsieh Department of Electrical Engineering University
More informationRAO-BLACKWELLISED PARTICLE FILTERS: EXAMPLES OF APPLICATIONS
RAO-BLACKWELLISED PARTICLE FILTERS: EXAMPLES OF APPLICATIONS Frédéric Mustière e-mail: mustiere@site.uottawa.ca Miodrag Bolić e-mail: mbolic@site.uottawa.ca Martin Bouchard e-mail: bouchard@site.uottawa.ca
More informationSingle Channel Signal Separation Using MAP-based Subspace Decomposition
Single Channel Signal Separation Using MAP-based Subspace Decomposition Gil-Jin Jang, Te-Won Lee, and Yung-Hwan Oh 1 Spoken Language Laboratory, Department of Computer Science, KAIST 373-1 Gusong-dong,
More informationCovariance smoothing and consistent Wiener filtering for artifact reduction in audio source separation
Covariance smoothing and consistent Wiener filtering for artifact reduction in audio source separation Emmanuel Vincent METISS Team Inria Rennes - Bretagne Atlantique E. Vincent (Inria) Artifact reduction
More informationMonaural speech separation using source-adapted models
Monaural speech separation using source-adapted models Ron Weiss, Dan Ellis {ronw,dpwe}@ee.columbia.edu LabROSA Department of Electrical Enginering Columbia University 007 IEEE Workshop on Applications
More informationApplication of the Tuned Kalman Filter in Speech Enhancement
Application of the Tuned Kalman Filter in Speech Enhancement Orchisama Das, Bhaswati Goswami and Ratna Ghosh Department of Instrumentation and Electronics Engineering Jadavpur University Kolkata, India
More informationA priori SNR estimation and noise estimation for speech enhancement
Yao et al. EURASIP Journal on Advances in Signal Processing (2016) 2016:101 DOI 10.1186/s13634-016-0398-z EURASIP Journal on Advances in Signal Processing RESEARCH A priori SNR estimation and noise estimation
More informationA State-Space Approach to Dynamic Nonnegative Matrix Factorization
1 A State-Space Approach to Dynamic Nonnegative Matrix Factorization Nasser Mohammadiha, Paris Smaragdis, Ghazaleh Panahandeh, Simon Doclo arxiv:179.5v1 [cs.lg] 31 Aug 17 Abstract Nonnegative matrix factorization
More informationLecture Notes 5: Multiresolution Analysis
Optimization-based data analysis Fall 2017 Lecture Notes 5: Multiresolution Analysis 1 Frames A frame is a generalization of an orthonormal basis. The inner products between the vectors in a frame and
More informationA Combined Statistical and Machine Learning Approach For Single Channel Speech Enhancement
A Combined Statistical and Machine Learning Approach For Single Channel Speech Enhancement A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY Hung-Wei Tseng IN PARTIAL
More informationTime-domain noise reduction based on an orthogonal decomposition for desired signal extraction
Time-domain noise reduction based on an orthogonal decomposition for desired signal extraction Jacob Benesty INRS-EMT, University of Quebec, 800 de la Gauchetiere Ouest, Suite 6900, Montreal, Quebec H5A
More informationMachine Recognition of Sounds in Mixtures
Machine Recognition of Sounds in Mixtures Outline 1 2 3 4 Computational Auditory Scene Analysis Speech Recognition as Source Formation Sound Fragment Decoding Results & Conclusions Dan Ellis
More informationComparison of RTF Estimation Methods between a Head-Mounted Binaural Hearing Device and an External Microphone
Comparison of RTF Estimation Methods between a Head-Mounted Binaural Hearing Device and an External Microphone Nico Gößling, Daniel Marquardt and Simon Doclo Department of Medical Physics and Acoustics
More information4 Adaptive Multimodal Fusion by Uncertainty Compensation with Application to Audio-Visual Speech Recognition
4 Adaptive Multimodal Fusion by Uncertainty Compensation with Application to Audio-Visual Speech Recognition George Papandreou, Athanassios Katsamanis, Vassilis Pitsikalis, and Petros Maragos National
More informationNOISE reduction is an important fundamental signal
1526 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 20, NO. 5, JULY 2012 Non-Causal Time-Domain Filters for Single-Channel Noise Reduction Jesper Rindom Jensen, Student Member, IEEE,
More informationMAP-BASED ESTIMATION OF THE PARAMETERS OF A GAUSSIAN MIXTURE MODEL IN THE PRESENCE OF NOISY OBSERVATIONS. Aleksej Chinaev, Reinhold Haeb-Umbach
MAP-BASED ESTIMATION OF THE PARAMETERS OF A GAUSSIAN MIXTURE MODEL IN THE PRESENCE OF NOISY OBSERVATIONS Alesej Chinaev, Reinhold Haeb-Umbach Department of Communications Engineering, University of Paderborn,
More informationA geometric approach to spectral subtraction
Available online at www.sciencedirect.com Speech Communication 5 (8) 53 66 www.elsevier.com/locate/specom A geometric approach to spectral subtraction Yang Lu, Philipos C. Loizou * Department of Electrical
More informationA SPECTRAL SUBTRACTION RULE FOR REAL-TIME DSP IMPLEMENTATION OF NOISE REDUCTION IN SPEECH SIGNALS
Proc. of the 1 th Int. Conference on Digital Audio Effects (DAFx-9), Como, Italy, September 1-4, 9 A SPECTRAL SUBTRACTION RULE FOR REAL-TIME DSP IMPLEMENTATION OF NOISE REDUCTION IN SPEECH SIGNALS Matteo
More informationOverview of Single Channel Noise Suppression Algorithms
Overview of Single Channel Noise Suppression Algorithms Matías Zañartu Salas Post Doctoral Research Associate, Purdue University mzanartu@purdue.edu October 4, 2010 General notes Only single-channel speech
More informationON ADVERSARIAL TRAINING AND LOSS FUNCTIONS FOR SPEECH ENHANCEMENT. Ashutosh Pandey 1 and Deliang Wang 1,2. {pandey.99, wang.5664,
ON ADVERSARIAL TRAINING AND LOSS FUNCTIONS FOR SPEECH ENHANCEMENT Ashutosh Pandey and Deliang Wang,2 Department of Computer Science and Engineering, The Ohio State University, USA 2 Center for Cognitive
More informationEMPLOYING PHASE INFORMATION FOR AUDIO DENOISING. İlker Bayram. Istanbul Technical University, Istanbul, Turkey
EMPLOYING PHASE INFORMATION FOR AUDIO DENOISING İlker Bayram Istanbul Technical University, Istanbul, Turkey ABSTRACT Spectral audio denoising methods usually make use of the magnitudes of a time-frequency
More informationIndependent Component Analysis and Unsupervised Learning. Jen-Tzung Chien
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent voices Nonparametric likelihood
More informationSINGLE-CHANNEL speech enhancement methods based
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 15, NO. 6, AUGUST 2007 1741 Minimum Mean-Square Error Estimation of Discrete Fourier Coefficients With Generalized Gamma Priors Jan S.
More informationSpatial Diffuseness Features for DNN-Based Speech Recognition in Noisy and Reverberant Environments
Spatial Diffuseness Features for DNN-Based Speech Recognition in Noisy and Reverberant Environments Andreas Schwarz, Christian Huemmer, Roland Maas, Walter Kellermann Lehrstuhl für Multimediakommunikation
More informationA Low-Cost Robust Front-end for Embedded ASR System
A Low-Cost Robust Front-end for Embedded ASR System Lihui Guo 1, Xin He 2, Yue Lu 1, and Yaxin Zhang 2 1 Department of Computer Science and Technology, East China Normal University, Shanghai 200062 2 Motorola
More informationMULTI-RESOLUTION SIGNAL DECOMPOSITION WITH TIME-DOMAIN SPECTROGRAM FACTORIZATION. Hirokazu Kameoka
MULTI-RESOLUTION SIGNAL DECOMPOSITION WITH TIME-DOMAIN SPECTROGRAM FACTORIZATION Hiroazu Kameoa The University of Toyo / Nippon Telegraph and Telephone Corporation ABSTRACT This paper proposes a novel
More informationExploring the Relationship between Conic Affinity of NMF Dictionaries and Speech Enhancement Metrics
Interspeech 2018 2-6 September 2018, Hyderabad Exploring the Relationship between Conic Affinity of NMF Dictionaries and Speech Enhancement Metrics Pavlos Papadopoulos, Colin Vaz, Shrikanth Narayanan Signal
More informationSINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS. Emad M. Grais and Hakan Erdogan
SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS Emad M. Grais and Hakan Erdogan Faculty of Engineering and Natural Sciences, Sabanci University, Orhanli
More informationRobust Sound Event Detection in Continuous Audio Environments
Robust Sound Event Detection in Continuous Audio Environments Haomin Zhang 1, Ian McLoughlin 2,1, Yan Song 1 1 National Engineering Laboratory of Speech and Language Information Processing The University
More informationShort-Time ICA for Blind Separation of Noisy Speech
Short-Time ICA for Blind Separation of Noisy Speech Jing Zhang, P.C. Ching Department of Electronic Engineering The Chinese University of Hong Kong, Hong Kong jzhang@ee.cuhk.edu.hk, pcching@ee.cuhk.edu.hk
More informationChapter [4] "Operations on a Single Random Variable"
Chapter [4] "Operations on a Single Random Variable" 4.1 Introduction In our study of random variables we use the probability density function or the cumulative distribution function to provide a complete
More informationSpeech Enhancement Based on Statistical Modeling of Teager Energy Operated Perceptual Wavelet Packet Coefficients and Adaptive Thresholding Function
Speech Enhancement Based on Statistical Modeling of Teager Energy Operated Perceptual Wavelet Packet Coefficients and Adaptive Thresholding Function by Md. Tauhidul Islam MASTER OF SCIENCE IN ELECTRICAL
More informationSimple and efficient solutions to the problems associated with acoustic echo cancellation
Scholars' Mine Doctoral Dissertations Student Research & Creative Works Summer 2007 Simple and efficient solutions to the problems associated with acoustic echo cancellation Asif Iqbal Mohammad Follow
More informationConsistent Wiener Filtering for Audio Source Separation
MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.merl.com Consistent Wiener Filtering for Audio Source Separation Le Roux, J.; Vincent, E. TR2012-090 October 2012 Abstract Wiener filtering is one of
More informationTracking of Spread Spectrum Signals
Chapter 7 Tracking of Spread Spectrum Signals 7. Introduction As discussed in the last chapter, there are two parts to the synchronization process. The first stage is often termed acquisition and typically
More informationUSING STATISTICAL ROOM ACOUSTICS FOR ANALYSING THE OUTPUT SNR OF THE MWF IN ACOUSTIC SENSOR NETWORKS. Toby Christian Lawin-Ore, Simon Doclo
th European Signal Processing Conference (EUSIPCO 1 Bucharest, Romania, August 7-31, 1 USING STATISTICAL ROOM ACOUSTICS FOR ANALYSING THE OUTPUT SNR OF THE MWF IN ACOUSTIC SENSOR NETWORKS Toby Christian
More informationSPEECH ENHANCEMENT USING A LAPLACIAN-BASED MMSE ESTIMATOR OF THE MAGNITUDE SPECTRUM
SPEECH ENHANCEMENT USING A LAPLACIAN-BASED MMSE ESTIMATOR OF THE MAGNITUDE SPECTRUM APPROVED BY SUPERVISORY COMMITTEE: Dr. Philipos C. Loizou, Chair Dr. Mohammad Saquib Dr. Issa M S Panahi Dr. Hlaing Minn
More informationMAXIMUM LIKELIHOOD BASED NOISE COVARIANCE MATRIX ESTIMATION FOR MULTI-MICROPHONE SPEECH ENHANCEMENT. Ulrik Kjems and Jesper Jensen
20th European Signal Processing Conference (EUSIPCO 202) Bucharest, Romania, August 27-3, 202 MAXIMUM LIKELIHOOD BASED NOISE COVARIANCE MATRIX ESTIMATION FOR MULTI-MICROPHONE SPEECH ENHANCEMENT Ulrik Kjems
More informationIndependent Component Analysis and Unsupervised Learning
Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent
More informationCS578- Speech Signal Processing
CS578- Speech Signal Processing Lecture 7: Speech Coding Yannis Stylianou University of Crete, Computer Science Dept., Multimedia Informatics Lab yannis@csd.uoc.gr Univ. of Crete Outline 1 Introduction
More informationOn Optimal Smoothing in Minimum Statistics Based Noise Tracking
INTERSPEECH 25 On Optima Smoothing in Minimum Statistics Based Noise Tracking Aeksej Chinaev, Reinhod Haeb-Umbach Department of Communications Engineering, University of Paderborn, 3398 Paderborn, Germany
More informationEfficient Target Activity Detection Based on Recurrent Neural Networks
Efficient Target Activity Detection Based on Recurrent Neural Networks D. Gerber, S. Meier, and W. Kellermann Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) Motivation Target φ tar oise 1 / 15
More informationAutomatic Speech Recognition (CS753)
Automatic Speech Recognition (CS753) Lecture 12: Acoustic Feature Extraction for ASR Instructor: Preethi Jyothi Feb 13, 2017 Speech Signal Analysis Generate discrete samples A frame Need to focus on short
More informationE&CE 358, Winter 2016: Solution #2. Prof. X. Shen
E&CE 358, Winter 16: Solution # Prof. X. Shen Email: xshen@bbcr.uwaterloo.ca Prof. X. Shen E&CE 358, Winter 16 ( 1:3-:5 PM: Solution # Problem 1 Problem 1 The signal g(t = e t, t T is corrupted by additive
More informationBlind Spectral-GMM Estimation for Underdetermined Instantaneous Audio Source Separation
Blind Spectral-GMM Estimation for Underdetermined Instantaneous Audio Source Separation Simon Arberet 1, Alexey Ozerov 2, Rémi Gribonval 1, and Frédéric Bimbot 1 1 METISS Group, IRISA-INRIA Campus de Beaulieu,
More informationGaussian Processes for Audio Feature Extraction
Gaussian Processes for Audio Feature Extraction Dr. Richard E. Turner (ret26@cam.ac.uk) Computational and Biological Learning Lab Department of Engineering University of Cambridge Machine hearing pipeline
More informationInternational Journal of Advanced Research in Computer Science and Software Engineering
Volume 2, Issue 11, November 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Acoustic Source
More informationBlind Instantaneous Noisy Mixture Separation with Best Interference-plus-noise Rejection
Blind Instantaneous Noisy Mixture Separation with Best Interference-plus-noise Rejection Zbyněk Koldovský 1,2 and Petr Tichavský 1 1 Institute of Information Theory and Automation, Pod vodárenskou věží
More informationIEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 3, MARCH
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 18, NO. 3, MARCH 2010 589 Gamma Markov Random Fields for Audio Source Modeling Onur Dikmen, Member, IEEE, and A. Taylan Cemgil, Member,
More informationBayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement
Bayesian Estimation of Time-Frequency Coefficients for Audio Signal Enhancement Patrick J. Wolfe Department of Engineering University of Cambridge Cambridge CB2 1PZ, UK pjw47@eng.cam.ac.uk Simon J. Godsill
More informationNoise Classification based on PCA. Nattanun Thatphithakkul, Boontee Kruatrachue, Chai Wutiwiwatchai, Vataya Boonpiam
Noise Classification based on PCA Nattanun Thatphithakkul, Boontee Kruatrachue, Chai Wutiwiwatchai, Vataya Boonpiam 1 Outline Introduction Principle component analysis (PCA) Classification using PCA Experiment
More informationMulti User Detection I
January 12, 2005 Outline Overview Multiple Access Communication Motivation: What is MU Detection? Overview of DS/CDMA systems Concept and Codes used in CDMA CDMA Channels Models Synchronous and Asynchronous
More informationDistributed MAP probability estimation of dynamic systems with wireless sensor networks
Distributed MAP probability estimation of dynamic systems with wireless sensor networks Felicia Jakubiec, Alejandro Ribeiro Dept. of Electrical and Systems Engineering University of Pennsylvania https://fling.seas.upenn.edu/~life/wiki/
More informationHumans do not maximize the probability of correct decision when recognizing DANTALE words in noise
INTERSPEECH 207 August 20 24, 207, Stockholm, Sweden Humans do not maximize the probability of correct decision when recognizing DANTALE words in noise Mohsen Zareian Jahromi, Jan Østergaard, Jesper Jensen
More informationAdaptiveFilters. GJRE-F Classification : FOR Code:
Global Journal of Researches in Engineering: F Electrical and Electronics Engineering Volume 14 Issue 7 Version 1.0 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global Journals
More informationRevision of Lecture 4
Revision of Lecture 4 We have discussed all basic components of MODEM Pulse shaping Tx/Rx filter pair Modulator/demodulator Bits map symbols Discussions assume ideal channel, and for dispersive channel
More information