Enhancement of Noisy Speech. State-of-the-Art and Perspectives
1 Enhancement of Noisy Speech: State-of-the-Art and Perspectives. Rainer Martin, Institute of Communications Technology (IFN), Technical University of Braunschweig, July 2003
2 Applications of Noise Reduction: Hands-free telephony. Robust speech recognition. Robust speech coding (ETSI/3GPP AMR, MELPe, ITU-T 4 kbit/s codecs). Hearing aids and cochlear implants. Restoration of historic recordings. Forensic applications.
3 Ingredients: Models of speech production. Signal theory. Room acoustics. Psychoacoustics. Models of speech perception. Objective: Improve quality and intelligibility! Combine signal theoretic and perceptive approaches!
4 Noise Reduction in the Spectral Domain. Spectral analysis, noise reduction, synthesis: segmentation, analysis, DFT, noise reduction, IDFT, synthesis, overlap/add. Advantages of spectral processing: good separation of speech and noise; decorrelation of spectral components; integration of psychoacoustic models.
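The analysis, modification, and overlap/add chain on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not in the slides: a square-root Hann window at 50% overlap, a frame length of 256 samples, and the hypothetical names `spectral_process` and `modify_spectrum`.

```python
import numpy as np

def spectral_process(y, modify_spectrum, frame_len=256):
    """Segment, window, DFT, modify, IDFT, window, overlap/add."""
    hop = frame_len // 2
    n = np.arange(frame_len)
    # Periodic square-root Hann window (assumed): analysis * synthesis window
    # equals a Hann window, which sums to one at 50% overlap.
    window = np.sqrt(0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len))
    out = np.zeros(len(y))
    for start in range(0, len(y) - frame_len + 1, hop):
        frame = window * y[start:start + frame_len]
        spectrum = np.fft.rfft(frame)            # analysis (DFT)
        spectrum = modify_spectrum(spectrum)     # noise reduction would go here
        out[start:start + frame_len] += window * np.fft.irfft(spectrum, frame_len)
    return out

rng = np.random.default_rng(0)
y = rng.standard_normal(4096)
# With an identity modification the interior of the signal is reconstructed.
passthrough = spectral_process(y, lambda spectrum: spectrum)
```

Only samples covered by two overlapping frames are reconstructed exactly; the first and last half frame are attenuated by the window.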
5 Principles of Noise Reduction. [Block diagram: the noisy signal $y(i) = s(i) + n(i)$ is transformed by the DFT into $Y(\lambda, \Omega_k)$; the noise power spectral density $P_{nn}(\lambda, \Omega_k)$ is estimated; the speech coefficients $\hat{S}(\lambda, \Omega_k)$ are estimated; the IDFT yields $\hat{s}(i)$. Both estimation blocks use a priori knowledge.] $\lambda$: frame index; $k$: frequency bin index.
6 Principles of Noise Reduction. [Figure: power (dB) versus frequency (Hz); curves: noisy signal spectrum, car noise spectrum, enhanced spectrum.]
7 Estimation of Speech Coefficients. Linear estimators, e.g. Wiener filter. Non-linear estimators: MMSE short-time spectral amplitude estimator [Ephraim & Malah, 1984, 1985]; psychoacoustic methods [Gustafsson et al. 1998]; MMSE estimation based on supergaussian priors [Martin 2002].
8 MMSE Estimation. Optimal estimate for independent real and imaginary parts: $E\{S \mid Y\} = E\{S_R \mid Y_R\} + j\,E\{S_I \mid Y_I\}$. Estimation of either the real or the imaginary part (written $S$ and $Y$ for short): $E\{S \mid Y\} = \int S\, p(S \mid Y)\, dS$. Application of Bayes' theorem: $E\{S \mid Y\} = \frac{1}{p(Y)} \int S\, p(Y \mid S)\, p(S)\, dS$. What is the appropriate prior density $p(S)$?
9 Some Answers and Some Questions. DFT coefficients are asymptotically complex Gaussian distributed! [Brillinger, 1981] Typical frame size in mobile communications: ms < span of correlation of (voiced) speech! Do the asymptotic assumptions hold for speech signals? No! See, e.g., [Porter and Boll, 1984].
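10 Prior Densities for Real and Imaginary Part. Gaussian pdf (leads to the Wiener filter): $p(S) = \frac{1}{\sqrt{\pi}\,\sigma_s} \exp\left(-\frac{S^2}{\sigma_s^2}\right)$. Laplacian pdf: $p(S) = \frac{1}{\sigma_s} \exp\left(-\frac{2|S|}{\sigma_s}\right)$. Gamma pdf: $p(S) = \frac{\sqrt[4]{3}}{2\sqrt{\pi\sigma_s}\,\sqrt[4]{2}}\, |S|^{-1/2} \exp\left(-\frac{\sqrt{3}\,|S|}{\sqrt{2}\,\sigma_s}\right)$.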
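To make the role of the prior concrete, the conditional mean $E\{S \mid Y\}$ can be evaluated by brute-force numerical integration over a grid. The sketch below is illustrative only: it assumes Gaussian noise with variance $\sigma_n^2/2$ per real or imaginary part, uses a plain Riemann sum, and the names `gaussian_prior`, `laplacian_prior`, and `mmse_estimate` are hypothetical. With the Gaussian prior it should reproduce the Wiener gain $\sigma_s^2/(\sigma_s^2+\sigma_n^2)$; the Laplacian prior gives a non-linear input/output curve.

```python
import numpy as np

def gaussian_prior(s, sigma_s):
    # p(S) = exp(-S^2 / sigma_s^2) / (sqrt(pi) * sigma_s)
    return np.exp(-s**2 / sigma_s**2) / (np.sqrt(np.pi) * sigma_s)

def laplacian_prior(s, sigma_s):
    # p(S) = exp(-2 |S| / sigma_s) / sigma_s
    return np.exp(-2.0 * np.abs(s) / sigma_s) / sigma_s

def mmse_estimate(y, prior, sigma_s, sigma_n, half_width=12.0, num=20001):
    """E{S|Y=y} = integral of S p(y|S) p(S) dS divided by p(y), on a grid."""
    s = np.linspace(-half_width * sigma_s, half_width * sigma_s, num)
    # Assumed Gaussian noise; the normalising constant cancels in the ratio.
    likelihood = np.exp(-(y - s)**2 / sigma_n**2)
    weights = likelihood * prior(s, sigma_s)
    return np.sum(s * weights) / np.sum(weights)

wiener_like = mmse_estimate(1.0, gaussian_prior, sigma_s=1.0, sigma_n=1.0)
laplace_like = mmse_estimate(1.0, laplacian_prior, sigma_s=1.0, sigma_n=1.0)
print(wiener_like, laplace_like)
```

The Gamma prior is left out of this sketch because its integrable singularity at $S = 0$ needs a more careful quadrature than a uniform grid.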
11 Histogram of DFT Coefficients for Speech. [Figure: histogram and pdfs versus $S_R$; dotted: Gaussian pdf; dashed: Laplacian pdf; solid: Gamma pdf.]
12 Histogram of Speech Coefficients (enlarged). [Figure: histogram and pdfs versus $S_R$; dotted: Gaussian pdf; dashed: Laplacian pdf; solid: Gamma pdf.]
13 Histogram of DFT Coefficients for Car Noise. [Figure: histogram and pdfs versus $N_R$; dotted: Gaussian pdf; dashed: Laplacian pdf.]
14 Histogram of Car Coefficients (enlarged). [Figure: histogram and pdfs versus $N_R$; dotted: Gaussian pdf; dashed: Laplacian pdf.]
15 Non-linear MMSE Estimator. Laplacian noise and Gamma speech prior, $\sigma_s^2 + \sigma_n^2 = 2$. [Figure: $E\{S_R \mid Y_R\}$ versus $Y_R$ for the Gamma speech pdf and the Wiener filter, with curves labelled $10 \log(\sigma_s^2/\sigma_n^2)$ = +15 dB, 0 dB, 10 dB.]
16 Segmental SNR Improvement (White Noise). [Figure: segmental SNR after enhancement versus segmental SNR before enhancement; curves: Laplace/Laplace, Gamma/Gauß, Wiener, no enhancement.]
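For readers unfamiliar with the metric on the axes, the following is a minimal sketch of a segmental SNR computation. The slides do not give the exact definition used, so the frame length, the clipping range of -10 dB to 35 dB (a common convention), and the function name `segmental_snr` are assumptions.

```python
import numpy as np

def segmental_snr(clean, processed, frame_len=256, floor_db=-10.0, ceil_db=35.0):
    """Average of per-frame SNR values in dB, each clipped to [floor_db, ceil_db]."""
    eps = 1e-12  # guards against division by zero and log of zero
    n_frames = len(clean) // frame_len
    values = []
    for m in range(n_frames):
        seg = slice(m * frame_len, (m + 1) * frame_len)
        signal_energy = np.sum(clean[seg] ** 2)
        error_energy = np.sum((clean[seg] - processed[seg]) ** 2)
        snr_db = 10.0 * np.log10((signal_energy + eps) / (error_energy + eps))
        values.append(np.clip(snr_db, floor_db, ceil_db))
    return float(np.mean(values))

rng = np.random.default_rng(0)
clean = rng.standard_normal(16384)
noisy = clean + 0.1 * rng.standard_normal(16384)  # about 20 dB SNR
print(segmental_snr(clean, noisy))
```

Computing the value once for the noisy input and once for the enhanced output gives the two coordinates of a point in a plot like the one on this slide.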
17 Relative Improvement w.r.t. Wiener Filter. [Figure: relative improvement versus segmental SNR of input signal.]
18 Background Noise PSD Estimation. Methods: Voice activity detection; Soft-decision methods; Bias-compensated tracking of spectral minima [Martin 1994, 2001]. Assumptions: Speech and noise are statistically independent; Speech is not always present; Noise is more stationary than speech.
19 Minimum Statistics: Basic Principle. [Figure: periodogram (frequency bin k=25), smoothed periodogram (k=25), and minimum of smoothed periodogram, in dB versus frame index.]
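The basic principle can be sketched for a single frequency bin: smooth the periodogram recursively over frames, then track the minimum of the smoothed values in a sliding window of D frames. This is a simplified illustration with a fixed smoothing parameter and no bias compensation; the values `alpha=0.85` and `window=40` and the name `smoothed_minimum` are assumptions, not the published algorithm.

```python
import numpy as np

def smoothed_minimum(periodogram, alpha=0.85, window=40):
    """First-order recursive smoothing followed by a sliding-window minimum."""
    periodogram = np.asarray(periodogram, dtype=float)
    smoothed = np.empty_like(periodogram)
    minimum = np.empty_like(periodogram)
    p = periodogram[0]
    for lam, value in enumerate(periodogram):
        p = alpha * p + (1.0 - alpha) * value        # smoothed periodogram
        smoothed[lam] = p
        start = max(0, lam - window + 1)
        minimum[lam] = smoothed[start:lam + 1].min()  # minimum over the last D frames
    return smoothed, minimum

rng = np.random.default_rng(1)
noise_periodogram = rng.exponential(1.0, size=2000)  # stationary noise, true PSD = 1
smoothed, minimum = smoothed_minimum(noise_periodogram)
print(smoothed.mean(), minimum.mean())
```

On stationary noise the tracked minimum sits systematically below the true noise power, which is why a bias compensation factor is needed before the minimum can serve as a noise PSD estimate.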
20 Minimum Statistics: Bias. [Figure: probability density functions of the smoothed periodogram and of the minimum of D = 40 values, with the mean error indicated.]
21 Mean of Minimum. [Figure: $E\{\text{minimum}\}$ versus $D$ for several values of $Q_{eq}$ (labels include 4, 8, 32, 64, 128, 512).] $D$: length of minimum search window; $Q_{eq} = 1/\mathrm{var}\{P(\lambda, \Omega_k)\}$ (normalized).
22 Minimum Statistics: What's New? Minimum Statistics, version 1994: fixed smoothing parameter α; fixed bias compensation. Minimum Statistics, version 2001: signal-dependent optimal smoothing; signal-dependent bias compensation; fast minimum update.
23 Minimum Statistics (version 2001). [Figure: periodogram (frequency bin k=25), smoothed periodogram (k=25), and minimum of smoothed periodogram, in dB versus frame index.] Estimation of noise power spectral density without voice activity detection!
24 Relative Estimation Error (in parentheses: variance of estimation error). Speech pause: MinStat 1994 (α = 0.6): white noise (0.11), vehicular noise (0.13), street noise (0.21); MinStat: white noise (0.041), vehicular noise (0.041), street noise (0.13). Speech activity (3 without speech pauses): MinStat 1994 (α = 0.6): white noise 0.64 (0.77), vehicular noise 0.77 (1.04), street noise 0.59 (1.9); MinStat: white noise (0.14), vehicular noise 0.02 (0.17), street noise (0.28).
25 Two Channel Noise Reduction. [Block diagram: inputs $x_1(k)$ and $x_2(k)$; adaptive time delay estimation; preemphasis in each channel; filtered signals $y_{hppre1}(k)$ and $y_{hppre2}(k)$; deemphasis at the output.]
26 Coherence of Noise (Diffuse Sound Field). The complex coherence $\gamma_{x_1 x_2}(\Omega)$ of two signals $x_1(k)$ and $x_2(k)$ is defined as $\gamma_{x_1 x_2}(\Omega) = \frac{\Phi_{x_1 x_2}(e^{j\Omega})}{\sqrt{\Phi_{x_1 x_1}(e^{j\Omega})\, \Phi_{x_2 x_2}(e^{j\Omega})}}$. [Figure: $|\gamma_{x_1 x_2}(f)|^2$ versus $f$ up to 4000 Hz for microphone distances d = 10 cm, 20 cm, 40 cm, 60 cm.]
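The coherence definition translates directly into an estimator once the power spectral densities are replaced by averages over windowed frames. The following sketch assumes a Welch-style average with a Hann window and 50% overlap; these choices and the name `complex_coherence` are illustrative, not taken from the slides.

```python
import numpy as np

def complex_coherence(x1, x2, frame_len=256, hop=128):
    """Estimate Phi_12 / sqrt(Phi_11 * Phi_22) by averaging over frames."""
    window = np.hanning(frame_len)
    n_bins = frame_len // 2 + 1
    phi11 = np.zeros(n_bins)
    phi22 = np.zeros(n_bins)
    phi12 = np.zeros(n_bins, dtype=complex)
    n_frames = 1 + (len(x1) - frame_len) // hop
    for m in range(n_frames):
        seg = slice(m * hop, m * hop + frame_len)
        X1 = np.fft.rfft(window * x1[seg])
        X2 = np.fft.rfft(window * x2[seg])
        phi11 += np.abs(X1) ** 2        # auto power spectral density of x1
        phi22 += np.abs(X2) ** 2        # auto power spectral density of x2
        phi12 += X1 * np.conj(X2)       # cross power spectral density
    return phi12 / np.sqrt(phi11 * phi22)

rng = np.random.default_rng(0)
a = rng.standard_normal(16384)
b = rng.standard_normal(16384)
gamma_same = complex_coherence(a, a)    # identical signals: magnitude 1
gamma_indep = complex_coherence(a, b)   # independent noise: magnitude near 0
```

The estimate for independent signals is not exactly zero; its residual level shrinks as more frames are averaged.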
27 Coherence of Speech in a Car. [Figure: power spectral density (dB) versus f/Hz, and coherence $\gamma_{x_1 x_2}(f)$ versus f/Hz.]
28 Two Channel Noise Reduction. [Diagram: 2 microphones, $d_{mic} = 0.4$ m, receiving $s_1 + n_1$ and $s_2 + n_2$; noise reduction; prompt memory; output $\hat{s}$.]
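29 First-Order Differential Microphone. [Block diagram labels: speech, delay, equalization.] $Y(j\omega) = S(j\omega)\, e^{j\omega \frac{d}{2c} \cos(\alpha)} \left[ 1 - e^{-j\omega \frac{d}{c} \left( \cos(\alpha) + \frac{cT}{d} \right)} \right]$, $\left| \frac{Y(j\omega)}{S(j\omega)} \right| = 2 \left| \sin\left( \frac{\omega d}{2c} \left( \cos(\alpha) + \frac{cT}{d} \right) \right) \right|$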
30 Directivity Patterns (d m, f = 1 kHz). [Figures: polar directivity patterns at f = 1000 Hz, 5 dB per division, versus azimuth angle in degrees, for: dipole (Tc/d = 0); hypercardioid (Tc/d = 0.34); supercardioid (Tc/d = 0.57); cardioid (Tc/d = 1).]
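The four patterns follow from the magnitude response of a first-order differential microphone, $2\,|\sin(\frac{\omega d}{2c}(\cos\alpha + \frac{cT}{d}))|$, evaluated for different values of Tc/d. The sketch below evaluates that expression; the microphone spacing `d=0.02` m and the speed of sound `c=343` m/s are assumed values (the spacing used for the slide is not recoverable), and `differential_response` is a hypothetical name.

```python
import numpy as np

def differential_response(alpha_deg, tc_over_d, d=0.02, f=1000.0, c=343.0):
    """|Y/S| = 2 |sin(omega d / (2c) * (cos(alpha) + cT/d))|."""
    alpha = np.deg2rad(alpha_deg)
    omega = 2.0 * np.pi * f
    return 2.0 * np.abs(np.sin(omega * d / (2.0 * c) * (np.cos(alpha) + tc_over_d)))

def pattern_db(alpha_deg, tc_over_d):
    """Response relative to the look direction (alpha = 0), in dB."""
    ratio = differential_response(alpha_deg, tc_over_d) / differential_response(0.0, tc_over_d)
    return 20.0 * np.log10(np.maximum(ratio, 1e-12))  # floor avoids log(0) at the nulls

angles = np.arange(0, 360, 15)
patterns = {
    "dipole": pattern_db(angles, 0.0),
    "hypercardioid": pattern_db(angles, 0.34),
    "supercardioid": pattern_db(angles, 0.57),
    "cardioid": pattern_db(angles, 1.0),
}
```

The nulls fall where $\cos\alpha = -Tc/d$: at 90 degrees for the dipole and at 180 degrees for the cardioid.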
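31 Delay-and-Sum Beamformer. [Block diagram: source $s(k)$ at angle $\theta$; noise $n_l(k)$; microphone signals $y_1(k), y_2(k), y_3(k), \ldots, y_N(k)$; delays $T_1, T_2, T_3, \ldots, T_N$; aligned signals $\tilde{y}_1(k), \ldots, \tilde{y}_N(k)$; output $\hat{y}(k)$.] i.i.d. noise: gain $G = 10 \log(N)$.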
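The $10 \log(N)$ gain for i.i.d. noise can be checked with a small simulation. This sketch assumes integer sample delays, unit-variance white noise at every microphone, and averaging (rather than plain summing) of the aligned channels; the name `delay_and_sum` and all numeric values are illustrative.

```python
import numpy as np

def delay_and_sum(mic_signals, delays):
    """Compensate each channel's integer delay (in samples) and average."""
    n_mics, n_samples = mic_signals.shape
    usable = n_samples - max(delays)
    aligned = [mic_signals[m, delays[m]:delays[m] + usable] for m in range(n_mics)]
    return np.mean(aligned, axis=0)

rng = np.random.default_rng(0)
n_mics, n_samples = 4, 20000
delays = [0, 2, 4, 6]                       # propagation delays in samples
source = rng.standard_normal(n_samples)
noise = rng.standard_normal((n_mics, n_samples))

mics = np.empty((n_mics, n_samples))
for m, tau in enumerate(delays):
    delayed = np.concatenate([np.zeros(tau), source])[:n_samples]
    mics[m] = delayed + noise[m]            # delayed source plus independent noise

output = delay_and_sum(mics, delays)
residual = output - source[:len(output)]    # what is left of the noise
gain_db = 10.0 * np.log10(np.mean(noise[0] ** 2) / np.mean(residual ** 2))
print(gain_db, 10.0 * np.log10(n_mics))
```

The source adds coherently after alignment while the independent noise terms average out, so the measured gain should land close to $10 \log_{10}(4)$, about 6 dB.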
32 Design of Fixed Beamformers with MATLAB
33 Directivity Pattern
34 Arrays for Speech Acquisition in Cars. [Figure: microphone positions in the X-Y plane, with spacings of 4 cm, 4 cm, 5 cm, 5.25 cm.] Microphones 1, 2, 3, 4, 5: linear array; microphones 1, 2, 7, 4, 5: planar array [Martin et al. 2001].
35 Delay-and-sum vs. Superdirective Arrays. [Figure: gain (dB) versus frequency (Hz); curves: superdirective, delay-and-sum.]
36 Linear and Planar Microphone Arrays. [Figure: gain (dB) versus frequency (Hz); curves: planar, superdirective; linear, superdirective.]
37 Adaptive Beamformer (GSC). [Block diagram: microphone inputs x; fixed beamformer; blocking matrix; delay; multichannel adaptive noise cancellers.]
38 Conclusions. Find better ways to exploit statistics of signals! Incorporate models of speech production. Develop better background noise estimation methods. Design algorithms for high quality and intelligibility. Exploit spatial selectivity using multiple microphones. Understand processing in the auditory system: Enhance perceptually important features; Use perceptive models to reduce complexity of algorithms.
39 Selected References
More informationNoise Robust Isolated Words Recognition Problem Solving Based on Simultaneous Perturbation Stochastic Approximation Algorithm
EngOpt 2008 - International Conference on Engineering Optimization Rio de Janeiro, Brazil, 0-05 June 2008. Noise Robust Isolated Words Recognition Problem Solving Based on Simultaneous Perturbation Stochastic
More information2-D SENSOR POSITION PERTURBATION ANALYSIS: EQUIVALENCE TO AWGN ON ARRAY OUTPUTS. Volkan Cevher, James H. McClellan
2-D SENSOR POSITION PERTURBATION ANALYSIS: EQUIVALENCE TO AWGN ON ARRAY OUTPUTS Volkan Cevher, James H McClellan Georgia Institute of Technology Atlanta, GA 30332-0250 cevher@ieeeorg, jimmcclellan@ecegatechedu
More informationL29: Fourier analysis
L29: Fourier analysis Introduction The discrete Fourier Transform (DFT) The DFT matrix The Fast Fourier Transform (FFT) The Short-time Fourier Transform (STFT) Fourier Descriptors CSCE 666 Pattern Analysis
More informationEfficient Target Activity Detection Based on Recurrent Neural Networks
Efficient Target Activity Detection Based on Recurrent Neural Networks D. Gerber, S. Meier, and W. Kellermann Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) Motivation Target φ tar oise 1 / 15
More informationVideo Coding with Motion Compensation for Groups of Pictures
International Conference on Image Processing 22 Video Coding with Motion Compensation for Groups of Pictures Markus Flierl Telecommunications Laboratory University of Erlangen-Nuremberg mflierl@stanford.edu
More informationSpherical Waves, Radiator Groups
Waves, Radiator Groups ELEC-E5610 Acoustics and the Physics of Sound, Lecture 10 Archontis Politis Department of Signal Processing and Acoustics Aalto University School of Electrical Engineering November
More informationPitch Estimation and Tracking with Harmonic Emphasis On The Acoustic Spectrum
Downloaded from vbn.aau.dk on: marts 31, 2019 Aalborg Universitet Pitch Estimation and Tracking with Harmonic Emphasis On The Acoustic Spectrum Karimian-Azari, Sam; Mohammadiha, Nasser; Jensen, Jesper
More informationX b s t w t t dt b E ( ) t dt
Consider the following correlator receiver architecture: T dt X si () t S xt () * () t Wt () T dt X Suppose s (t) is sent, then * () t t T T T X s t w t t dt E t t dt w t dt E W t t T T T X s t w t t dt
More informationCoherentDetectionof OFDM
Telematics Lab IITK p. 1/50 CoherentDetectionof OFDM Indo-UK Advanced Technology Centre Supported by DST-EPSRC K Vasudevan Associate Professor vasu@iitk.ac.in Telematics Lab Department of EE Indian Institute
More informationMusical noise reduction in time-frequency-binary-masking-based blind source separation systems
Musical noise reduction in time-frequency-binary-masing-based blind source separation systems, 3, a J. Čermá, 1 S. Arai, 1. Sawada and 1 S. Maino 1 Communication Science Laboratories, Corporation, Kyoto,
More informationInteractions of Information Theory and Estimation in Single- and Multi-user Communications
Interactions of Information Theory and Estimation in Single- and Multi-user Communications Dongning Guo Department of Electrical Engineering Princeton University March 8, 2004 p 1 Dongning Guo Communications
More informationParameter Estimation
1 / 44 Parameter Estimation Saravanan Vijayakumaran sarva@ee.iitb.ac.in Department of Electrical Engineering Indian Institute of Technology Bombay October 25, 2012 Motivation System Model used to Derive
More informationFrequency estimation by DFT interpolation: A comparison of methods
Frequency estimation by DFT interpolation: A comparison of methods Bernd Bischl, Uwe Ligges, Claus Weihs March 5, 009 Abstract This article comments on a frequency estimator which was proposed by [6] and
More information