Digital Speech Processing Lecture 10. Short-Time Fourier Analysis Methods - Filter Bank Design

Similar documents
Multirate signal processing

convenient means to determine response to a sum of clear evidence of signal properties that are obscured in the original signal

Filter Banks II. Prof. Dr.-Ing. G. Schuller. Fraunhofer IDMT & Ilmenau University of Technology Ilmenau, Germany

Quadrature-Mirror Filter Bank

Subband Coding and Wavelets. National Chiao Tung University Chun-Jen Tsai 12/04/2014

L6: Short-time Fourier analysis and synthesis

Basic Multi-rate Operations: Decimation and Interpolation

Filter Banks II. Prof. Dr.-Ing Gerald Schuller. Fraunhofer IDMT & Ilmenau Technical University Ilmenau, Germany

Elec4621 Advanced Digital Signal Processing Chapter 11: Time-Frequency Analysis

MULTIRATE DIGITAL SIGNAL PROCESSING

Digital Image Processing

Wavelets and Multiresolution Processing

INTRODUCTION TO. Adapted from CS474/674 Prof. George Bebis Department of Computer Science & Engineering University of Nevada (UNR)

Module 4 MULTI- RESOLUTION ANALYSIS. Version 2 ECE IIT, Kharagpur

OLA and FBS Duality Review

LAB 6: FIR Filter Design Summer 2011

ECE472/572 - Lecture 13. Roadmap. Questions. Wavelets and Multiresolution Processing 11/15/11

representation of speech

Filter structures ELEC-E5410

Digital Image Processing Lectures 15 & 16

Course and Wavelets and Filter Banks. Filter Banks (contd.): perfect reconstruction; halfband filters and possible factorizations.

Chirp Transform for FFT

ELEN E4810: Digital Signal Processing Topic 11: Continuous Signals. 1. Sampling and Reconstruction 2. Quantization

Digital Signal Processing

Stability Condition in Terms of the Pole Locations

Multiresolution image processing

! Introduction. ! Discrete Time Signals & Systems. ! Z-Transform. ! Inverse Z-Transform. ! Sampling of Continuous Time Signals

Multimedia Signals and Systems - Audio and Video. Signal, Image, Video Processing Review-Introduction, MP3 and MPEG2

Review of Fundamentals of Digital Signal Processing

Timbral, Scale, Pitch modifications

OLA and FBS Duality Review

Perfect Reconstruction Two- Channel FIR Filter Banks

Multirate Digital Signal Processing

1. Calculation of the DFT

The basic structure of the L-channel QMF bank is shown below

Discrete Fourier Transform

UNIVERSITY OF OSLO. Faculty of mathematics and natural sciences. Forslag til fasit, versjon-01: Problem 1 Signals and systems.

Filter Design Problem

Chapter 7: Filter Design 7.1 Practical Filter Terminology

Filter Analysis and Design

Discrete-Time Fourier Transform

DISCRETE-TIME SIGNAL PROCESSING

Computer Engineering 4TL4: Digital Signal Processing

! Spectral Analysis with DFT. ! Windowing. ! Effect of zero-padding. ! Time-dependent Fourier transform. " Aka short-time Fourier transform

Filter Banks with Variable System Delay. Georgia Institute of Technology. Atlanta, GA Abstract

Wavelet Transform. Figure 1: Non stationary signal f(t) = sin(100 t 2 ).

Multiscale Image Transforms

DHANALAKSHMI COLLEGE OF ENGINEERING DEPARTMENT OF ELECTRICAL AND ELECTRONICS ENGINEERING EC2314- DIGITAL SIGNAL PROCESSING UNIT I INTRODUCTION PART A

Introduction to Digital Signal Processing

1 1.27z z 2. 1 z H 2

EE123 Digital Signal Processing

Module 4. Multi-Resolution Analysis. Version 2 ECE IIT, Kharagpur

Class of waveform coders can be represented in this manner

Digital Signal Processing

Fourier analysis of discrete-time signals. (Lathi Chapt. 10 and these slides)

Voiced Speech. Unvoiced Speech

Lecture 16: Multiresolution Image Analysis

Introduction to Wavelet. Based on A. Mukherjee s lecture notes

Chapter 5 Frequency Domain Analysis of Systems

Question Paper Code : AEC11T02

Fourier Series Representation of

Multiresolution schemes

Contents. Digital Signal Processing, Part II: Power Spectrum Estimation

Chapter 7 Wavelets and Multiresolution Processing. Subband coding Quadrature mirror filtering Pyramid image processing

EE301 Signals and Systems In-Class Exam Exam 3 Thursday, Apr. 19, Cover Sheet

Discrete Time Signals and Systems Time-frequency Analysis. Gloria Menegaz

Ch.11 The Discrete-Time Fourier Transform (DTFT)

IT DIGITAL SIGNAL PROCESSING (2013 regulation) UNIT-1 SIGNALS AND SYSTEMS PART-A

Multiresolution schemes

Continuous Fourier transform of a Gaussian Function

E : Lecture 1 Introduction

INTRODUCTION TO DELTA-SIGMA ADCS

Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science. Fall Solutions for Problem Set 2

Chapter 8 The Discrete Fourier Transform

The Discrete Fourier Transform

Objectives of Image Coding

ECG782: Multidimensional Digital Signal Processing

6.003: Signals and Systems. Sampling and Quantization

ECG782: Multidimensional Digital Signal Processing

MASSACHUSETTS INSTITUTE OF TECHNOLOGY Department of Electrical Engineering and Computer Science Discrete-Time Signal Processing Fall 2005

Review of Fundamentals of Digital Signal Processing

Digital Signal Processing

SC434L_DVCC-Tutorial 6 Subband Video Coding

ECE503: Digital Signal Processing Lecture 5

Periodic (Uniform) Sampling ELEC364 & ELEC442

Computer-Aided Design of Digital Filters. Digital Filters. Digital Filters. Digital Filters. Design of Equiripple Linear-Phase FIR Filters

Ch. 15 Wavelet-Based Compression

How to manipulate Frequencies in Discrete-time Domain? Two Main Approaches

Introduction to time-frequency analysis Centre for Doctoral Training in Healthcare Innovation

Grades will be determined by the correctness of your answers (explanations are not required).

HST.582J / 6.555J / J Biomedical Signal and Image Processing Spring 2007

3.2 Complex Sinusoids and Frequency Response of LTI Systems

ECSE 512 Digital Signal Processing I Fall 2010 FINAL EXAMINATION

Linear Convolution Using FFT

SEISMIC WAVE PROPAGATION. Lecture 2: Fourier Analysis

Introduction p. 1 Compression Techniques p. 3 Lossless Compression p. 4 Lossy Compression p. 5 Measures of Performance p. 5 Modeling and Coding p.

Chap 2. Discrete-Time Signals and Systems

Basic Design Approaches

Chapter 2: The Fourier Transform

Digital Signal Processing:

Transcription:

Digital Speech Processing Lecture Short-Time Fourier Analysis Methods - Filter Bank Design

Review of STFT j j ˆ m ˆ. X e x[ mw ] [ nˆ m] e nˆ function of nˆ looks like a time sequence function of ˆ looks like a transform j ˆ X e defined for nˆ 3,,,...; ˆ π nˆ m j ˆ. Interpretations of X e nˆ j. nˆ fixed, ˆ variable; X ˆ nˆ e DTFT x[ mw ] [ n m] DFT View OLA implementation n nˆ ˆ X e x n e w n j ˆ j ˆ n. variable, fixed; n [ ] [ ] Linear Filtering view filter bank implementation FBS implementation

Review of STFT 3. Sampling Rates in Time and Frequency ˆ j ˆ recover xn [ ] exactly from Xnˆ e j. time: We has bandwidth of BHertz B samples/sec rate FS Hamming Window: B Hz sample at L 4F S Hz or every L/4 samples L. frequency: wn [ ] is time limited to Lsamples need at least L frequency samples to avoid time aliasing 3

Review of STFT with OLA method can recover xn exactly using lower sampling rates in either time or frequency, e.g., can sample every L samples and divide by window, or can use fewer than L frequency samples filter bank channels; but these methods are highly subject to aliasing errors with any modifications to STFT can use windows LPF that are longer than L samples and still recover with N < L frequency channels; e.g., ideal LPF is infinite in time duration, but with zeros spaced N samples apart where F / N is the BW of the ideal LPF S 4

Review of STFT H e j Xe j H e j + Ye j H N- e j h [ n] w[ n] e k N % j j He H e k N hn %[ ] h[ n] δ [ n] wnpn [ ] [ ] k k j n pn [ ] N δ [ n rn] r hn %[ ] N wrn [ ] δ [ n rn] w[ ] + wn [ ] +... r k k need to design digital filters that match criteria for exact reconstruction of xn [ ] and which still work with modifications to STFT 5

Tree-Decimated Filter Banks 6

Tree-Decimated Filter Banks can sample STFT in time and frequency using lowpass filter window which is moved in jumps of R<L samples if L N where: L is the window length, R is the window shift, and N is the number of frequency channels for a given channel at k, the sampling rate of the STFT need only be twice the bandwidth of the window Fourier transform down-sample STFT estimates by a factor of R at the transmitter up-sample back to original sampling rate at the receiver final output formed by convolution of the up-sampled STFT with an appropriate lowpass filter, f [n] 7

Filter Bank Channels Fully decimated and interpolated filter bank channels; a analysis with bandpass filter, downshifting frequency and down-sampling; b synthesis with up-sampler followed by lowpass interpolation filter and frequency up-shift; c analysis with frequency down-shift followed by lowpass filter followed by down-sampling; d synthesis with up-sampling followed by frequency up-shift followed by bandpass filter. 8

Full Implementation of Analysis-Synthesis System Full implementation of STFT analysis/synthesis system with channel decimation by a factor of R, and channel interpolation by the same factor R including a box for short-time modifications. 9

Review of Down and Up-Sampling x[n] x [n] xd [ n] x[ nr] R j R j πr/ R r X e X e d aliasing addition R xn [ / R] n, ±, ±,... xu[ n] otherwise j jr X e X e u imaging removal

Filter Bank Reconstruction assuming no short time modifications k X k X e w rr m x m e rr N k yn PkY rr e e N k N Pk [ XrR[ kfn ] [ rr] ] e N k rr j j m m W rr j j n xm wrr mfn rr Pke m WrR r N k recognizing that when Pk, k N j n m N k e δ n m qn k q k j n k k N j n m can show that the condition for yn xn, nis r wrr n+ qnfn rr δ q m n k

Filter Bank Reconstruction frequency domain equations, assuming: jkn jkn h [ n] w[ n] e ; g [ n] f[ n] e ; k k k jk j k jk j k ; ; H e W e G e F e k R l jk N j Ye PkG k e N R j Xe RN π π j l j l R R Hk e X e N k k jk j PkG e H e + R π N π j l j j R l k R Xe PkGk e Hk e l RN k j j He % Xe + aliasing terms k k

Filter Bank Reconstruction conditions for perfect reconstruction ˆ j j Ye Xe N j j k j He % PkG k e Hk e RN k and RN N k π j l R jk PkG e H e for l,,..., R k Flat Gain k Alias Cancellation 3

Filter Bank Reconstruction 4

Decimated Analysis- Synthesis Filter Bank practical solution for the case wn fn, N R -band solution where the conditions for exact reconstruction become 4 and j j j π j π P F e W e + P F e W e j j π j π j π P F e W e 4 + P F e W e j j aliasing cancels exactly if P P with F e W e giving We We e 4 yn ˆ xn M j j π jm 5

Decimated Analysis- Synthesis Filter Bank when aliasing is completely eliminated by suitable choice of filters, the overall analysis-synthesis system has frequency response: where N j j k PkW [ ] e e RN k He % j j j We e W e F e h% [ n] w [ n] p[ n] where e w [ n] w[ n] f[ n] e N jk pn [ ] Pk [ ] e RN n k 6

Decimated Analysis- Synthesis Filter Bank To design a decimated analysis/synthesis filter bank system, need to determine the following:. the number of channels, N, and the decimation/interpolation ratio R. Usually determined by the desired frequency resolution. the window, w[n], and the interpolation filter impulse response, f [n]. These are lowpass filters that should have good frequency-selective properties such that the bandpass channel responses do not overlap into more than one band on either side of the channel. 3. the complex gain factors, P[k], k,,,n-. These constants are important for achieving flat overall response and alias cancellation. 7

Maximally Decimated Filter Banks we show here that it is possible to obtain exact reconstruction with RN and N<L. RN is termed maximal decimation because this is the largest decimation factor that can be used with an N- channel filter bank and still achieve exact reconstruction. nominal bandwidth of the bandpass channel signal is increased from ±π/n to ±π down-sampling by N accomplishes frequency downshifting normally accomplished by modulation 8

Maximally Decimated Filter Banks 9

Analysis-Synthesis Filter Bank all modulators removed

Two Channel Filter Banks

Two Channel Filter Banks For this two-band system, we have: R N ; πk, k, k Applying the frequency domain analysis we get: j j j j j j Ye G e H e + G e H e Xe j j π j j π j π + G e H e + G e H e X e not including multiplier / N or the gain factors P[], P[] Aliasing can be eliminated completely by choosing the filters such that: j j π j j π G e H e + G e H e aliasing cancelation Perfect reconstruction requires: j j j j + flat gain condition G e H e G e H e

Quadrature Mirror Filter QMF Banks Alias cancellation achieved if the filters are chosen to satisfy the conditions: h n e h n H e H e jπn j j π [ ] [ ] j g [ n] h [ n] G e H e j g n h n G e H e j j π [ ] [ ] Note that there is only a single lowpass filter, with impulse response h [ ] n on which all other filters are based; filter has nominal cutoff of π / rad with a narrow transition to a stopband with adequate attenuation to isolate the two bands The filter h[ n] e h [ n] is the complementary highpass filter jπ n The filters h [ n] and h[ n] are called Quadrature Mirror Filters QMF 3

Quadrature Mirror Filter QMF Banks Substituting for the filter relationships we get: H e H e H e H e j j π j π j π i.e., perfect alias cancellation for any choice of lowpass filter j j j π Y e H e H e j % j j X e H e X e For perfect reconstruction we require that: j j π jm H e H e e i.e., flat magnitude with delay of M samples, due to the delay of the causal filters of the filter bank 4

Quadrature Mirror Filters Assume FIR lowpass filter of length L samples such that h [ n] h [ L n], n L ; this filter has linear phase with a delay of L / samples and a frequency response of the form: H e A e j j j L / e L j where A e is real. Using this lowpass filter in the must be odd filter bank, the condition for perfect reconstruction becomes: He % A e e A e e If L is odd, we get: j j jπ L j π j L j j j j L H% π e A e A e + e linear phase is guaranteed, but not perfect reconstruction 5

Quadrature Mirror Filters For flat gain we need to satisfy the condition: j j A e A e π + Solution is to use high order linear phase QMF filters Johnston proposed an algorithm for design of the basic lowpass filter that minimized the deviation from unity while providing a desired level of stopband attenuation 6

Quadrature Mirror Filters π h[ n] w[ n] g[ n] j n h[ n] w[ n] e g [ n] 7

Quadrature Mirror Filters Lowpass filter designed by windowing Highpass filter is mirror image of lowpass filter practical realization of QMF filters note that aliasing is cancelled, but the overall frequency response is not perfectly flat 8

Quadrature Mirror Filters 9

Issues with Perfect Reconstruction The motivation for the subband approach is to process different speech properties independently in each band and thus being able to localize the band compact bandwidth is important. Also,based on the original motivation of short-time analysis, temporal focus within a limited time duration is also important. Filter bank design is thus result of a trade-off between perfect reconstruction and bandwidth concentration in a joint criterion: J α π s j j W e d + α T e d Stop-band leakage Deviation from PR π When processing e.g., quantization, a non-linear operation is involved, harmonic distortions are inevitable and perfect reconstruction cannot be achieved, as seen in the simple example. There exists a design procedure Smith-Barnwell for QMF filters that attempts to meet the design constraints as best as possible 3

3. Start with a sequence of length, say, L- L even that satisfies that is, even samples, except at n., the value of the sequence at n, needs to be large enough to satisfy the positivity condition see below.. Factor such that all roots of are inside the unit circle. 3. Positivity condition: 4. The mirror channel: 5. To eliminate aliasing: Smith-Barnwell QMF Design { } z W z W n g z G Z C ; n C n g n g n δ + W z > j j j j e W e W e W e G, L N e e W e W N j j j or equivalently n N w n w z z W z W n N or equivalently n w n f z W z F n or equivalently n w n f z W z F n and z W z F z W z F

Smith-Barnwell QMF Design - Example L 8, N L 7 G z { g n } W z W Z z Length of w n L, then g n has length L choose sin πn / g n, n, ±, ±, L, ± 7 πn g n [-.455.6 Factorize G z W z W z w n [.89 n w n w N n [-.46 f.383 at n.346 -.8...637,. is added -.33 -.6 to -.3 -. -. satisfy th -.6.637 e positivity.59.8.. condition.383 W z :length L with all roots inside the unit circle.59.3 -.33 -.346 -.455] -.46].89 n f n w n [-.46.8.59 -.3 -.33.346.89 ] n n w n [.89 -.346 -.33.3.59 -.8 -.46] all arrays start with n -] 3

QMF Frequency Responses Example Magnitude db Phase degree, x 5-5 - -5 - - -4-6 -8 - - Normalized frequency xπ 33

Tree-Structured Filter Banks 34

Tree-Structured Filter Banks 35

Sampling Pattern for Tree- Structured Filter Banks 36

Tree-Structured Filter Banks 37

Tree-Structured QMF Filterbank 38

Quantized Channel Signals δ [n] h [ ] n e [ n ] g [ n] δ [ n ] yˆ [ n] yˆ [ n] δ [ n ] h [ ] g [ n ] n e [ n ] δ [n] yˆ n [ ] Quantization noise is concentrated in the band that it is generated in. j j + j e e σ σ P e G e G e ˆ y 39

Subband Coding advantages of subband coding each subband can be encoded according to the perceptual criterion appropriate to that band quantization noise can be confined to the band that produces it low energy bands can be encoded so as to produce less noise applicable to coding in the 9.6-6 Kbps range 4

Practical Example of Subband Coder subband coder using tree-structured filer bank Crochiere, 98 4

Subband Coder Subjective Quality 4

Subband Coder Waveforms 43

Concept of Time-Frequency Resolution When interested in the changing characteristics of the signal, short time Fourier transform or short time spectral analysis is used. j F, t f τ g t τ e τ dτ Temporal spread : σ t gt is the window function providing focus in time. t t c g t dt g t Frequency spread : σ c G d G d If g t is Gaussian, it achieves the minimum time - bandwidth product σ t σ.5. Continuous prolate spheroidal wave function: a bandlimited signal that have the highest energy concentration in a specified time interval. dt To maintain a similar level of uncertainty, the window function should be short to provide time resolution for high frequency components and long to provide frequency resolution for low frequency components. 44

Wavelet-Based Methods Analysis Filters Subband method Synthesis Filters Band R Q Q - R Band Band R Q Q - R Band........ Band N R Q Q - R Band N Analysis Filters Wavelet Transform Wavelet method R Q Q - R R Q Q - R R Q Q - R Synthesis Filters Inverse Transform What is in the Wavelet Transform block? What is the difference? 45

Wavelet Transform Built on set of expansion or basis functions that are derived from a mother wavelet through scaling and translation: t τ Let ψ t be a mother wavelet : ψ τ t ψ Continuous Wavelet Transform: * * t τ * τ F w a, τ f t ψ a, τ t dt f t ψ dt f τ * ψ a a a a * τ F w a, τ f at ψ t dt stretch or compress the signal for analysis a a a, τ scalogram; F, τ spectrogram Let Ψ F{ ψ t} and Ψ a, τ F{ ψ a, τ t} Inverse transform: Admissibility condition: f t F w a, Fw a, τ ψ a, τ t Cψ a Ψ C ψ d < a dadτ a Ψ 46

47 Wavelet and Fourier Transform a f a dt a t t f a a F w τ ψ τ τ ψ τ * * *, } { } {,, t t a a τ τ ψ ψ F and F Ψ Ψ } { t f F F τ τ τ τ ψ ψ j a a e a a a t a t Ψ Ψ,, *,, * τ ψ τ τ a af a f a a F a W w Ψ F F transform. the wavelet is the frequency domain representation of, a W

Wavelet and Its Spectrum Ψt σ Ψ t / a aσ, < a < / a σ a σ,< a / a / a 48

Scaling and Translation For discrete wavelet transform, a m m a and τ nτ a ψ a, τ m n τ If a ψ and τ m/ m m, n t ψ t and, m, n are integers m/ m t ψ, t a ψ a t n m, n Z n m, n Z If the set {, t} is complete, it is called affine wavelets. F ψ m n Orthogonality condition : ψmn t ψ pq t dt δ m p δ n q m/ m w a, τ wm, n a f t ψ a t nτ dt f w m t t, nψ m, n m n *,, dyadic or octave sampling The orthogonality condition may not be easy to satisfy for a general set of wavelets; but it is still possible to invert discrete wavelet transform, sometimes. 49

STFT and CWT 3 5 4 8 frequency time frequency time Constant bandwidth Constant-Q bandwidth 5

Discrete Wavelet Transforms 5

Other Simpler Transforms or Filter Banks Short-time analysis can also be viewed as data transformation, executed in a block-by-block manner, using for example: Discrete Cosine Transform Discrete Sine Transform Filter bank based on DFT/FFT General remarks: Discussion so far aims at a filter bank framework that allows perfect reconstruction i.e., no loss of information in the original signal if desired; But many speech processing systems do not require perfect reconstruction; rather, one may want to focus on extraction of key parameters or features from the speech signal; Therefore, other analysis methods e.g., transformations that do not have inverse may still apply. 5

DFT and DCT N-point DFT Discrete Fourier Transform X k N n x n e jπ / N kn N ~ j N k j N kn x π / π / n X e e x n + rn N k r implies Possible excessive discontinuity at edges N-point symmetric N-point DFT N-point DCT Relates to real part of N-point DFT when the signal is symmetric X k N n x n e jπ / N kn and x n xn n X k N n x ncosπkn / N But, be careful about the point of symmetry!! 53

This image cannot currently be displayed. Discrete Cosine Transforms Extensively used in audio, image and video N knπ c x + x N + x ncos n N k DCT-I k DCT-II c k c kπ N x n cos n + n N k Most common DCT; equivalent up to a scale factor of to a DFT of 4N real inputs of even symmetry with the even-indexed elements set to zero; i.e., an artificial shift of half sample in the original sampling setup. DCT-III IDCT N nπ x + x n cos k + n N even around x and odd around xn ; c k is even around k and odd around k N. Still more variants. 54

Discrete Sine Transform S [ ] N s ij i, j -D case; eliminate one dimension for -D s ij j + i + π sin, i, j,,, L, N N + N + Relates to imaginary part of ~N-point DFT 3-point sequence 8-point antisymmetric sequence Sequence corresponds To 8-point DFT 55

56 Discrete Time Fourier Transform - Revisited The DTFT discrete time Fourier transform of an N- point sequence is Sample the DTFT at The result is the DFT Discrete Fourier Transform If we compute the inverse DFT, we obtain k π/nk, k,,k, N. continuous. Frequency is N n n j j e n x e X + r N k kn N j k N j rn n x e e X N n x ~ / / π π / / k X e n x e X N n kn N j N k j π π

DFT for Power Spectral Density Estimation X e j N n x n e jn P k S X e j : k th channel processing weighting function Examples of non-uniform filter banks S k S P, k k Perfect reconstruction may not be a requirement. 57