Audio Features. Fourier Transform. Short Time Fourier Transform.

Advanced Course Computer Science Music Processing Summer Term 2009 Meinard Müller Saarland University and MPI Informatik meinard@mpi-inf.mpg.

Audio Features

Fourier Transform

The Fourier transform tells which notes (frequencies) are played, but it does not tell when the notes are played. Frequency information is averaged over the entire time interval. Time information is hidden in the phase.

Short Time Fourier Transform

Windowed Fourier Transform (WFT), also known as Short Time Fourier Transform (STFT) (Dennis Gabor, 1946).

Idea: To recover time information, only a small section of the signal is used for the spectral analysis. This section is determined by a window function $g$ ($g \in L^2(\mathbb{R})$, $\|g\|_2 \neq 0$).

Definition: The STFT w.r.t. $g$ of a signal $f$ is

$$\tilde{f}_g(t,\omega) = \int_{u \in \mathbb{R}} f(u)\,\overline{g(u-t)}\,e^{-2\pi i \omega u}\,du = \langle f \mid g_{t,\omega} \rangle \quad \text{with} \quad g_{t,\omega}(u) = e^{2\pi i \omega u}\,g(u-t).$$

Interpretation: $g_{t,\omega}$ represents a musical note of frequency $\omega$ which oscillates within the translated window given by $u \mapsto g(u-t)$. The inner product $\langle f \mid g_{t,\omega} \rangle$ measures the correlation between the signal $f$ and the musical note $g_{t,\omega}$.

Window functions:
Box window: discontinuities at the window boundaries cause artefacts in the frequency domain.
Triangle window.
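The interpretation of the STFT as an inner product with a windowed note can be tried out numerically. The following is a minimal sketch, not the course's implementation: the helper names `hann` and `stft_coefficient`, the sampling rate, and the test signal are all assumptions made for illustration, and the window is taken to start at sample `t` rather than being centered there.

```python
import numpy as np

def hann(n):
    """Periodic Hann window of n samples (hypothetical helper)."""
    return 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)

def stft_coefficient(f, g, t, omega, sr):
    """Inner product of the signal f with a 'note' of frequency omega (Hz)
    that oscillates inside the window g shifted to start at sample t."""
    u = np.arange(t, t + len(g))                      # samples covered by the window
    note = g * np.exp(2j * np.pi * omega * u / sr)    # windowed complex sinusoid
    return np.sum(f[u] * np.conj(note))

sr = 1000                                             # assumed sampling rate in Hz
time = np.arange(2 * sr) / sr
# Assumed test signal: 50 Hz during the first second, 120 Hz during the second.
f = np.where(time < 1.0,
             np.sin(2 * np.pi * 50 * time),
             np.sin(2 * np.pi * 120 * time))
g = hann(200)                                         # window of 0.2 seconds

early_50 = abs(stft_coefficient(f, g, 100, 50.0, sr))    # 50 Hz note, first second
late_50 = abs(stft_coefficient(f, g, 1500, 50.0, sr))    # 50 Hz note, second second
late_120 = abs(stft_coefficient(f, g, 1500, 120.0, sr))  # 120 Hz note, second second
print(early_50, late_50, late_120)
```

The 50 Hz note correlates strongly with the signal only while 50 Hz is actually sounding, which is the time information that the plain Fourier transform hides in the phase.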

Hann window.

[Figure: chirp signal and STFT with box window]
[Figure: chirp signal and STFT with Hann window of length 0.05]

Time-Frequency Localization

The size of the window constitutes a compromise between time resolution and frequency resolution:
Large window: poor time resolution, good frequency resolution.
Small window: good time resolution, poor frequency resolution.

Heisenberg Uncertainty Principle: there is no window function that localizes in time and frequency with arbitrary precision.

[Figures: signal and STFT with Hann windows of different lengths, one of them of length 0.02]
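The compromise can be made concrete for a sampled signal. The sketch below is only an illustration of the trade-off, not a statement of the uncertainty bound itself; the function name `resolutions` and the sampling rate are assumptions. It relates the duration of a window to the spacing of the DFT frequency bins computed from it.

```python
def resolutions(window_length, sr):
    """Return (window duration in seconds, DFT bin spacing in Hz)
    for a window of `window_length` samples at sampling rate `sr`."""
    return window_length / sr, sr / window_length

sr = 22050  # assumed sampling rate in Hz, chosen for illustration only
table = {n: resolutions(n, sr) for n in (256, 1024, 4096)}
for n, (duration, spacing) in table.items():
    print(f"N={n:5d}  duration {duration * 1000:7.2f} ms  bin spacing {spacing:7.2f} Hz")
```

Whatever window length is chosen, the product of duration and bin spacing stays at 1, so a shorter window (better time localization) always comes with coarser frequency bins, and vice versa.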

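Heisenberg Uncertainty Principle

Information cells: a window function with a given center and width, in time and in frequency, defines an information cell IC in the time-frequency plane.

MATLAB

MATLAB function SPECTROGRAM:
N = window length (in samples)
M = overlap (in samples, typically half the window length)
Compute the DFT of size N for every windowed section.
Keep the lower Fourier coefficients (for a real-valued signal the upper ones are redundant).

Example: Let x be a discrete-time (DT) signal, analyzed with a given sampling rate, window length N, overlap M, and hop size N - M. The result is a sequence of spectral vectors, one vector for each window.

Time resolution: hop size divided by the sampling rate.
Frequency resolution: sampling rate divided by the window length N.

Model assumption: equal-tempered scale.
MIDI pitches: integer note numbers p
Piano notes: p = 21 (A0) to p = 108 (C8)
Concert pitch: p = 69 (A4), 440 Hz
Center frequency: $f(p) = 2^{(p-69)/12} \cdot 440$ Hz
Logarithmic frequency distribution. Octave: doubling of frequency.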
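The windowing procedure described for SPECTROGRAM can be sketched in NumPy. This is not MATLAB's function and makes no claim to match its output conventions; the function name `spectrogram`, the Hann window, and all parameter values are assumptions for illustration.

```python
import numpy as np

def spectrogram(x, n, m):
    """Magnitude spectrogram with window length n and overlap m (in samples)."""
    hop = n - m                                   # hop size between windows
    window = np.hanning(n)
    num_frames = 1 + (len(x) - n) // hop          # only complete windows are used
    frames = np.stack([x[i * hop: i * hop + n] * window
                       for i in range(num_frames)])
    # rfft keeps only the non-redundant coefficients 0 .. n/2 of each DFT.
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape (n // 2 + 1, num_frames)

sr = 8000                                         # assumed sampling rate in Hz
n, m = 512, 256                                   # assumed window length and overlap
x = np.sin(2 * np.pi * 1000 * np.arange(sr) / sr) # one second of a 1000 Hz tone
spec = spectrogram(x, n, m)

peak_bin = int(np.argmax(spec[:, 0]))             # strongest coefficient, first window
peak_hz = peak_bin * sr / n                       # coefficient k lies at k * sr / n Hz
print(spec.shape, peak_bin, peak_hz)
```

Each column of `spec` is one spectral vector. With these assumed values, successive windows are `(n - m) / sr` seconds apart and neighbouring coefficients are `sr / n` Hz apart.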

Idea: Binning of Fourier coefficients. Divide up the frequency axis into logarithmically spaced pitch regions and combine the spectral coefficients of each region into a single pitch coefficient.

Time-frequency representation: windowing in the time domain, then windowing in the frequency domain.

[Table: Note, MIDI pitch, Center frequency [Hz], Left boundary [Hz], Right boundary [Hz], Width [Hz], for the notes A, A#, B, C, C#, D, D#, E, F, F#, G, G#, A]

Details: Let $v$ be a spectral vector obtained from a spectrogram w.r.t. a sampling rate $sr$ and a window length $N$. The spectral coefficient $v(k)$ corresponds to the frequency

$$f(k) = \frac{k}{N} \cdot sr.$$

Let

$$S(p) = \{\, k : f(p - 0.5) \le f(k) < f(p + 0.5) \,\}$$

be the set of coefficients assigned to a pitch $p$, where $f(p)$ denotes the center frequency of pitch $p$. Then the pitch coefficient is defined as

$$P(p) = \sum_{k \in S(p)} |v(k)|^2.$$

Example: A4, p = 69
Center frequency: f(69) = 440 Hz
Lower bound: f(68.5) ≈ 427.5 Hz
Upper bound: f(69.5) ≈ 452.9 Hz
For a given STFT, S(p = 69) collects all coefficients whose frequencies lie between these two bounds.
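The binning step can be sketched as follows. Several points are assumptions rather than facts taken from the slides: the band of a pitch is taken to reach from half a semitone below to half a semitone above its center frequency, coefficients are combined by summing squared magnitudes, and the function names, sampling rate, window length, and the octave A3 to A4 used for the table are chosen only for illustration.

```python
import numpy as np

def center_frequency(p):
    """Center frequency in Hz of MIDI pitch p (A4 = pitch 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((p - 69) / 12)

def pitch_set(p, sr, n):
    """Indices k whose frequency k * sr / n lies in the assumed band
    [f(p - 0.5), f(p + 0.5)) of pitch p."""
    freqs = np.arange(n // 2 + 1) * sr / n
    lo, hi = center_frequency(p - 0.5), center_frequency(p + 0.5)
    return np.nonzero((freqs >= lo) & (freqs < hi))[0]

def pitch_coefficient(v, p, sr, n):
    """Assumed combination rule: sum of squared magnitudes over the band."""
    return float(np.sum(np.abs(v[pitch_set(p, sr, n)]) ** 2))

sr, n = 22050, 4096                       # assumed analysis parameters
s69 = pitch_set(69, sr, n)

# Band table for an assumed octave, A3 (pitch 57) to A4 (pitch 69).
for p in range(57, 70):
    lo, hi = center_frequency(p - 0.5), center_frequency(p + 0.5)
    print(f"{p:3d}  center {center_frequency(p):7.2f}  "
          f"left {lo:7.2f}  right {hi:7.2f}  width {hi - lo:5.2f}")

# A spectral vector with a single non-zero coefficient inside the A4 band.
v = np.zeros(n // 2 + 1)
v[82] = 2.0
a4_energy = pitch_coefficient(v, 69, sr, n)
print(s69, a4_energy)
```

The printed widths grow with the pitch, which reflects the logarithmic spacing of the pitch regions along a linearly sampled frequency axis.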

Note: For some pitches, S(p) may be empty. This particularly holds for low notes corresponding to narrow frequency bands. Linear frequency sampling is problematic!

Solution: Multi-resolution spectrograms or multirate filterbanks.

Audio Representation

Example: Op. 100, No. 2 by Friedrich Burgmüller
[Figure: spectrogram (frequency in Hz, intensity, time in seconds) and pitch representation (MIDI pitch, intensity in dB, time in samples); marked pitches E4, C4, A3]

Example: Chromatic Scale
[Figure: spectrogram (frequency in Hz, intensity, time in seconds) and pitch representation (MIDI pitch, intensity in dB, time in samples)]
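Why low notes can end up without any coefficient becomes visible when the coefficients per pitch band are counted. This is a sketch under assumptions: the band of a pitch is taken as half a semitone on either side of its center frequency, and the sampling rate and window length are example values, not parameters confirmed by the slides.

```python
import numpy as np

def center_frequency(p):
    """Center frequency in Hz of MIDI pitch p (A4 = pitch 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((p - 69) / 12)

def count_bins(p, sr, n):
    """Number of DFT coefficients k (frequency k * sr / n) that fall into
    the assumed band [f(p - 0.5), f(p + 0.5)) of pitch p."""
    freqs = np.arange(n // 2 + 1) * sr / n
    lo, hi = center_frequency(p - 0.5), center_frequency(p + 0.5)
    return int(np.count_nonzero((freqs >= lo) & (freqs < hi)))

sr, n = 22050, 4096                                   # assumed analysis parameters
counts = {p: count_bins(p, sr, n) for p in range(21, 109)}   # piano range
empty = [p for p, c in counts.items() if c == 0]
print("pitches without any coefficient:", empty)
print("coefficients for A4 (p = 69):", counts[69])
```

The coefficients are spaced `sr / n` Hz apart everywhere, while the pitch bands shrink towards the bass. Once a band is narrower than that spacing it may contain no coefficient at all, which is what motivates multi-resolution spectrograms or multirate filterbanks.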

Human perception of pitch is periodic in the sense that two pitches are perceived as similar in color if they differ by an octave.

Separate pitch into two components: tone height (octave number) and chroma.

Chroma: the 12 traditional pitch classes of the equal-tempered scale. For example, chroma C comprises all C notes regardless of octave.

Computation: from pitch features to chroma features. Add up all pitches belonging to the same class. Result: 12-dimensional chroma vector.

[Figures: chromatic circle; Shepard's helix of pitch perception]
Bartsch/Wakefield, IEEE Trans. Multimedia

The sequence of chroma vectors correlates to the harmonic progression.
Normalization makes features invariant to changes in dynamics.
Further quantization and smoothing: CENS features.
Taking the logarithm before adding up pitch coefficients accounts for the logarithmic sensation of intensity.

Example: C-Major Scale
Example: Burgmüller Op. 100, No. 2
[Figures: chroma over time in samples, intensity in dB, and intensity after normalization]
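The step from pitch features to chroma features can be sketched as below. The function name `chroma_from_pitch`, the Euclidean normalization, and the use of `log(1 + x)` as the logarithmic compression are assumptions for illustration. The slides only state that pitches of the same class are added up, that features are normalized, and that a logarithm may be taken first.

```python
import numpy as np

def chroma_from_pitch(pitch_features, log_compress=False):
    """pitch_features: array of shape (128, num_frames), row index = MIDI pitch.
    Returns normalized chroma features of shape (12, num_frames)."""
    p = np.asarray(pitch_features, dtype=float)
    if log_compress:
        p = np.log1p(p)                    # assumed form of logarithmic compression
    chroma = np.zeros((12, p.shape[1]))
    for pitch in range(p.shape[0]):
        # MIDI pitch 60 is C4 and 60 % 12 == 0, so row 0 is chroma C.
        chroma[pitch % 12] += p[pitch]
    norms = np.linalg.norm(chroma, axis=0)
    norms[norms == 0] = 1.0                # leave silent frames at zero
    return chroma / norms

pitch = np.zeros((128, 2))
pitch[[60, 64, 67], 0] = 1.0               # C4, E4, G4 played softly
pitch[[48, 64, 67], 1] = 10.0              # C3, E4, G4 played loudly
chroma = chroma_from_pitch(pitch)
print(np.round(chroma, 3))
```

Both frames yield the same chroma vector: the octave of the C is discarded by the summation and the difference in loudness is removed by the normalization.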

Example: Bach Toccata (performances by Koopman and Ruebsam)
[Figures: features of both performances at feature resolutions of 10 Hz, 1 Hz, and 0.33 Hz]

[Figures: WAV, Chroma (10 Hz), and CENS (1 Hz) representations of Beethoven's Fifth (Bernstein)]

[Figures: WAV, Chroma (10 Hz), and CENS (1 Hz) representations of Beethoven's Fifth (Bernstein), Beethoven's Fifth (Piano/Sherbakov), and Brahms Hungarian Dance No]

Example: Zager & Evans, In The Year 2525
How to deal with transpositions?
[Figures: chroma representation, original and shifted]
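One common way to handle transpositions with chroma features is to shift the 12 chroma bands cyclically. Reading the "Original" and "Shifted" panels this way is an assumption, and the helper names `shift_chroma` and `best_shift` are hypothetical. A minimal sketch:

```python
import numpy as np

CHROMA_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def shift_chroma(chroma, semitones):
    """Cyclically shift the 12 chroma bands upward by `semitones`."""
    return np.roll(chroma, semitones, axis=0)

def best_shift(reference, other):
    """Shift (0..11) of `other` whose inner product with `reference` is largest."""
    scores = [np.sum(reference * shift_chroma(other, s)) for s in range(12)]
    return int(np.argmax(scores))

c_major = np.zeros((12, 1))
c_major[[0, 4, 7]] = 1.0                   # chroma C, E, G
d_major = shift_chroma(c_major, 2)         # transposed up by two semitones

shifted_names = [CHROMA_NAMES[i] for i in np.nonzero(d_major[:, 0])[0]]
print(shifted_names, best_shift(d_major, c_major))
```

Because chroma features discard the octave, a transposition by any number of semitones reduces to one of twelve cyclic shifts, so all candidates can simply be tried and compared.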
