Novelty detection. Juan Pablo Bello MPATE-GE 2623 Music Information Retrieval New York University

Size: px

Start display at page:

Download "Novelty detection. Juan Pablo Bello MPATE-GE 2623 Music Information Retrieval New York University"

Imogene Hoover
6 years ago
Views:

1 Novelty detection Juan Pablo Bello MPATE-GE 2623 Music Information Retrieval New York University

2 Novelty detection Energy burst Find the start time (onset) of new events (notes) in the music signal. Short duration Steady state Onset: single instant chosen to mark the start of the (attack) transient. Unstability

3 Applications MIR: can be used for segmentation; one layer of rhythmic analysis (e.g. beat). Computational Musicology: see the groundbreaking work at ÖFAI (1, 2) Sound manipulation and synthesis: projects/artma/index.html Computational biology?!!

4 Difficulties extended transient (e.g. violin, flute) ambiguous events (vibrato, tremolo, glissandi) polyphonies -> (a)synchronous onsets perceptual vs physical

5 Difficulties extended transient (e.g. violin, flute) ambiguous events (vibrato, tremolo, glissandi) polyphonies -> (a)synchronous onsets perceptual vs physical

6 Difficulties extended transient (e.g. violin, flute) ambiguous events (vibrato, tremolo, glissandi) polyphonies -> (a)synchronous onsets perceptual vs physical

7 Architecture Audio signal Reduction Novelty (detection) function Detection (peak picking) Onsets

8 Time-domain Onsets: often characterized by an amplitude increase Envelope following (full-wave rectification + smoothing): E 0 (m) = 1 N N/2 X n= N/2 x(n + mh) w(n) Squaring instead of rectifying, results in the local energy: E(m) = 1 N N/2 X n= N/2 (x(n + mh)) 2 w(n)

9 Time-domain

10 Time-domain We can use the derivative of energy w.r.t. time -> sharp peaks during energy rise Detectable changes in loudness are proportional to the overall loudness of the sound. Simulates the ear s perception of loudness (Klapuri, 1999)

11 Time-domain

12 Frequency-domain Impulsive noise in time -> wide band noise in frequency More noticeable in high frequencies. Linear weighting of Energy HFC(m) = 2 N N/2 X k=0 X k (m) 2 k

13 Frequency-domain We can also measure change (flux) in spectral content (Duxbury, 02). SF(m) = 2 N Use half-wave rectification to only take energy increases into account SF R (m) = 2 N N/2 X k=0 H(x) =(x + x )/2 ( X k (m) X k (m 1) ) Use the (squared) L2 norm of the flux vector P N/2 k=0 H ( X k(m) X k (m 1) ) SF RL2 (m) = 2 N N/2 X k=0 [H ( X k (m) X k (m 1) )] 2

14 Spectral Flux

15 Phase deviation The change in instantaneous frequency in bin k can be used to detect onsets (Bello, 03) k(m 1) k(m) PD(m) = 2 N = 2 N N/2 X k=0 N/2 X k=0 00 k(m) = 2 N N/2 X k=0 k(m) k(m 1) princarg ( k (m) 2 k (m 1) + k (m 2))

16 Phase deviation This function can be improved by weighting frequency bins by their magnitude (Dixon, 06): PD W (m) = 2 N N/2 X k=0 X k (m) 00 k(m) and normalized: PD WN (m) = P N/2 k=0 X k(m) 00 k P (m) N/2 k=0 X k(m)

17 Phase deviation

18 Complex domain We can combine the spectral flux and phase deviation strategies, such that: ˆX k (m) = ˆX k (m) e j ˆk(m) where, ˆX k (m) = X k (m 1) ˆk(m) =princarg (2 k (m 1) k(m 2)) such that, CD(m) = 2 N P N/2 k=0 X k(m) ˆXk (m)

19 Complex domain As before, we can use half-wave rectification to improve the function (Dixon, 06): CD(m) = 2 N P N/2 k=0 RCD k(m) where, RCD k (m) = Xk (m) ˆXk (m) if X k (m) X k (m 1) 0 otherwise

20 Complex domain

21 Peak picking The function is post-processed to facilitate peak picking: Smoothing -> decrease jaggedness Normalization -> generalization of threshold values Thresholding -> eliminate spurious peaks

22 Peak picking Adaptive thresholding is a more robust choice, typically defined as: (m) = + f( m), m L apple m apple m + L where f is a function, e.g. the local mean or median, of the detection function; β increases the window length before the peak; and α is an offset value Peak picking reduces to selecting local maxima above the threshold

23 Peak picking

24 Comparing detection functions

25 Comparing detection functions

26 Comparing detection functions

27 Benchmarking correct c false false P = positive f - f + R = negative F = c c + f + c c + f 2PR P + R

28 References Dixon, S. Onset Detection Revisited. Proceedings of the 9th International Conference on Digital Audio Effects (DAFx06), Montreal, Canada, Collins, N. A Comparison of Sound Onset Detection Algorithms with Emphasis on Psycho-Acoustically Motivated Detection Functions. Journal of the Audio Engineering Society, Bello, J.P., Daudet, L., Abdallah, S., Duxbury, C., Davies, M. and Sandler, M.B. A tutorial on onset detection in music signals. IEEE Transactions on Speech and Audio Processing. 13(5), Part 2, pages , September, Bello, J.P., Duxbury, C., Davies, M. and Sandler, M. On the use of phase and energy for musical onset detection in the complex domain. IEEE Signal Processing Letters. 11(6), pages , June, Klapuri, A. Sound Onset Detection by Applying Psychoacoustic Knowledge. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Phoenix, Arizona, 1999.

Novelty detection. Juan Pablo Bello EL9173 Selected Topics in Signal Processing: Audio Content Analysis NYU Poly

Novelty detection. Juan Pablo Bello EL9173 Selected Topics in Signal Processing: Audio Content Analysis NYU Poly Novelty detection Juan Pablo Bello EL9173 Selected Topics in Signal Processing: Audio Content Analysis NYU Poly Novelty detection Energy burst Find the start time (onset) of new events in the audio signal.