Introduction Basic Audio Feature Extraction
|
|
- Shana Barrett
- 5 years ago
- Views:
Transcription
1 Introduction Basic Audio Feature Extraction Vincent Koops (with slides by Meinhard Müller) Sound and Music Technology, December 6th, November 2017
2 Today g Main modules A. Sound and music for games B. Analysis, classification, and retrieval of sound and music for media C. Generation and manipulation of sound and music for games and media 2
3 Introduction Aim of this lecture: g Show you the basics of digital signal processing g Show you the basic of audio feature extraction g You should be able to solve simple tasks using audio feature extraction toolboxes g Important concepts are marked in italics g Have some fun with sound! 3
4 Course Module Module B: g 1. Introduction to Digital Signal Processing: g From natural sound to digital audio g 2. Basic audio feature extraction: g From digital audio to features Case studies: practical examples Toolboxes, etc. 4
5 What are Features? g If we want to compare/find/classify things, we need a way to describe them in a useful way g Visual: colors, shapes, etc 5
6 What are Features? g Audio: frequencies, amplitudes: g Low/high sounds: ex. Tuba vs Flute instrument g Loud/quiet sounds: ex. Metal vs Acoustic music g (A manipulation of) these elements is called a feature. 6
7 Recap: What are Features? g Important feature properties: g Level of abstraction: Low: Frequency -> low/high music instruments Mid: Genre -> metal/country music High: Emotion -> sad/happy music g Locality: Global feature: Local feature: entire signal part of a signal (for time representation) 7
8 Book: Fundamentals of Music Processing Meinard Müller Fundamentals of Music Processing Audio, Analysis, Algorithms, Applications 483 p., 249 illus., hardcover ISBN: Springer, 2015 Accompanying website:
9 Fourier Transform Idea: Decompose a given signal into a superposition of sinusoids (elementary signals). Sinusoids Signal
10 Fourier Transform Each sinusoid has a physical meaning and can be described by three parameters: Sinusoids
11 Fourier Transform Each sinusoid has a physical meaning and can be described by three parameters: Sinusoids Signal
12 Fourier Transform Each sinusoid has a physical meaning and can be described by three parameters: Signal Fouier transform Frequency (Hz)
13 Fourier Transform Role of phase Magnitude Phase
14 Fourier Transform Role of phase Analysis with sinusoid having frequency 262 Hz and phase φ = Magnitude Phase
15 Fourier Transform Role of phase Analysis with sinusoid having frequency 262 Hz and phase φ = Magnitude Phase
16 Fourier Transform Role of phase Analysis with sinusoid having frequency 262 Hz and phase φ = Magnitude Phase
17 Fourier Transform Role of phase Analysis with sinusoid having frequency 262 Hz and phase φ = Magnitude Phase
18 Fourier Transform Example: Superposition of two sinusoids Frequency (Hz)
19 Fourier Transform Example: C4 played by piano Frequency (Hz)
20 Fourier Transform Example: Piano tone (C4, Hz)
21 Fourier Transform Example: Piano tone (C4, Hz) Analysis using sinusoid with 262 Hz high correlation large Fourier coefficient
22 Fourier Transform Example: Piano tone (C4, Hz) Analysis using sinusoid with 400 Hz low correlation small Fourier coefficient
23 Fourier Transform Example: Piano tone (C4, Hz) Analysis using sinusoid with 523 Hz high correlation large Fourier coefficient
24 Fourier Transform Example: C4 played by trumpet Frequency (Hz)
25 Fourier Transform Example: C4 played by violine Frequency (Hz)
26 Fourier Transform Example: C4 played by flute Frequency (Hz)
27 Fourier Transform Example: Speech Bonn Frequency (Hz)
28 Fourier Transform Example: Speech Zürich Frequency (Hz)
29 Fourier Transform Example: C-major scale (piano) Frequency (Hz)
30 Fourier Transform Example: Chirp signal Frequency (Hz)
31 Fourier Transform Signal Fourier representation Fourier transform
32 Fourier Transform Signal Fourier representation Fourier transform Tells which frequencies occur, but does not tell when the frequencies occur. Frequency information is averaged over the entire time interval. Time information is hidden in the phase
33 Fourier Transform Frequency (Hz) Frequency (Hz)
34 Short Time Fourier Transform Idea (Dennis Gabor, 1946): Consider only a small section of the signal for the spectral analysis recovery of time information Short Time Fourier Transform (STFT) Section is determined by pointwise multiplication of the signal with a localizing window function
35 Short Time Fourier Transform Frequency (Hz)
36 Short Time Fourier Transform Frequency (Hz)
37 Short Time Fourier Transform Frequency (Hz)
38 Short Time Fourier Transform Frequency (Hz)
39 Short Time Fourier Transform Frequency (Hz)
40 Short Time Fourier Transform Frequency (Hz)
41 Short Time Fourier Transform Frequency (Hz)
42 Short Time Fourier Transform Frequency (Hz)
43 Short Time Fourier Transform Window functions Rectangular window Triangular window Hann window Frequency (Hz)
44 Short Time Fourier Transform Window functions Frequency (Hz) Trade off between smoothing and ringing
45 Short Time Fourier Transform Definition Signal Window function (, ) STFT with for
46 Short Time Fourier Transform Intuition: is musical note of frequency ω centered at time t Inner product measures the correlation between the musical note and the signal Frequency (Hz)
47 Time-Frequency Representation Spectrogram Frequency (Hz)
48 Time-Frequency Representation Spectrogram Frequency (Hz) Intensity (db)
49 Time-Frequency Representation Chirp signal and STFT with Hann window of length 50 ms Frequency (Hz)
50 Time-Frequency Representation Chirp signal and STFT with box window of length 50 ms Frequency (Hz)
51 Time-Frequency Representation Time-Frequency Localization Size of window constitutes a trade-off between time resolution and frequency resolution: Large window : Small window : poor time resolution good frequency resolution good time resolution poor frequency resolution Heisenberg Uncertainty Principle: there is no window function that localizes in time and frequency with arbitrary position.
52 Time-Frequency Representation Signal and STFT with Hann window of length 20 ms Frequency (Hz)
53 Time-Frequency Representation Signal and STFT with Hann window of length 100 ms Frequency (Hz)
54 Audio Features Example: C-major scale (piano) Spectrogram Frequency (Hz)
55 Audio Features Example: Chromatic scale Frequency (Hz) Frequency (Hz) Intensity (db) Intensity (db) C1 24 Spectrogram C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108
56 Audio Features Example: Chromatic scale C1 24 C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108 Frequency (Hz) Frequency (Hz) Intensity (db) Intensity (db) Spectrogram
57 Audio Features Model assumption: Equal-tempered scale MIDI pitches: Piano notes: p = 21 (A0) to p = 108 (C8) Concert pitch: p = 69 (A4) 440 Hz Center frequency: Hz Logarithmic frequency distribution Octave: doubling of frequency
58 Audio Features Idea: Binning of Fourier coefficients Divide up the fequency axis into logarithmically spaced pitch regions and combine spectral coefficients of each region to a single pitch coefficient.
59 Audio Features Time-frequency representation Windowing in the time domain Windowing in the frequency domain
60 Audio Features Example: Chromatic scale C1 24 C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108 Frequency (Hz) Frequency (Hz) Intensity (db) Intensity (db) Spectrogram
61 Audio Features Example: Chromatic scale C1 24 Spectrogram C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108 C8: 4186 Hz C7: 2093 Hz C6: 1046 Hz C5: 523 Hz C4: 262 Hz C3: 131 Hz Intensity (db)
62 Audio Features Example: Chromatic scale C1 24 Log-frequency spectrogram C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108 C8: 4186 Hz C7: 2093 Hz C6: 1046 Hz C5: 523 Hz C4: 262 Hz C3: 131 Hz Intensity (db)
63 Audio Features Chroma features Chromatic circle Shepard s helix of pitch
64 Audio Features Chroma features Human perception of pitch is periodic in the sense that two pitches are perceived as similar in color if they differ by an octave. Seperation of pitch into two components: tone height (octave number) and chroma. Chroma : 12 traditional pitch classes of the equaltempered scale. For example: Chroma C Computation: pitch features chroma features Add up all pitches belonging to the same class Result: 12-dimensional chroma vector.
65 Audio Features Chroma features
66 Audio Features Chroma features C2 C3 C4 Chroma C
67 Audio Features Chroma features C # 2 C # 3 C # 4 Chroma C #
68 Audio Features Chroma features D2 D3 D4 Chroma D
69 Audio Features Example: Chromatic scale C1 24 Log-frequency spectrogram C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108 Pitch (MIDI note number) Intensity (db)
70 Audio Features Example: Chromatic scale C1 24 Log-frequency spectrogram C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108 Pitch (MIDI note number) Intensity (db) Chroma C
71 Audio Features Example: Chromatic scale C1 24 Log-frequency spectrogram C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108 Pitch (MIDI note number) Intensity (db) Chroma C #
72 Audio Features Example: Chromatic scale C1 24 C2 36 C3 48 C4 60 C5 72 C6 84 C7 96 C8 108 Chromagram Chroma Intensity (db)
73 Audio Features Chroma features Frequency (pitch)
74 Audio Features Chroma features Frequency (pitch)
75 Audio Features Chroma features Chroma
76 Audio Features Chroma features Sequence of chroma vectors correlates to the harmonic progression Normalization to changes in dynamics makes features invariant Further denoising and smoothing Taking logarithm before adding up pitch coefficients accounts for logarithmic sensation of intensity
77 Audio Features Chroma features (normalized) Karajan Scherbakov
78 Audio Features Automatic Chord Estimation G:maj D:maj See:
79 Mid-level representation Self-distance matrix (self-similarity matrix) Common mid-level representation Comparing each frame with all other frames Each element describing the dissimilarity of two frames (or a sequence of frames) Informative patterns Stripes for repeated sequences Blocks for homogenous segments Paulus, Müller, Klapuri Audio-Based Music Structure Analysis ISMIR 2010
80 Mid-level representation Self-distance matrix examples MFCC Chroma Paulus, Müller, Klapuri Audio-Based Music Structure Analysis ISMIR 2010
81 Approaches Categorization Proposed categorization Novelty-based approaches (points of high contrast) Repetition-based approaches Homogeneity-based approaches An earlier division into Sequence approaches: There exists sequences that are repeated during the piece (stripes in SDMs) State approaches: Piece is produced by a state machine, each state produces distinct observations Paulus, Müller, Klapuri Audio-Based Music Structure Analysis ISMIR 2010
82 Paulus, Müller, Klapuri Audio-Based Music Structure Analysis ISMIR 2010
83 Audio Features There are many ways to implement chroma features Properties may differ significantly Appropriateness depends on respective application MATLAB implementations for various chroma variants
84 Tool boxes g Many tool boxes exist that can help you develop your own MIR application g Toolboxes have their pro s and con s depending on: g Language of choice: Matlab, Python, C, C++, etc. g I will only list the large frameworks, a lot of other code is available: g Use you favourite search engine! g See also: g (rather outdated) g 16
85 VAMP plugins g g You need a host and a plugin: g Sonic visualizer g Sonic annotator g A lot of plugins for high and low-level feature extraction are available g Outputs text files or XML (RDF) g Has Python bindings (VamPy) 17
86 Sonic Visualizer Waveform Spectrogram Chromagram 18
87 Matlab toolboxes g g g g g MIR toolbox g als/mirtoolbox Similarity Matrix Toolbox: g Dan Ellis Matlab resources: g Chroma toolbox g g All sorts of different chroma variants Constant Q toolbox: g 19
88 Other Frameworks g Essentia: g g C++ library g Marsyas: g g C++ (and Java) g Aubio: g g Written in C g Has Python bindings g Clam g g Written in C++ g Has Python bindings g Yaafe g g Python 20
89 Additional Literature Park, Tae Hong (2010). Introduction to digital signal processing: Computer musically speaking. World Scientific Meinard Müller Fundamentals of Music Processing Audio, Analysis, Algorithms, Applications ISBN: Springer,
Audio Features. Fourier Transform. Fourier Transform. Fourier Transform. Short Time Fourier Transform. Fourier Transform.
Advanced Course Computer Science Music Processing Summer Term 2010 Fourier Transform Meinard Müller Saarland University and MPI Informatik meinard@mpi-inf.mpg.de Audio Features Fourier Transform Fourier
More informationAudio Features. Fourier Transform. Short Time Fourier Transform. Short Time Fourier Transform. Short Time Fourier Transform
Advanced Course Computer Science Music Processing Summer Term 2009 Meinard Müller Saarland University and MPI Informatik meinard@mpi-inf.mpg.de Audio Features Fourier Transform Tells which notes (frequencies)
More informationShort-Time Fourier Transform and Chroma Features
Friedrich-Alexander-Universität Erlangen-Nürnberg Lab Course Short-Time Fourier Transform and Chroma Features International Audio Laboratories Erlangen Prof. Dr. Meinard Müller Friedrich-Alexander Universität
More informationShort-Time Fourier Transform and Chroma Features
Friedrich-Alexander-Universität Erlangen-Nürnberg Lab Course Short-Time Fourier Transform and Chroma Features International Audio Laboratories Erlangen Prof. Dr. Meinard Müller Friedrich-Alexander Universität
More informationUniversity of Colorado at Boulder ECEN 4/5532. Lab 2 Lab report due on February 16, 2015
University of Colorado at Boulder ECEN 4/5532 Lab 2 Lab report due on February 16, 2015 This is a MATLAB only lab, and therefore each student needs to turn in her/his own lab report and own programs. 1
More informationFourier Analysis of Signals
Chapter 2 Fourier Analysis of Signals As we have seen in the last chapter, music signals are generally complex sound mixtures that consist of a multitude of different sound components. Because of this
More informationFourier Analysis of Signals
Chapter 2 Fourier Analysis of Signals As we have seen in the last chapter, music signals are generally complex sound mixtures that consist of a multitude of different sound components. Because of this
More informationLecture 7: Pitch and Chord (2) HMM, pitch detection functions. Li Su 2016/03/31
Lecture 7: Pitch and Chord (2) HMM, pitch detection functions Li Su 2016/03/31 Chord progressions Chord progressions are not arbitrary Example 1: I-IV-I-V-I (C-F-C-G-C) Example 2: I-V-VI-III-IV-I-II-V
More informationNon-Negative Matrix Factorization And Its Application to Audio. Tuomas Virtanen Tampere University of Technology
Non-Negative Matrix Factorization And Its Application to Audio Tuomas Virtanen Tampere University of Technology tuomas.virtanen@tut.fi 2 Contents Introduction to audio signals Spectrogram representation
More informationSound Recognition in Mixtures
Sound Recognition in Mixtures Juhan Nam, Gautham J. Mysore 2, and Paris Smaragdis 2,3 Center for Computer Research in Music and Acoustics, Stanford University, 2 Advanced Technology Labs, Adobe Systems
More informationFrequency Domain Speech Analysis
Frequency Domain Speech Analysis Short Time Fourier Analysis Cepstral Analysis Windowed (short time) Fourier Transform Spectrogram of speech signals Filter bank implementation* (Real) cepstrum and complex
More informationEXAMINING MUSICAL MEANING IN SIMILARITY THRESHOLDS
EXAMINING MUSICAL MEANING IN SIMILARITY THRESHOLDS Katherine M. Kinnaird Brown University katherine kinnaird@brown.edu ABSTRACT Many approaches to Music Information Retrieval tasks rely on correctly determining
More information6.003 Signal Processing
6.003 Signal Processing Week 6, Lecture A: The Discrete Fourier Transform (DFT) Adam Hartz hz@mit.edu What is 6.003? What is a signal? Abstractly, a signal is a function that conveys information Signal
More informationChapter 17. Superposition & Standing Waves
Chapter 17 Superposition & Standing Waves Superposition & Standing Waves Superposition of Waves Standing Waves MFMcGraw-PHY 2425 Chap 17Ha - Superposition - Revised: 10/13/2012 2 Wave Interference MFMcGraw-PHY
More informationE : Lecture 1 Introduction
E85.2607: Lecture 1 Introduction 1 Administrivia 2 DSP review 3 Fun with Matlab E85.2607: Lecture 1 Introduction 2010-01-21 1 / 24 Course overview Advanced Digital Signal Theory Design, analysis, and implementation
More informationCS 188: Artificial Intelligence Fall 2011
CS 188: Artificial Intelligence Fall 2011 Lecture 20: HMMs / Speech / ML 11/8/2011 Dan Klein UC Berkeley Today HMMs Demo bonanza! Most likely explanation queries Speech recognition A massive HMM! Details
More informationL6: Short-time Fourier analysis and synthesis
L6: Short-time Fourier analysis and synthesis Overview Analysis: Fourier-transform view Analysis: filtering view Synthesis: filter bank summation (FBS) method Synthesis: overlap-add (OLA) method STFT magnitude
More informationIntroduction to Biomedical Engineering
Introduction to Biomedical Engineering Biosignal processing Kung-Bin Sung 6/11/2007 1 Outline Chapter 10: Biosignal processing Characteristics of biosignals Frequency domain representation and analysis
More informationTIME-DEPENDENT PARAMETRIC AND HARMONIC TEMPLATES IN NON-NEGATIVE MATRIX FACTORIZATION
TIME-DEPENDENT PARAMETRIC AND HARMONIC TEMPLATES IN NON-NEGATIVE MATRIX FACTORIZATION 13 th International Conference on Digital Audio Effects Romain Hennequin, Roland Badeau and Bertrand David Telecom
More informationNonnegative Matrix Factor 2-D Deconvolution for Blind Single Channel Source Separation
Nonnegative Matrix Factor 2-D Deconvolution for Blind Single Channel Source Separation Mikkel N. Schmidt and Morten Mørup Technical University of Denmark Informatics and Mathematical Modelling Richard
More informationCS229 Project: Musical Alignment Discovery
S A A V S N N R R S CS229 Project: Musical Alignment iscovery Woodley Packard ecember 16, 2005 Introduction Logical representations of musical data are widely available in varying forms (for instance,
More informationTopic 7. Convolution, Filters, Correlation, Representation. Bryan Pardo, 2008, Northwestern University EECS 352: Machine Perception of Music and Audio
Topic 7 Convolution, Filters, Correlation, Representation Short time Fourier Transform Break signal into windows Calculate DFT of each window The Spectrogram spectrogram(y,1024,512,1024,fs,'yaxis'); A
More informationFACTORS IN FACTORIZATION: DOES BETTER AUDIO SOURCE SEPARATION IMPLY BETTER POLYPHONIC MUSIC TRANSCRIPTION?
FACTORS IN FACTORIZATION: DOES BETTER AUDIO SOURCE SEPARATION IMPLY BETTER POLYPHONIC MUSIC TRANSCRIPTION? Tiago Fernandes Tavares, George Tzanetakis, Peter Driessen University of Victoria Department of
More informationScalable audio separation with light Kernel Additive Modelling
Scalable audio separation with light Kernel Additive Modelling Antoine Liutkus 1, Derry Fitzgerald 2, Zafar Rafii 3 1 Inria, Université de Lorraine, LORIA, UMR 7503, France 2 NIMBUS Centre, Cork Institute
More informationScattering.m Documentation
Scattering.m Documentation Release 0.3 Vincent Lostanlen Nov 04, 2018 Contents 1 Introduction 3 2 Filter bank specifications 5 3 Wavelets 7 Bibliography 9 i ii Scattering.m Documentation, Release 0.3
More informationChirp Transform for FFT
Chirp Transform for FFT Since the FFT is an implementation of the DFT, it provides a frequency resolution of 2π/N, where N is the length of the input sequence. If this resolution is not sufficient in a
More informationLecture Notes 5: Multiresolution Analysis
Optimization-based data analysis Fall 2017 Lecture Notes 5: Multiresolution Analysis 1 Frames A frame is a generalization of an orthonormal basis. The inner products between the vectors in a frame and
More informationIntroduction to Time-Frequency Distributions
Introduction to Time-Frequency Distributions Selin Aviyente Department of Electrical and Computer Engineering Michigan State University January 19, 2010 Motivation for time-frequency analysis When you
More information1. Calculation of the DFT
ELE E4810: Digital Signal Processing Topic 10: The Fast Fourier Transform 1. Calculation of the DFT. The Fast Fourier Transform algorithm 3. Short-Time Fourier Transform 1 1. Calculation of the DFT! Filter
More informationAutomatic Speech Recognition (CS753)
Automatic Speech Recognition (CS753) Lecture 12: Acoustic Feature Extraction for ASR Instructor: Preethi Jyothi Feb 13, 2017 Speech Signal Analysis Generate discrete samples A frame Need to focus on short
More informationNovelty detection. Juan Pablo Bello MPATE-GE 2623 Music Information Retrieval New York University
Novelty detection Juan Pablo Bello MPATE-GE 2623 Music Information Retrieval New York University Novelty detection Energy burst Find the start time (onset) of new events (notes) in the music signal. Short
More informationLECTURE NOTES IN AUDIO ANALYSIS: PITCH ESTIMATION FOR DUMMIES
LECTURE NOTES IN AUDIO ANALYSIS: PITCH ESTIMATION FOR DUMMIES Abstract March, 3 Mads Græsbøll Christensen Audio Analysis Lab, AD:MT Aalborg University This document contains a brief introduction to pitch
More informationGaussian Processes for Audio Feature Extraction
Gaussian Processes for Audio Feature Extraction Dr. Richard E. Turner (ret26@cam.ac.uk) Computational and Biological Learning Lab Department of Engineering University of Cambridge Machine hearing pipeline
More informationMULTI-RESOLUTION SIGNAL DECOMPOSITION WITH TIME-DOMAIN SPECTROGRAM FACTORIZATION. Hirokazu Kameoka
MULTI-RESOLUTION SIGNAL DECOMPOSITION WITH TIME-DOMAIN SPECTROGRAM FACTORIZATION Hiroazu Kameoa The University of Toyo / Nippon Telegraph and Telephone Corporation ABSTRACT This paper proposes a novel
More informationPhysics General Physics. Lecture 25 Waves. Fall 2016 Semester Prof. Matthew Jones
Physics 22000 General Physics Lecture 25 Waves Fall 2016 Semester Prof. Matthew Jones 1 Final Exam 2 3 Mechanical Waves Waves and wave fronts: 4 Wave Motion 5 Two Kinds of Waves 6 Reflection of Waves When
More informationFeature extraction 2
Centre for Vision Speech & Signal Processing University of Surrey, Guildford GU2 7XH. Feature extraction 2 Dr Philip Jackson Linear prediction Perceptual linear prediction Comparison of feature methods
More informationTime-Frequency Analysis
Time-Frequency Analysis Basics of Fourier Series Philippe B. aval KSU Fall 015 Philippe B. aval (KSU) Fourier Series Fall 015 1 / 0 Introduction We first review how to derive the Fourier series of a function.
More informationPreFEst: A Predominant-F0 Estimation Method for Polyphonic Musical Audio Signals
PreFEst: A Predominant-F0 Estimation Method for Polyphonic Musical Audio Signals Masataka Goto National Institute of Advanced Industrial Science and Technology (AIST). IT, AIST, 1-1-1 Umezono, Tsukuba,
More informationEE123 Digital Signal Processing
EE123 Digital Signal Processing Lecture 1 Time-Dependent FT Announcements! Midterm: 2/22/216 Open everything... but cheat sheet recommended instead 1am-12pm How s the lab going? Frequency Analysis with
More informationy = log b Exponential and Logarithmic Functions LESSON THREE - Logarithmic Functions Lesson Notes Example 1 Graphing Logarithms
y = log b Eponential and Logarithmic Functions LESSON THREE - Logarithmic Functions Eample 1 Logarithmic Functions Graphing Logarithms a) Draw the graph of f() = 2 b) Draw the inverse of f(). c) Show algebraically
More informationDesign Criteria for the Quadratically Interpolated FFT Method (I): Bias due to Interpolation
CENTER FOR COMPUTER RESEARCH IN MUSIC AND ACOUSTICS DEPARTMENT OF MUSIC, STANFORD UNIVERSITY REPORT NO. STAN-M-4 Design Criteria for the Quadratically Interpolated FFT Method (I): Bias due to Interpolation
More informationInstrument-specific harmonic atoms for mid-level music representation
Instrument-specific harmonic atoms for mid-level music representation Pierre Leveau, Emmanuel Vincent, Gaël Richard, Laurent Daudet To cite this version: Pierre Leveau, Emmanuel Vincent, Gaël Richard,
More information6.003 Signal Processing
6.003 Signal Processing Week 6, Lecture A: The Discrete Fourier Transform (DFT) Adam Hartz hz@mit.edu What is 6.003? What is a signal? Abstractly, a signal is a function that conveys information Signal
More informationTimbre Similarity. Perception and Computation. Prof. Michael Casey. Dartmouth College. Thursday 7th February, 2008
Timbre Similarity Perception and Computation Prof. Michael Casey Dartmouth College Thursday 7th February, 2008 Prof. Michael Casey (Dartmouth College) Timbre Similarity Thursday 7th February, 2008 1 /
More informationIMISOUND: An Unsupervised System for Sound Query by Vocal Imitation
IMISOUND: An Unsupervised System for Sound Query by Vocal Imitation Yichi Zhang and Zhiyao Duan Audio Information Research (AIR) Lab Department of Electrical and Computer Engineering University of Rochester
More informationQUERY-BY-EXAMPLE MUSIC RETRIEVAL APPROACH BASED ON MUSICAL GENRE SHIFT BY CHANGING INSTRUMENT VOLUME
Proc of the 12 th Int Conference on Digital Audio Effects (DAFx-09 Como Italy September 1-4 2009 QUERY-BY-EXAMPLE MUSIC RETRIEVAL APPROACH BASED ON MUSICAL GENRE SHIFT BY CHANGING INSTRUMENT VOLUME Katsutoshi
More informationConvolutive Non-Negative Matrix Factorization for CQT Transform using Itakura-Saito Divergence
Convolutive Non-Negative Matrix Factorization for CQT Transform using Itakura-Saito Divergence Fabio Louvatti do Carmo; Evandro Ottoni Teatini Salles Abstract This paper proposes a modification of the
More informationCONSTRUCTING AN INVERTIBLE CONSTANT-Q TRANSFORM WITH NONSTATIONARY GABOR FRAMES
CONSTRUCTING AN INVERTIBLE CONSTANT-Q TRANSFORM WITH NONSTATIONARY GABOR FRAMES MONIKA DÖRFLER, NICKI HOLIGHAUS, THOMAS GRILL, AND GINO ANGELO VELASCO Abstract. An efficient and perfectly invertible signal
More informationSinger Identification using MFCC and LPC and its comparison for ANN and Naïve Bayes Classifiers
Singer Identification using MFCC and LPC and its comparison for ANN and Naïve Bayes Classifiers Kumari Rambha Ranjan, Kartik Mahto, Dipti Kumari,S.S.Solanki Dept. of Electronics and Communication Birla
More informationI. Signals & Sinusoids
I. Signals & Sinusoids [p. 3] Signal definition Sinusoidal signal Plotting a sinusoid [p. 12] Signal operations Time shifting Time scaling Time reversal Combining time shifting & scaling [p. 17] Trigonometric
More informationOracle Analysis of Sparse Automatic Music Transcription
Oracle Analysis of Sparse Automatic Music Transcription Ken O Hanlon, Hidehisa Nagano, and Mark D. Plumbley Queen Mary University of London NTT Communication Science Laboratories, NTT Corporation {keno,nagano,mark.plumbley}@eecs.qmul.ac.uk
More informationNoise Robust Isolated Words Recognition Problem Solving Based on Simultaneous Perturbation Stochastic Approximation Algorithm
EngOpt 2008 - International Conference on Engineering Optimization Rio de Janeiro, Brazil, 0-05 June 2008. Noise Robust Isolated Words Recognition Problem Solving Based on Simultaneous Perturbation Stochastic
More informationMusical Genre Classication
Musical Genre Classication Jan Müllers RWTH Aachen, 2015 Jan Müllers Finding Disjoint Paths 1 / 15 Musical Genres The Problem Musical Genres History Automatic Speech Regocnition categorical labels created
More informationTIME-DEPENDENT PARAMETRIC AND HARMONIC TEMPLATES IN NON-NEGATIVE MATRIX FACTORIZATION
TIME-DEPENDENT PARAMETRIC AND HARMONIC TEMPLATES IN NON-NEGATIVE MATRIX FACTORIZATION Romain Hennequin, Roland Badeau and Bertrand David, Institut Telecom; Telecom ParisTech; CNRS LTCI Paris France romainhennequin@telecom-paristechfr
More informationLecture Wigner-Ville Distributions
Introduction to Time-Frequency Analysis and Wavelet Transforms Prof. Arun K. Tangirala Department of Chemical Engineering Indian Institute of Technology, Madras Lecture - 6.1 Wigner-Ville Distributions
More informationResponse-Field Dynamics in the Auditory Pathway
Response-Field Dynamics in the Auditory Pathway Didier Depireux Powen Ru Shihab Shamma Jonathan Simon Work supported by grants from the Office of Naval Research, a training grant from the National Institute
More informationMULTIPITCH ESTIMATION AND INSTRUMENT RECOGNITION BY EXEMPLAR-BASED SPARSE REPRESENTATION. Ikuo Degawa, Kei Sato, Masaaki Ikehara
MULTIPITCH ESTIMATION AND INSTRUMENT RECOGNITION BY EXEMPLAR-BASED SPARSE REPRESENTATION Ikuo Degawa, Kei Sato, Masaaki Ikehara EEE Dept. Keio University Yokohama, Kanagawa 223-8522 Japan E-mail:{degawa,
More informationCOMP 546, Winter 2018 lecture 19 - sound 2
Sound waves Last lecture we considered sound to be a pressure function I(X, Y, Z, t). However, sound is not just any function of those four variables. Rather, sound obeys the wave equation: 2 I(X, Y, Z,
More informationChapters 11 and 12. Sound and Standing Waves
Chapters 11 and 12 Sound and Standing Waves The Nature of Sound Waves LONGITUDINAL SOUND WAVES Speaker making sound waves in a tube The Nature of Sound Waves The distance between adjacent condensations
More information! Spectral Analysis with DFT. ! Windowing. ! Effect of zero-padding. ! Time-dependent Fourier transform. " Aka short-time Fourier transform
Lecture Outline ESE 531: Digital Signal Processing Spectral Analysis with DFT Windowing Lec 24: April 18, 2019 Spectral Analysis Effect of zero-padding Time-dependent Fourier transform " Aka short-time
More informationTIME-FREQUENCY ANALYSIS EE3528 REPORT. N.Krishnamurthy. Department of ECE University of Pittsburgh Pittsburgh, PA 15261
TIME-FREQUENCY ANALYSIS EE358 REPORT N.Krishnamurthy Department of ECE University of Pittsburgh Pittsburgh, PA 56 ABSTRACT - analysis, is an important ingredient in signal analysis. It has a plethora of
More informationTopic 3: Fourier Series (FS)
ELEC264: Signals And Systems Topic 3: Fourier Series (FS) o o o o Introduction to frequency analysis of signals CT FS Fourier series of CT periodic signals Signal Symmetry and CT Fourier Series Properties
More informationBayesian harmonic models for musical signal analysis. Simon Godsill and Manuel Davy
Bayesian harmonic models for musical signal analysis Simon Godsill and Manuel Davy June 2, 2002 Cambridge University Engineering Department and IRCCyN UMR CNRS 6597 The work of both authors was partially
More informationMachine Recognition of Sounds in Mixtures
Machine Recognition of Sounds in Mixtures Outline 1 2 3 4 Computational Auditory Scene Analysis Speech Recognition as Source Formation Sound Fragment Decoding Results & Conclusions Dan Ellis
More informationSINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS. Emad M. Grais and Hakan Erdogan
SINGLE CHANNEL SPEECH MUSIC SEPARATION USING NONNEGATIVE MATRIX FACTORIZATION AND SPECTRAL MASKS Emad M. Grais and Hakan Erdogan Faculty of Engineering and Natural Sciences, Sabanci University, Orhanli
More informationExtraction of Individual Tracks from Polyphonic Music
Extraction of Individual Tracks from Polyphonic Music Nick Starr May 29, 2009 Abstract In this paper, I attempt the isolation of individual musical tracks from polyphonic tracks based on certain criteria,
More informationLinear Prediction 1 / 41
Linear Prediction 1 / 41 A map of speech signal processing Natural signals Models Artificial signals Inference Speech synthesis Hidden Markov Inference Homomorphic processing Dereverberation, Deconvolution
More informationReference Text: The evolution of Applied harmonics analysis by Elena Prestini
Notes for July 14. Filtering in Frequency domain. Reference Text: The evolution of Applied harmonics analysis by Elena Prestini It all started with: Jean Baptist Joseph Fourier (1768-1830) Mathematician,
More informationSound 2: frequency analysis
COMP 546 Lecture 19 Sound 2: frequency analysis Tues. March 27, 2018 1 Speed of Sound Sound travels at about 340 m/s, or 34 cm/ ms. (This depends on temperature and other factors) 2 Wave equation Pressure
More informationAnalysis of polyphonic audio using source-filter model and non-negative matrix factorization
Analysis of polyphonic audio using source-filter model and non-negative matrix factorization Tuomas Virtanen and Anssi Klapuri Tampere University of Technology, Institute of Signal Processing Korkeakoulunkatu
More informationLecture 5. The Digital Fourier Transform. (Based, in part, on The Scientist and Engineer's Guide to Digital Signal Processing by Steven Smith)
Lecture 5 The Digital Fourier Transform (Based, in part, on The Scientist and Engineer's Guide to Digital Signal Processing by Steven Smith) 1 -. 8 -. 6 -. 4 -. 2-1 -. 8 -. 6 -. 4 -. 2 -. 2. 4. 6. 8 1
More information10ème Congrès Français d Acoustique
1ème Congrès Français d Acoustique Lyon, 1-16 Avril 1 Spectral similarity measure invariant to pitch shifting and amplitude scaling Romain Hennequin 1, Roland Badeau 1, Bertrand David 1 1 Institut TELECOM,
More informationPower-Scaled Spectral Flux and Peak-Valley Group-Delay Methods for Robust Musical Onset Detection
Power-Scaled Spectral Flux and Peak-Valley Group-Delay Methods for Robust Musical Onset Detection Li Su CITI, Academia Sinica, Taipei, Taiwan lisu@citi.sinica.edu.tw Yi-Hsuan Yang CITI, Academia Sinica,
More informationTemporal Modeling and Basic Speech Recognition
UNIVERSITY ILLINOIS @ URBANA-CHAMPAIGN OF CS 498PS Audio Computing Lab Temporal Modeling and Basic Speech Recognition Paris Smaragdis paris@illinois.edu paris.cs.illinois.edu Today s lecture Recognizing
More information200Pa 10million. Overview. Acoustics of Speech and Hearing. Loudness. Terms to describe sound. Matching Pressure to Loudness. Loudness vs.
Overview Acoustics of Speech and Hearing Lecture 1-2 How is sound pressure and loudness related? How can we measure the size (quantity) of a sound? The scale Logarithmic scales in general Decibel scales
More informationOrigin of Sound. Those vibrations compress and decompress the air (or other medium) around the vibrating object
Sound Each celestial body, in fact each and every atom, produces a particular sound on account of its movement, its rhythm or vibration. All these sounds and vibrations form a universal harmony in which
More information1. Personal Audio 2. Features 3. Segmentation 4. Clustering 5. Future Work
Segmenting and Classifying Long-Duration Recordings of Personal Audio Dan Ellis and Keansub Lee Laboratory for Recognition and Organization of Speech and Audio Dept. Electrical Eng., Columbia Univ., NY
More informationSpeech Signal Representations
Speech Signal Representations Berlin Chen 2003 References: 1. X. Huang et. al., Spoken Language Processing, Chapters 5, 6 2. J. R. Deller et. al., Discrete-Time Processing of Speech Signals, Chapters 4-6
More informationTIME-FREQUENCY ANALYSIS: TUTORIAL. Werner Kozek & Götz Pfander
TIME-FREQUENCY ANALYSIS: TUTORIAL Werner Kozek & Götz Pfander Overview TF-Analysis: Spectral Visualization of nonstationary signals (speech, audio,...) Spectrogram (time-varying spectrum estimation) TF-methods
More informationSuperposition and Standing Waves
Physics 1051 Lecture 9 Superposition and Standing Waves Lecture 09 - Contents 14.5 Standing Waves in Air Columns 14.6 Beats: Interference in Time 14.7 Non-sinusoidal Waves Trivia Questions 1 How many wavelengths
More informationACCOUNTING FOR PHASE CANCELLATIONS IN NON-NEGATIVE MATRIX FACTORIZATION USING WEIGHTED DISTANCES. Sebastian Ewert Mark D. Plumbley Mark Sandler
ACCOUNTING FOR PHASE CANCELLATIONS IN NON-NEGATIVE MATRIX FACTORIZATION USING WEIGHTED DISTANCES Sebastian Ewert Mark D. Plumbley Mark Sandler Queen Mary University of London, London, United Kingdom ABSTRACT
More informationWavelet Transform. Figure 1: Non stationary signal f(t) = sin(100 t 2 ).
Wavelet Transform Andreas Wichert Department of Informatics INESC-ID / IST - University of Lisboa Portugal andreas.wichert@tecnico.ulisboa.pt September 3, 0 Short Term Fourier Transform Signals whose frequency
More informationJean Morlet and the Continuous Wavelet Transform
Jean Brian Russell and Jiajun Han Hampson-Russell, A CGG GeoSoftware Company, Calgary, Alberta, brian.russell@cgg.com ABSTRACT Jean Morlet was a French geophysicist who used an intuitive approach, based
More informationReview &The hardest MC questions in media, waves, A guide for the perplexed
Review &The hardest MC questions in media, waves, A guide for the perplexed Grades online pp. I have been to the Grades online and A. Mine are up-to-date B. The quizzes are not up-to-date. C. My labs are
More informationReal-Time Spectrogram Inversion Using Phase Gradient Heap Integration
Real-Time Spectrogram Inversion Using Phase Gradient Heap Integration Zdeněk Průša 1 and Peter L. Søndergaard 2 1,2 Acoustics Research Institute, Austrian Academy of Sciences, Vienna, Austria 2 Oticon
More informationConstrained Nonnegative Matrix Factorization with Applications to Music Transcription
Constrained Nonnegative Matrix Factorization with Applications to Music Transcription by Daniel Recoskie A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the
More informationMultiresolution Analysis
Multiresolution Analysis DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Frames Short-time Fourier transform
More informationFOURIER ANALYSIS. (a) Fourier Series
(a) Fourier Series FOURIER ANAYSIS (b) Fourier Transforms Useful books: 1. Advanced Mathematics for Engineers and Scientists, Schaum s Outline Series, M. R. Spiegel - The course text. We follow their notation
More informationPhysical Acoustics. Hearing is the result of a complex interaction of physics, physiology, perception and cognition.
Physical Acoustics Hearing, auditory perception, or audition is the ability to perceive sound by detecting vibrations, changes in the pressure of the surrounding medium through time, through an organ such
More informationBAYESIAN NONNEGATIVE HARMONIC-TEMPORAL FACTORIZATION AND ITS APPLICATION TO MULTIPITCH ANALYSIS
BAYESIAN NONNEGATIVE HARMONIC-TEMPORAL FACTORIZATION AND ITS APPLICATION TO MULTIPITCH ANALYSIS Daichi Saaue Tauma Otsua Katsutoshi Itoyama Hiroshi G. Ouno Graduate School of Informatics, Kyoto University
More informationBasics on 2-D 2 D Random Signal
Basics on -D D Random Signal Spring 06 Instructor: K. J. Ray Liu ECE Department, Univ. of Maryland, College Park Overview Last Time: Fourier Analysis for -D signals Image enhancement via spatial filtering
More informationEnvironmental Effects and Control of Noise Lecture 2
Noise, Vibration, Harshness Sound Quality Research Group NVH-SQ Group University of Windsor 92-455 Environmental Effects and Control of Noise Copyright 2015 Colin Novak Copyright 2015 by Colin Novak. All
More informationSPEECH COMMUNICATION 6.541J J-HST710J Spring 2004
6.541J PS3 02/19/04 1 SPEECH COMMUNICATION 6.541J-24.968J-HST710J Spring 2004 Problem Set 3 Assigned: 02/19/04 Due: 02/26/04 Read Chapter 6. Problem 1 In this problem we examine the acoustic and perceptual
More informationPattern Recognition Applied to Music Signals
JHU CLSP Summer School Pattern Recognition Applied to Music Signals 2 3 4 5 Music Content Analysis Classification and Features Statistical Pattern Recognition Gaussian Mixtures and Neural Nets Singing
More informationINTRODUCTION TO DELTA-SIGMA ADCS
ECE37 Advanced Analog Circuits INTRODUCTION TO DELTA-SIGMA ADCS Richard Schreier richard.schreier@analog.com NLCOTD: Level Translator VDD > VDD2, e.g. 3-V logic? -V logic VDD < VDD2, e.g. -V logic? 3-V
More informationIntroduction to time-frequency analysis. From linear to energy-based representations
Introduction to time-frequency analysis. From linear to energy-based representations Rosario Ceravolo Politecnico di Torino Dep. Structural Engineering UNIVERSITA DI TRENTO Course on «Identification and
More informationSingle Channel Music Sound Separation Based on Spectrogram Decomposition and Note Classification
Single Channel Music Sound Separation Based on Spectrogram Decomposition and Note Classification Hafiz Mustafa and Wenwu Wang Centre for Vision, Speech and Signal Processing (CVSSP) University of Surrey,
More informationL29: Fourier analysis
L29: Fourier analysis Introduction The discrete Fourier Transform (DFT) The DFT matrix The Fast Fourier Transform (FFT) The Short-time Fourier Transform (STFT) Fourier Descriptors CSCE 666 Pattern Analysis
More informationMusical Modulation by Symmetries. By Rob Burnham
Musical Modulation by Symmetries By Rob Burnham Intro Traditionally intervals have been conceived as ratios of frequencies. This comes from Pythagoras and his question about why consonance intervals like
More informationMusic 270a: Complex Exponentials and Spectrum Representation
Music 270a: Complex Exponentials and Spectrum Representation Tamara Smyth, trsmyth@ucsd.edu Department of Music, University of California, San Diego (UCSD) October 24, 2016 1 Exponentials The exponential
More information