SUPPLEMENTARY MATERIAL FOR THE PAPER "A PARAMETRIC MODEL AND ESTIMATION TECHNIQUES FOR THE INHARMONICITY AND TUNING OF THE PIANO"

François Rigaud and Bertrand David
Institut Telecom; Telecom ParisTech; CNRS LTCI; Paris, France

Laurent Daudet
Institut Langevin; Paris Diderot Univ.; ESPCI ParisTech; CNRS; Paris, France, and Institut Universitaire de France

This research was partially funded by the French Agence Nationale de la Recherche, PAFI project.

This document contains supplementary material for [1] and is not intended to be read separately. It provides numerical results and curves for the estimation of the inharmonicity and tuning models on the full dataset (6 pianos), mathematical demonstrations of the update rules of the proposed inharmonicity estimation algorithm, and performance comparisons with a state-of-the-art algorithm.

1. PARAMETRIC MODEL OF PIANO TUNING

1.1. Model reminder

The beat cancellation realized when tuning an octave interval (between notes indexed by $m$ and $m + 12$) is formulated as:

$$F_{0,\xi,\phi}(m+12) = 2\,F_{0,\xi,\phi}(m)\,\sqrt{\frac{1 + 4\,B_\xi(m)\,\rho_\phi(m)^2}{1 + B_\xi(m+12)\,\rho_\phi(m)^2}},$$

where $B_\xi(m)$ and $\rho_\phi(m)$ are, respectively, models on the whole compass ($m \in [1, 88]$, from A0 to C8) for the inharmonicity coefficient (related to the string design) and for the choices of the tuner (related to the octave type choice). $B_\xi(m)$ is defined as:

$$B_\xi(m) = e^{b_B(m)} + e^{b_T(m)},$$

where $b_B$ and $b_T$ are asymptotes referring to the string design on the bass and treble bridges respectively:

$$\begin{cases} b_T(m) = s_T\, m + y_T, \\ b_B(m) = s_B\, m + y_B. \end{cases}$$

Since the design of the piano strings that belong to the treble bridge is standardised, $\{s_T, y_T\}$ is fixed for every kind of piano. It has been estimated in [1] from the data of [...] different pianos ($s_T \simeq [...]$ and $y_T \simeq [...]$). $\xi = \{s_B, y_B\}$ is the set of free parameters that depend on the piano model. The model related to the octave type choice is given by:

$$\rho_\phi(m) = \frac{\kappa}{2}\left(1 - \operatorname{erf}\!\left(\frac{m - m_0}{\alpha}\right)\right) + 1,$$

where $\phi = \{\kappa, m_0, \alpha\}$ and erf is the error function. $\kappa + 1$ sets the value of the low-bass asymptote, $m_0$ is a translation parameter along $m$, and $\alpha$ rules the slope of the decrease.

By setting the tuning of a reference note (for instance $F_0(m = 49) = 440$ Hz for the A4), the octave interval tuning model allows all the A notes of the keyboard to be tuned. The rest of the compass is then deduced by a Lagrange polynomial interpolation. Finally, an extra parameter $d_g$, corresponding to a global deviation, is introduced to take into account the use of tuning forks different from the A4 at 440 Hz.

1.2. Results on the full dataset

The presented results are obtained from 3 separate databases, representing a total of 6 pianos: Iowa (1 grand piano), RWC [2] (3 grand pianos) and MAPS (1 upright piano and 1 grand piano synthesizer using high-quality samples). Among the complete dataset, the tuning model was computed for the 4 pianos that were assumed to be well-tuned, judging by the shape of their deviation from Equal Temperament (ET). The other 2 pianos were found to be strongly out-of-tune; for them, retuning curves are proposed.

1.2.1. Modelling the tuning of well-tuned pianos

The results for the 4 well-tuned pianos are presented in figures 1, 2, 3 and 4. The experimental data (obtained from a prior estimation of $(B, F_0)$ on isolated note recordings) are depicted as grey '+' markers and the model as black lines. Subfigures (a), (b) and (c) respectively correspond to the inharmonicity coefficient $B$, the octave type parameter $\rho$, and the deviation from ET along the compass. The values of the parameters are given in table 1.

1.2.2. Tuning pianos

Because the model of octave type choice $\rho_\phi$ is defined for well-tuned pianos, it cannot be used to transcribe the tuning of strongly out-of-tune pianos. In this case, it is proposed to generate retuning curves deduced from a mean model of octave type choice. The model is obtained by averaging the curves $\rho$ over three pianos (RWC #[...], #[...] and Iowa grand pianos, cf. figures 1, 2 and 4 respectively) that were assumed to be well-tuned, judging by the shape of their deviation-from-ET curves. From this averaged data, a mean model $\rho_{\bar\phi}$ is estimated and, in order to give a range in which we suppose the pianos could be retuned, we arbitrarily define a high (respectively low) octave type choice as $\rho_{\bar\phi,H} = \rho_{\bar\phi} + [...]$ (resp. $\rho_{\bar\phi,L} = \min(\rho_{\bar\phi}\,[...], [...])$). These curves are shown in figure 5. Retuning curves are then given for the 2 detuned pianos in figures 6 and 7. Subfigures (a) and (b) respectively correspond to the inharmonicity coefficient $B$ and the deviation from ET along the compass. The values of the parameters are given in table 2.

[Fig. 1: RWC grand piano #[...]. (a) Inharmonicity coefficient $B$, (b) octave type parameter $\rho$, (c) deviation from ET along the whole compass. The data are depicted as grey '+' markers and the model as black lines.]

[Fig. 2: RWC grand piano #[...]. (a) Inharmonicity coefficient $B$, (b) octave type parameter $\rho$, (c) deviation from ET along the whole compass. The data are depicted as grey '+' markers and the model as black lines.]

[Table 1: Values of the parameters for the well-tuned pianos. Columns: RWC #[...], RWC #[...], Iowa, MAPS synth. Rows: $s_B$, $y_B$, $\kappa$, $m_0$, $\alpha$, $d_g$. Numerical values [...].]

[Table 2: Values of the parameters for the detuned pianos. Columns: RWC #[...], MAPS upright. Rows: $s_B$, $y_B$, $\bar\kappa$, $\bar m_0$, $\bar\alpha$ (fixed), $d_g$. Numerical values [...].]
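The tuning model recalled at the start of this section can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes the octave relation in which partial $2\rho$ of note $m$ is matched with partial $\rho$ of note $m+12$, and every numerical parameter value below (`S_B`, `Y_B`, `S_T`, `Y_T`, `KAPPA`, `M0`, `ALPHA`) is a hypothetical placeholder, not a value estimated in [1]. The Lagrange interpolation of the remaining notes and the global deviation $d_g$ are left out.

```python
import math

# Hypothetical parameter values, chosen only so that the sketch runs.
S_B, Y_B = -0.09, -6.5               # bass-bridge asymptote (piano dependent)
S_T, Y_T = 0.09, -13.5               # treble-bridge asymptote (placeholder values)
KAPPA, M0, ALPHA = 3.0, 10.0, 25.0   # octave type parameters


def inharmonicity(m):
    """B(m) = exp(s_B*m + y_B) + exp(s_T*m + y_T)."""
    return math.exp(S_B * m + Y_B) + math.exp(S_T * m + Y_T)


def octave_type(m):
    """rho(m) = kappa/2 * (1 - erf((m - m0)/alpha)) + 1."""
    return 0.5 * KAPPA * (1.0 - math.erf((m - M0) / ALPHA)) + 1.0


def octave_ratio(m):
    """F0(m+12)/F0(m) when partial 2*rho of note m beats with partial rho of note m+12."""
    rho = octave_type(m)
    return 2.0 * math.sqrt((1.0 + 4.0 * inharmonicity(m) * rho ** 2)
                           / (1.0 + inharmonicity(m + 12) * rho ** 2))


def tune_a_notes(ref_note=49, ref_f0=440.0):
    """Propagate F0 octave by octave from the reference A4 to all A notes."""
    f0 = {ref_note: ref_f0}
    for m in range(ref_note, 88 - 12 + 1, 12):   # lower notes 49, 61, 73
        f0[m + 12] = f0[m] * octave_ratio(m)
    for m in range(ref_note - 12, 0, -12):       # lower notes 37, 25, 13, 1
        f0[m] = f0[m + 12] / octave_ratio(m)
    return f0


def cents_from_et(m, f0, ref_note=49, ref_f0=440.0):
    """Deviation of f0 from equal temperament, in cents."""
    et = ref_f0 * 2.0 ** ((m - ref_note) / 12.0)
    return 1200.0 * math.log2(f0 / et)


a_notes = tune_a_notes()
deviation = {m: cents_from_et(m, f) for m, f in sorted(a_notes.items())}
```

With these placeholder values every octave is stretched (ratio above 2), so `deviation` is negative below A4 and positive above it, which is the qualitative shape expected for a stretched tuning.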

[Fig. 3: MAPS grand piano synthesizer. (a) Inharmonicity coefficient $B$, (b) octave type parameter $\rho$, (c) deviation from ET along the whole compass. The data are depicted as grey '+' markers and the model as black lines.]

[Fig. 4: Iowa grand piano. (a) Inharmonicity coefficient $B$, (b) octave type parameter $\rho$, (c) deviation from ET along the whole compass. The data are depicted as grey '+' markers and the model as black lines.]

[Fig. 5: Mean octave type choice for the retuning application. Grey '+' markers correspond to an average of $\rho$ over [...] different pianos. The estimated model $\rho_{\bar\phi}$ is shown as a black line, the high octave type choice model $\rho_{\bar\phi,H}$ with circle markers, and the low octave type choice model $\rho_{\bar\phi,L}$ as a dashed line.]

[Fig. 6: RWC grand piano #[...]. (a) Inharmonicity curves along the compass. (b) Actual tuning and proposed retuning. Grey '+' markers correspond to the data. The model corresponding to the octave type choice $\rho_{\bar\phi}$ (resp. $\rho_{\bar\phi,L}$, $\rho_{\bar\phi,H}$) is depicted as a black line (resp. black dashed line, black dashed line with circle markers).]

[Fig. 7: MAPS upright Disklavier. (a) Inharmonicity curves along the compass. (b) Actual tuning and proposed retuning. Grey '+' markers correspond to the data. The model corresponding to the octave type choice $\rho_{\bar\phi}$ (resp. $\rho_{\bar\phi,L}$, $\rho_{\bar\phi,H}$) is depicted as a black line (resp. black dashed line, black dashed line with circle markers).]

2. AUTOMATIC ESTIMATION OF $(B, F_0)$

2.1. Optimization problem reminder

It is recalled that the goal of the algorithm is to approximate an observation matrix $V$ ($K \times T$), composed of $T$ magnitude spectra computed on $K$ frequency bins, by the product of two non-negative matrices $W$ and $H$. $W$ represents a dictionary of $R$ spectra/atoms ($K \times R$), and $H$ an activation matrix ($R \times T$). $W$ is based on an additive model (the spectrum of a note is considered to be a sum of partials) in which the partial frequencies are constrained to follow an inharmonicity relation. Each note spectrum, indexed by $r \in [1, R]$, is composed of the sum of $N_r$ partials. The partial rank is denoted by $n \in [1, N_r]$. Each partial is parametrized by its amplitude $a_{nr}$ and its frequency $f_{nr}$. Thus, the set of parameters for a single atom is denoted by $\theta_r = \{a_{nr}, f_{nr} \mid n \in [1, N_r]\}$ and the set of parameters for the dictionary by $\theta = \{\theta_r \mid r \in [1, R]\}$. Finally, the expression of a parametric atom is given by:

$$W^{\theta_r}_{kr} = \sum_{n=1}^{N_r} a_{nr}\, g_\tau(f_k - f_{nr}),$$

where $f_k$ is the frequency of the bin with index $k$ and $g_\tau(f_k)$ is the magnitude of the Fourier transform of the analysis window of size $\tau$. The spectral support of $g_\tau(f_k)$ is limited to its main lobe to obtain a simple expression of the update rules [3] and a faster optimization. For a Hanning window, the main lobe magnitude spectrum (normalized to a maximal magnitude of 1) is given by:

$$g_\tau(f_k) = \frac{1}{\pi\tau}\,\frac{\sin(\pi f_k \tau)}{f_k - \tau^2 f_k^3}, \quad \text{for } f_k \in [-2/\tau,\, 2/\tau].$$

The proposed cost function is:

$$C(\theta, \gamma, H) = C_0(\theta, H) + \lambda\, C_1(f_{nr}, \gamma),$$

where

$$C_0(\theta, H) = \sum_{k=1}^{K}\sum_{t=1}^{T} d_\beta\!\left(V_{kt} \,\Big|\, \sum_{r=1}^{R} W^{\theta_r}_{kr} H_{rt}\right), \qquad C_1(f_{nr}, \gamma) = \sum_{r=1}^{R}\sum_{n=1}^{N_r} \left(f_{nr} - n F_{0r}\sqrt{1 + B_r n^2}\right)^2.$$

$C_0$ is a reconstruction cost function which measures the β-divergence between the observation $V$ and the model $W^{\theta} H$. $C_1$ is a cost function (corresponding to a regularization term) that constrains the partial frequencies of the model to follow an inharmonicity relation ($\gamma = \{F_{0r}, B_r \mid r \in [1, R]\}$). The empirical parameter $\lambda$ sets the weight of the inharmonicity constraint in the global cost function.

2.2. Multiplicative algorithm

Multiplicative algorithms aim at decomposing the partial derivative of a cost function with respect to a given parameter $\theta^*$ as a difference of two positive terms:

$$\frac{\partial C(\theta^*)}{\partial \theta^*} = P(\theta^*) - Q(\theta^*), \quad P(\theta^*),\, Q(\theta^*) \ge 0, \tag{A}$$

and at iteratively updating the corresponding parameter according to:

$$\theta^* \leftarrow \theta^* \times Q(\theta^*)/P(\theta^*). \tag{B}$$

Since $P(\theta^*)$ and $Q(\theta^*)$ are positive, this guarantees that parameters initialized with positive values stay positive during the optimization, and that the update is performed in the descent direction along the parameter axis. Indeed, if the partial derivative of the cost function is positive (respectively negative), then $Q(\theta^*)/P(\theta^*)$ is smaller (resp. bigger) than 1 and the value of the parameter is decreased (resp. increased). At a stationary point, the derivative of the cost function is null, so $Q(\theta^*)/P(\theta^*) = 1$. In the general case, no proof is given for the convergence of the algorithm, or even for the decrease of the criterion. However, multiplicative update rules are widely used for solving NMF minimizations because in practice they are observed to give satisfactory results in a reasonable number of iterations, while preserving the positiveness of the decomposition.

2.3. Demonstration of the multiplicative update rules

This section demonstrates the decompositions of the partial derivatives of the global cost function, with respect to all the parameters ($a_{nr}$, $f_{nr}$, $F_{0r}$, $B_r$, $r \in [1, R]$, $n \in [1, N_r]$), that lead to the update rules given in section III.A.[...] of [1].

2.3.1. $C_0(\theta, H)$ partial derivative decompositions

The family of β-divergences is defined as:

$$d_\beta(x \mid y) = \begin{cases} \dfrac{1}{\beta(\beta - 1)}\left(x^\beta + (\beta - 1)\,y^\beta - \beta\, x\, y^{\beta - 1}\right), & \beta \in \mathbb{R} \setminus \{0, 1\}, \\[2mm] x(\log x - \log y) + (y - x), & \beta = 1, \\[2mm] \dfrac{x}{y} - \log\dfrac{x}{y} - 1, & \beta = 0. \end{cases} \tag{C}$$

Its derivative with respect to $y$ is continuous in $\beta$, and given by:

$$\frac{\partial d_\beta(x \mid y)}{\partial y} = y^{\beta - 2}\,(y - x). \tag{D}$$

In our case $x = V_{kt}$ (a time-frequency bin of the magnitude spectrogram), and $y = \hat V_{kt} = \sum_{r=1}^{R} W^{\theta_r}_{kr} H_{rt}$ is parametrized by $\theta$. Thus, the partial derivative of the reconstruction cost function with respect to a specific parameter $\theta^*$ is given by:

$$\frac{\partial C_0(\theta, H)}{\partial \theta^*} = \sum_{k=1}^{K}\sum_{t=1}^{T} \frac{\partial \hat V_{kt}}{\partial \theta^*}\, \hat V_{kt}^{\beta - 2}\left(\hat V_{kt} - V_{kt}\right), \tag{E}$$

with

$$\frac{\partial \hat V_{kt}}{\partial \theta^*} = \sum_{r=1}^{R} \frac{\partial W^{\theta_r}_{kr}}{\partial \theta^*}\, H_{rt}, \tag{F}$$

which can be decomposed as a difference of two positive terms:

$$\frac{\partial \hat V_{kt}}{\partial \theta^*} = \frac{\partial^{+} \hat V_{kt}}{\partial \theta^*} - \frac{\partial^{-} \hat V_{kt}}{\partial \theta^*}, \tag{G}$$

where $\partial^{+}\hat V_{kt}/\partial\theta^*$ and $\partial^{-}\hat V_{kt}/\partial\theta^*$ collect, respectively, the positive and negative parts of the sum over $r$ in (F). Finally, the quantity $\partial C_0(\theta, H)/\partial \theta^*$ can also be expressed as a difference of two positive quantities:

$$\frac{\partial C_0(\theta, H)}{\partial \theta^*} = \underbrace{\sum_{k,t}\left[\hat V_{kt}^{\beta - 1}\frac{\partial^{+} \hat V_{kt}}{\partial \theta^*} + \hat V_{kt}^{\beta - 2}\, V_{kt}\,\frac{\partial^{-} \hat V_{kt}}{\partial \theta^*}\right]}_{P_0(\theta^*)} - \underbrace{\sum_{k,t}\left[\hat V_{kt}^{\beta - 1}\frac{\partial^{-} \hat V_{kt}}{\partial \theta^*} + \hat V_{kt}^{\beta - 2}\, V_{kt}\,\frac{\partial^{+} \hat V_{kt}}{\partial \theta^*}\right]}_{Q_0(\theta^*)}. \tag{H}$$

Derivative with respect to $a_{nr}$, $r \in [1, R]$ and $n \in [1, N_r]$:

$$\frac{\partial \hat V_{kt}}{\partial a_{nr}} = g_\tau(f_k - f_{nr})\, H_{rt} > 0. \tag{I}$$

It is then chosen:

$$\frac{\partial^{+} \hat V_{kt}}{\partial a_{nr}} = g_\tau(f_k - f_{nr})\, H_{rt}, \qquad \frac{\partial^{-} \hat V_{kt}}{\partial a_{nr}} = 0. \tag{J}$$

Finally, by replacing in equation (H), the corresponding update equations of [1] are obtained:

$$P_0(a_{nr}) = \sum_{k,t} g_\tau(f_k - f_{nr})\, H_{rt}\, \hat V_{kt}^{\beta - 1}, \qquad Q_0(a_{nr}) = \sum_{k,t} g_\tau(f_k - f_{nr})\, H_{rt}\, \hat V_{kt}^{\beta - 2}\, V_{kt}.$$

Derivative with respect to $f_{nr}$, $r \in [1, R]$ and $n \in [1, N_r]$:

$$\frac{\partial \hat V_{kt}}{\partial f_{nr}} = -a_{nr}\, g'_\tau(f_k - f_{nr})\, H_{rt}. \tag{K}$$

Regardless of the analysis window that has been used, the quantity $g'_\tau(f_k - f_{nr})$ changes sign, since $g_\tau(f_k)$ has several lobes. In order to obtain a satisfying expression (i.e. a difference of two positive terms), the spectral support of $g_\tau$ is limited to its main lobe (so that the sign of its derivative changes only once) and its derivative is expressed as:

$$-g'_\tau(f_k - f_{nr}) = (f_k - f_{nr}) \cdot \frac{-g'_\tau(f_k - f_{nr})}{f_k - f_{nr}}. \tag{L}$$

The quantity $-g'_\tau(f_k - f_{nr})/(f_k - f_{nr})$ stays positive on the main lobe for every kind of analysis window (an illustration can be found in [3]). Thus,

$$\frac{\partial^{+} \hat V_{kt}}{\partial f_{nr}} = a_{nr}\, f_k\, \frac{-g'_\tau(f_k - f_{nr})}{f_k - f_{nr}}\, H_{rt}, \qquad \frac{\partial^{-} \hat V_{kt}}{\partial f_{nr}} = a_{nr}\, f_{nr}\, \frac{-g'_\tau(f_k - f_{nr})}{f_k - f_{nr}}\, H_{rt}. \tag{M}$$

And finally, by replacing in equation (H), the corresponding update equations of [1] are obtained:

$$P_0(f_{nr}) = \sum_{k,t} a_{nr}\, H_{rt}\, \frac{-g'_\tau(f_k - f_{nr})}{f_k - f_{nr}} \left[ f_k\, \hat V_{kt}^{\beta - 1} + f_{nr}\, \hat V_{kt}^{\beta - 2}\, V_{kt} \right],$$

$$Q_0(f_{nr}) = \sum_{k,t} a_{nr}\, H_{rt}\, \frac{-g'_\tau(f_k - f_{nr})}{f_k - f_{nr}} \left[ f_k\, \hat V_{kt}^{\beta - 2}\, V_{kt} + f_{nr}\, \hat V_{kt}^{\beta - 1} \right].$$
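The amplitude update derived above can be illustrated with a minimal Python sketch. It makes several simplifying assumptions that are not in the original text: a single note and a single frame with $H_{rt} = 1$, partial frequencies held fixed, a synthetic noiseless spectrum, and a small numerical floor `EPS`. The grid, window size and amplitudes are arbitrary illustration values. It applies $a \leftarrow a \times Q_0(a)/P_0(a)$ with $P_0 = \sum_k g\,\hat V^{\beta-1}$ and $Q_0 = \sum_k g\,\hat V^{\beta-2} V$.

```python
import math

EPS = 1e-12  # numerical floor, an assumption of this sketch


def hann_main_lobe(f, tau):
    """Normalised Hann main lobe g_tau(f); zero outside [-2/tau, 2/tau]."""
    x = f * tau
    if abs(x) >= 2.0:
        return 0.0
    if abs(x) < 1e-12:
        return 1.0                      # limit at f = 0
    if abs(abs(x) - 1.0) < 1e-9:
        return 0.5                      # removable singularity at |f| = 1/tau
    return math.sin(math.pi * x) / (math.pi * x * (1.0 - x * x))


def beta_divergence(x, y, beta):
    """d_beta(x | y) for scalar x, y > 0."""
    if beta == 1:
        return x * (math.log(x) - math.log(y)) + (y - x)
    if beta == 0:
        return x / y - math.log(x / y) - 1.0
    return (x ** beta + (beta - 1) * y ** beta
            - beta * x * y ** (beta - 1)) / (beta * (beta - 1))


def model(amps, lobes):
    """Model spectrum: sum over partials of a_n * g_tau(f_k - f_n), floored at EPS."""
    return [max(sum(a * lobe[k] for a, lobe in zip(amps, lobes)), EPS)
            for k in range(len(lobes[0]))]


def cost(v, v_hat, beta):
    return sum(beta_divergence(x, y, beta) for x, y in zip(v, v_hat))


def update_amplitudes(v, amps, lobes, beta):
    """One multiplicative update a <- a * Q0(a) / P0(a) for every partial."""
    v_hat = model(amps, lobes)
    new_amps = []
    for a, lobe in zip(amps, lobes):
        p = sum(g * y ** (beta - 1) for g, y in zip(lobe, v_hat))
        q = sum(g * y ** (beta - 2) * x for g, y, x in zip(lobe, v_hat, v))
        new_amps.append(a * q / p)
    return new_amps


tau = 0.05                                   # window length in seconds
bin_freqs = [5.0 * k for k in range(201)]    # 0 to 1000 Hz
partial_freqs = [220.0, 260.0, 330.0]        # fixed, overlapping main lobes
lobes = [[hann_main_lobe(fk - fn, tau) for fk in bin_freqs]
         for fn in partial_freqs]
true_amps = [1.0, 0.5, 0.25]
v = model(true_amps, lobes)                  # synthetic observation

beta = 1.0
amps = [0.3, 0.3, 0.3]
costs = [cost(v, model(amps, lobes), beta)]
for _ in range(200):
    amps = update_amplitudes(v, amps, lobes, beta)
    costs.append(cost(v, model(amps, lobes), beta))
```

The amplitudes remain positive throughout because each update multiplies a positive value by a ratio of positive sums. The frequency update follows the same pattern, with the split of $\partial \hat V / \partial f_{nr}$ into its two positive parts replacing the single positive term used here.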

For the Hanning window, the derivative of $g_\tau(f)$ for $f \in [-2/\tau,\, 2/\tau]$ is given by:

$$g'_\tau(f) = \frac{(3\tau^2 f^2 - 1)\sin(\pi\tau f) + \pi\tau f\,(1 - \tau^2 f^2)\cos(\pi\tau f)}{\pi\tau f^2\,(1 - \tau^2 f^2)^2}.$$

2.3.2. $C_1(f_{nr}, \gamma)$ partial derivative decompositions

Derivative with respect to $f_{nr}$, $r \in [1, R]$ and $n \in [1, N_r]$:

$$\frac{\partial C_1}{\partial f_{nr}} = 2\left(f_{nr} - n F_{0r}\sqrt{1 + B_r n^2}\right).$$

Then the corresponding update equations of [1] are directly obtained:

$$\begin{cases} P_1(f_{nr}) = 2 f_{nr}, \\ Q_1(f_{nr}) = 2\, n F_{0r}\sqrt{1 + B_r n^2}. \end{cases} \tag{N}$$

Derivative with respect to $B_r$, $r \in [1, R]$:

$$\frac{\partial C_1}{\partial B_r} = \sum_{n=1}^{N_r} 2\left(f_{nr} - n F_{0r}\sqrt{1 + B_r n^2}\right)\cdot\left(-\frac{n^3 F_{0r}}{2\sqrt{1 + B_r n^2}}\right) = F_{0r}\sum_{n=1}^{N_r}\left(n^4 F_{0r} - \frac{n^3 f_{nr}}{\sqrt{1 + B_r n^2}}\right).$$

Thus, the corresponding update equations of [1] are obtained:

$$P_1(B_r) = F_{0r}\sum_{n=1}^{N_r} n^4, \qquad Q_1(B_r) = \sum_{n=1}^{N_r} \frac{n^3 f_{nr}}{\sqrt{1 + B_r n^2}}. \tag{O}$$

Derivative with respect to $F_{0r}$, $r \in [1, R]$:

$$\frac{\partial C_1}{\partial F_{0r}} = \sum_{n=1}^{N_r} 2\left(f_{nr} - n F_{0r}\sqrt{1 + B_r n^2}\right)\cdot\left(-n\sqrt{1 + B_r n^2}\right). \tag{P}$$

This partial derivative can be cancelled by an exact analytic solution, which corresponds to the update rule for $F_{0r}$ in [1]:

$$F_{0r} = \frac{\sum_{n=1}^{N_r} f_{nr}\, n\sqrt{1 + B_r n^2}}{\sum_{n=1}^{N_r} n^2\,(1 + B_r n^2)}.$$

2.4. Performance evaluations

The results for the estimation of the inharmonicity coefficient have been compared to the state-of-the-art PFD (Partial Frequencies Deviation) algorithm [4, 5], on both synthetic and real piano tones (corresponding to those used in [4]). As suggested in [4], the evaluation is performed on the A[...]-G[...] range ($m \in [...]$).

[Fig. 8: Inharmonicity coefficient ground truth extraction on note G[...] of the Iowa grand piano. (a) Magnitude spectrum (in grey), and selection of the peaks corresponding to transverse vibration of the strings ('+' markers). (b) Results of the estimation of the $B$ and $F_0$ reference values in the plane $f_n/n$ as a function of $n$. '+' markers correspond to manually extracted peaks, and the grey line to the estimated inharmonicity relation.]

2.4.1. Reference extraction

As the ground truth (i.e. the inharmonicity coefficient of each tone) is not directly available (the string dimensions and properties are not known for the real piano, and the synthesis parameters are no longer available for the synthetic tones), the partial frequencies corresponding to transverse vibrations of the strings have been manually extracted, up to a rank of [...], from isolated note magnitude spectra. The spectra have been computed from [...] seconds of decaying sound on [...] frequency bins. When multiple partials are produced by the coupling of a doublet/triplet of strings, several peaks have been considered (as illustrated in figure 8). In the following equation, the frequencies of these peaks are denoted by $f_{r,n,p}$, where $r$ is the index of the note, $n \in [1, N_r]$ the rank of the partial, and $p \in [1, P_{r,n}]$ the index of the peak within that rank. Then, the reference parameters $(\tilde B_r, \tilde F_{0r})$ have been estimated by minimizing the least absolute deviation between the inharmonicity relation and the extracted partial frequencies:

$$(\tilde B_r, \tilde F_{0r}) = \underset{(B_r,\, F_{0r})}{\operatorname{argmin}} \sum_{n=1}^{N_r}\sum_{p=1}^{P_{r,n}} \frac{1}{P_{r,n}}\left| f_{r,n,p} - n F_{0r}\sqrt{1 + B_r n^2} \right|. \tag{Q}$$

The $1/P_{r,n}$ factor is a weighting that allows multiple peaks to be counted as one theoretical partial in the regression. The result of the estimation of $(B, F_0)$ for the G[...] note of the Iowa grand piano is depicted in figure 8(b), which plots $f_n/n$ as a function of $n$.
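The least absolute deviation fit of equation (Q) can be sketched as follows. The optimizer used to obtain the reference values is not specified in this document, so the strategy below is an assumption made for illustration: for a fixed $B$, the optimal $F_0$ is a weighted median (the cost is a weighted sum of $|f/(n\sqrt{1+Bn^2}) - F_0|$), and $B$ is then found by a repeatedly refined grid search on $\log_{10} B$. The helper names, the search bounds and the synthetic test values are all hypothetical.

```python
import math


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weights)
    acc = 0.0
    for value, weight in pairs:
        acc += weight
        if acc >= half:
            return value
    return pairs[-1][0]


def lad_cost(peaks, b, f0):
    """Cost of equation (Q); peaks maps rank n -> list of peak frequencies."""
    return sum(abs(f - n * f0 * math.sqrt(1.0 + b * n * n)) / len(fs)
               for n, fs in peaks.items() for f in fs)


def best_f0(peaks, b):
    """Exact minimiser of the cost in F0 for a fixed B (weighted median)."""
    values, weights = [], []
    for n, fs in peaks.items():
        scale = n * math.sqrt(1.0 + b * n * n)
        for f in fs:
            values.append(f / scale)
            weights.append(scale / len(fs))
    return weighted_median(values, weights)


def fit_inharmonicity(peaks, log_b_min=-6.0, log_b_max=-1.0, steps=200, rounds=4):
    """Grid search on log10(B), refined around the best point at each round."""
    lo, hi = log_b_min, log_b_max
    best = None
    for _ in range(rounds):
        scored = []
        for i in range(steps + 1):
            log_b = lo + (hi - lo) * i / steps
            b = 10.0 ** log_b
            f0 = best_f0(peaks, b)
            scored.append((lad_cost(peaks, b, f0), log_b, f0))
        cost, log_b, f0 = min(scored)
        best = (10.0 ** log_b, f0, cost)
        width = (hi - lo) / steps
        lo, hi = log_b - width, log_b + width
    return best


def synth_peaks(b, f0, n_max):
    """Noise-free partial frequencies, one peak per rank (test data only)."""
    return {n: [n * f0 * math.sqrt(1.0 + b * n * n)] for n in range(1, n_max + 1)}


fit_b, fit_f0, fit_cost = fit_inharmonicity(synth_peaks(3e-4, 110.0, 30))
```

Ranks with several peaks (doublets or triplets) are handled by passing more than one frequency in the list for that rank; the `1/len(fs)` factor plays the role of $1/P_{r,n}$.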

2.4.2. Evaluation results

Finally, the inharmonicity coefficient estimation results for the PFD and NMF algorithms are evaluated in terms of relative error with respect to the reference: $|B_r - \tilde B_r| / \tilde B_r$. These are presented in figure 9 for synthetic and real piano (Iowa grand piano with mf dynamics) samples on the range A[...]-G[...] ($m \in [...]$). Table 3 reports these relative errors averaged for each piano and algorithm, and shows the best performance for the NMF algorithm.

On the G[...]-C8 range ($m \in [[...], 88]$), results are not quantified because of the lack of ground truth and of data for the synthetic signals. However, it can be observed graphically (cf. figure 10) that the NMF estimates seem more consistent with typical values than those obtained by using PFD (which is not optimized for that range). It is worth noting that in the presented experiments, $F_{0r}$ was initialized to equal temperament for the PFD algorithm (this is proposed as an optional input in the code). Subsequently, in order to study the influence of the initialization, we modified the PFD code so that it can take into account the same initialization of $B_r$ as the one we used for the NMF algorithm. No improvement was found for the high pitch range.

[Fig. 9: Relative error (in %) of the inharmonicity coefficient estimation along the range A[...]-G[...] ($m \in [...]$) for the PFD (black) and NMF (grey) algorithms. The evaluation is performed on (a) synthetic and (b) Iowa real piano tones.]

[Table 3: Relative error (in %) of the inharmonicity coefficient estimation, averaged on the range A[...]-G[...], for the PFD and NMF algorithms on synthetic and Iowa real piano tones. Columns: PFD (%), NMF (%). Rows: Synthetic, Iowa. Numerical values [...].]

[Fig. 10: Inharmonicity coefficient estimation ($B$, log scale) along the whole compass of the Iowa grand piano. The black curve corresponds to the PFD estimation and the grey one to NMF.]

3. REFERENCES

[1] François Rigaud, Bertrand David, and Laurent Daudet. A parametric model and estimation techniques for the inharmonicity and tuning of the piano. Journal of the Acoustical Society of America, [...], May [...].

[2] M. Goto, T. Nishimura, H. Hashiguchi, and R. Oka. RWC music database: Music genre database and musical instrument sound database. In ISMIR, pages [...], [...].

[3] Romain Hennequin, Roland Badeau, and Bertrand David. Time-dependent parametric and harmonic templates in non-negative matrix factorization. In Proc. of the 13th Int. Conf. on Digital Audio Effects (DAFx-[...]), September [...].

[4] J. Rauhala, H. M. Lehtonen, and V. Välimäki. Fast automatic inharmonicity estimation algorithm. JASA Express Letters, [...], May [...].

[5] J. Rauhala and V. Välimäki. F0 estimation of inharmonic piano tones using partial frequencies deviation method. In Proc. of International Computer Music Conference (ICMC [...]), pages [...], August [...].
