Reverberation Impulse Response Analysis

MUS424: Signal Processing Techniques for Digital Audio Effects
Handout #18
Jonathan Abel, David Berners
April 27, 24

Lecture #9: April 27, 24
Lecture Notes 7

[Figure 1: CCRMA Lobby Impulse Response showing direct path, early reflections, and late-field reverberation components. Horizontal axis: time in milliseconds.]

Reverberation Impulse Response Components

Direct Path. The direct path is the arrival of source signals along a straight-line path from the source to the listener. The arrival time and amplitude are fixed by the source-listener distance; filtering that depends on the source direction and orientation is also present.

Early Reflections. Typically, a small number of specular reflections from environment surfaces arrive at the listener well separated in time or amplitude from other reflected energy. These so-called early reflections convey a sense of the environment geometry and size.

Late-Field Reverberation. After a period of time, source signals have interacted sufficiently with room objects and surfaces that the density of arrivals is great enough to be indistinguishable from noise. This late-field reverberation is further characterized by a frequency-dependent exponential decay determined by the environment size and the materials present.

Sabine Theory

Consider a space with volume $V$ and the following properties:
- perfectly reflecting surfaces
- non-absorbing air
- a window of area $A$

Consider what happens after the energy is thoroughly mixed, so that the energy density $w(t, x)$, initially a function of time $t$ and position $x$, is a function of time only. What is $w(t)$?

Room Mixing Time

To generate $m$ reflections, roughly speaking, source signals must propagate long enough that a sphere with radius equal to the propagation distance contains roughly $m$ times the room volume,

$$\frac{4}{3}\pi r_{\text{mix}}^3 = mV.$$

If $m$ reflections are needed to thoroughly mix a room, then the mixing time $\tau_{\text{mix}}$ is roughly

$$\tau_{\text{mix}} = r_{\text{mix}}/c, \qquad r_{\text{mix}} = \left(\frac{3}{4\pi}\, mV\right)^{1/3},$$

where $c$ is the speed of sound in air. For $m = 1$, typical dimension $V^{1/3} = 1$ meters, the mixing time is roughly $\tau_{\text{mix}} \approx 2$ milliseconds.
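The mixing-time estimate is easy to evaluate numerically. The following is a minimal sketch rather than part of the handout: the function name `mixing_time`, the default speed of sound of 343 m/s, and the example room size are assumptions chosen for illustration.

```python
import math

def mixing_time(volume_m3, m_reflections=1.0, c=343.0):
    """Rough mixing time in seconds: radius of a sphere holding
    m_reflections room volumes, divided by the speed of sound c (m/s)."""
    r_mix = (3.0 * m_reflections * volume_m3 / (4.0 * math.pi)) ** (1.0 / 3.0)
    return r_mix / c

# Illustrative values (assumptions): a room with V^(1/3) = 10 m and m = 1.
tau_mix = mixing_time(volume_m3=10.0 ** 3, m_reflections=1.0)
print(f"mixing time: {1000.0 * tau_mix:.1f} ms")
```

Because the radius grows as the cube root of $mV$, doubling the linear room dimension doubles the estimated mixing time.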

Energy Change during $\Delta t$

Compute the change in total energy during a small interval of time, $\Delta t$, assuming the energy density is constant throughout the volume $V$. The total energy at time $t + \Delta t$ is the energy density scaled by the volume, $V w(t + \Delta t)$. The total energy at time $t$ is similarly $V w(t)$. The difference is the energy $\gamma$ that leaves through the aperture,

$$V w(t + \Delta t) = V w(t) - \gamma(w(t), A, \Delta t).$$

Computing $\gamma$: The Two-Dimensional Case

Consider the 2D case. A small volume of energy a distance $d$ from the aperture, traveling at angle $\phi$ relative to the aperture normal, will escape during $\Delta t$ if its direction of travel takes it through the aperture and

$$d \le c\,\Delta t \cos\phi,$$

where $c$ is the speed of sound. The total energy leaving along $\phi$ is then

$$w(t)\, A\, c\, \Delta t \cos\phi.$$

Computing $\gamma$: The Two-Dimensional Case

Under the assumption that the room energy is thoroughly mixed, the probability density that a given bit of energy is traveling along $\phi$ is uniform,

$$p(\phi) = \frac{1}{2\pi}.$$

The energy lost during $\Delta t$ is

$$\gamma(w(t), A, \Delta t) = \int_{-\pi/2}^{\pi/2} w(t)\, A\, c\, \Delta t \cos\phi \; p(\phi)\, d\phi = \frac{1}{\pi}\, w(t)\, A\, c\, \Delta t.$$

Computing $\gamma$: Circular Aperture

The energy lost during $\Delta t$ traveling along direction $(\phi, \theta)$ is $w(t)\, A\, c\, \Delta t \cos\phi$. As a result, the total energy lost during $\Delta t$ is

$$\gamma(w(t), A, \Delta t) = \int_{0}^{\pi/2} \int_{0}^{2\pi} w(t)\, A\, c\, \Delta t \cos\phi \; p(\phi, \theta)\, d\theta\, d\phi,$$

where $p(\phi, \theta)$ is the probability density that any bit of energy is traveling along direction $(\phi, \theta)$.

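Computing $\gamma$: Circular Aperture

Assuming all directions on the sphere are equally likely,

$$p(\phi, \theta) = \frac{\sin\phi}{4\pi}.$$

The energy lost during $\Delta t$ is therefore

$$\gamma(w(t), A, \Delta t) = \frac{w(t)\, A\, c\, \Delta t}{4\pi} \int_{0}^{\pi/2} \int_{0}^{2\pi} \cos\phi \sin\phi \, d\theta\, d\phi = \frac{w(t)\, A\, c\, \Delta t}{2} \int_{0}^{\pi/2} \cos\phi \sin\phi \, d\phi = \frac{w(t)\, A\, c\, \Delta t}{4}.$$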
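The two angular integrals can be checked numerically. The sketch below uses a simple midpoint rule; the function names and the number of integration steps are illustrative choices, not part of the handout. It should return values close to $1/\pi$ for the two-dimensional case and $1/4$ for the circular aperture.

```python
import math

def geometric_factor_2d(n=10000):
    """Midpoint-rule integral of cos(phi) / (2 pi) over phi in [-pi/2, pi/2]."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        phi = -math.pi / 2.0 + (k + 0.5) * h
        total += math.cos(phi) / (2.0 * math.pi) * h
    return total

def geometric_factor_3d(n=10000):
    """Integral of cos(phi) sin(phi) / (4 pi) over phi in [0, pi/2]
    and theta in [0, 2 pi]; the theta integral contributes a factor 2 pi."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * h
        total += math.cos(phi) * math.sin(phi) * h
    return 2.0 * math.pi * total / (4.0 * math.pi)

g_2d = geometric_factor_2d()
g_3d = geometric_factor_3d()
print(g_2d, g_3d)
```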

Energy Density Behavior

[Figure: $w_0 \exp(-t/\tau)$ plotted against time / decay time.]

By conservation of energy,

$$V w(t + \Delta t) = V w(t) - g\, c\, A\, w(t)\, \Delta t,$$

where $g$ is a geometric factor. Rearranging terms,

$$w(t + \Delta t) - w(t) = -\frac{g\, c\, A\, \Delta t}{V}\, w(t),$$

and taking the limit $\Delta t \to 0$, we have

$$\frac{dw}{dt} = -\frac{1}{\tau}\, w(t), \qquad \tau = \frac{V}{g\, c\, A},$$

where $\tau$ is the so-called characteristic decay time. For a well-mixed room with $w(t = 0) = w_0$,

$$w(t) = w_0\, e^{-t/\tau}.$$
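To see how the discrete energy balance approaches the exponential solution, one can step the difference equation directly. This is a minimal sketch; `simulate_decay` and the numerical values of the decay time and step size are hypothetical, chosen only for illustration.

```python
import math

def simulate_decay(w0, tau, dt, n_steps):
    """Step w(t + dt) = w(t) - (dt / tau) * w(t), the discrete energy balance
    with tau standing for V / (g c A)."""
    w = [w0]
    for _ in range(n_steps):
        w.append(w[-1] - (dt / tau) * w[-1])
    return w

tau = 0.1      # characteristic decay time in seconds (illustrative)
dt = 1e-5      # time step in seconds, much smaller than tau
n = 30000      # 0.3 seconds, i.e. three decay times
w_sim = simulate_decay(1.0, tau, dt, n)
w_exact = math.exp(-n * dt / tau)
print(w_sim[-1], w_exact)
```

The agreement improves as the step size shrinks relative to the decay time, which is the limit taken in the derivation.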

Energy Density Behavior

[Figure: energy density in dB, $10 \log_{10}\!\left(w_0 \exp(2 \ln(0.001)\, t / T_{60})\right)$, plotted against time / $T_{60}$.]

The energy density follows an exponential decay to zero,

$$w(t) = w_0\, e^{-t/\tau}, \qquad \tau = \frac{V}{g\, c\, A}.$$

The time constant is proportional to room volume, and inversely proportional to window area.

The reverberation time, or $T_{60}$, is the time it takes the energy density to decay to 60 dB below its initial value. We have

$$w(t) = w_0\, e^{-t/\tau} = w_0\, e^{2 \ln(0.001)\, t / T_{60}},$$

so that

$$T_{60} = -2 \ln(0.001)\, \frac{V}{g\, c\, A} \approx (0.161\ \text{s/m})\, \frac{V}{A}.$$
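The reverberation-time formula can be evaluated directly. In the sketch below, the function name `sabine_t60`, the speed of sound of 343 m/s, and the example room are assumptions; the geometric factor of 1/4 is the value obtained for the circular aperture.

```python
import math

def sabine_t60(volume_m3, window_area_m2, c=343.0, g=0.25):
    """Time in seconds for the energy density to fall 60 dB, given the
    room volume and the open-window area."""
    return -2.0 * math.log(0.001) * volume_m3 / (g * c * window_area_m2)

# The constant multiplying V / A, in seconds per meter.
sabine_constant = -2.0 * math.log(0.001) / (0.25 * 343.0)

# Illustrative room: 1000 cubic meters with 100 square meters of open window.
t60 = sabine_t60(1000.0, 100.0)
print(f"constant = {sabine_constant:.3f} s/m, T60 = {t60:.2f} s")
```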

Multiple Windows

In the presence of multiple noninteracting windows, the energy losses add,

$$V w(t + \Delta t) = V w(t) - \gamma(w(t), \Delta t, A_1, \ldots, A_N),$$

and

$$\gamma = \Big(\sum_i A_i\Big)\, g\, c\, w(t)\, \Delta t.$$

The reverberation time is accordingly reduced,

$$T_{60}(A_i, i = 1, \ldots, N) = -2 \ln(0.001)\, \frac{V}{g\, c \sum_i A_i}.$$

Materials Patches vs. Windows

Absorption Coefficients, S

material          125 Hz  250 Hz  500 Hz  1000 Hz  2000 Hz  4000 Hz
marble             0.01    0.01    0.01    0.01     0.02     0.02
brick              0.03    0.03    0.03    0.04     0.05     0.07
concrete block     0.36    0.44    0.31    0.24     0.39     0.25
plywood            0.28    0.22    0.17    0.09     0.10     0.11
cork               0.14    0.25    0.40    0.25     0.34     0.21
glass window       0.35    0.25    0.18    0.12     0.07     0.04
drapery            0.10    0.25    0.46    0.60     0.56     0.52
carpet             0.02    0.06    0.14    0.37     0.66     0.65
hardwood           0.15    0.11    0.10    0.07     0.06     0.07
grass              0.11    0.26    0.60    0.69     0.92     0.94

Room surfaces typically absorb a portion of impinging acoustic energy. The portion depends on the surface material and is a function of frequency. To compute the reverberation time associated with a patch of material, its area $a$ is scaled by its absorbing power $S$,

$$A_{\text{effective}} = a\, S.$$

Note that $S = 1$ for an open window, and $S = 0$ for a perfectly reflecting wall. The reverberation time is then

$$T_{60} = -2 \ln(0.001)\, \frac{V}{g\, c \sum_i a_i S_i}.$$
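A per-band reverberation time follows by summing effective areas in each frequency band. The sketch below is illustrative only: the function name `t60_per_band`, the hypothetical 5 m by 4 m by 3 m room, the speed of sound, and the coefficient lists are all assumptions used as placeholders.

```python
import math

BANDS_HZ = [125, 250, 500, 1000, 2000, 4000]

def t60_per_band(volume_m3, surfaces, c=343.0, g=0.25):
    """surfaces: list of (area_m2, [absorption coefficient per band]).
    Returns one reverberation time in seconds per band."""
    t60 = []
    for b in range(len(BANDS_HZ)):
        a_eff = sum(area * coeffs[b] for area, coeffs in surfaces)
        t60.append(-2.0 * math.log(0.001) * volume_m3 / (g * c * a_eff))
    return t60

# Hypothetical room; coefficients are illustrative placeholders.
volume = 5.0 * 4.0 * 3.0
surfaces = [
    (20.0, [0.02, 0.06, 0.14, 0.37, 0.60, 0.65]),   # absorptive floor covering
    (74.0, [0.03, 0.03, 0.03, 0.04, 0.05, 0.07]),   # hard walls and ceiling
]
t60_bands = t60_per_band(volume, surfaces)
for f, t in zip(BANDS_HZ, t60_bands):
    print(f"{f:5d} Hz: T60 = {t:.2f} s")
```

With these placeholder values the effective area grows with frequency, so the computed reverberation time falls from the lowest band to the highest.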

Frequency-Dependent $T_{60}$

[Figure: $T_{60}$, the 60-dB decay time, for various low-frequency absorption settings. Horizontal axis: frequency in kHz.]

The EMT 140 Plate Reverberator provides control over low-frequency reverberation time.

Air Absorption

[Figure: ANSI standard air absorption at 25 degrees C and 5% relative humidity, plotted against frequency in kHz.]

Air absorbs sound at a frequency-dependent rate proportional to the distance traveled. During a time interval $\Delta t$, every unit volume of energy density experiences the same attenuation, $\exp\{-\alpha(\omega)\, \Delta t\}$.

Air Absorption

In a room of volume $V$ with uniform energy density $w(t)$, the total energy absorbed via air absorption during the time interval $\Delta t$ is

$$\gamma_{\text{air}} = V w(t) \left[1 - \exp\{-\alpha(\omega)\, \Delta t\}\right] \approx V w(t)\, \alpha(\omega)\, \Delta t,$$

assuming $\alpha(\omega)\, \Delta t \ll 1$. Again, by conservation of energy,

$$V w(t + \Delta t) = V w(t) - \gamma_{\text{surfaces}} - \gamma_{\text{air}},$$

and, substituting for the absorbed energy terms,

$$V w(t + \Delta t) = V w(t) - \Big(g\, c \sum_i A_i + V \alpha(\omega)\Big)\, w(t)\, \Delta t.$$

Rearranging, and taking the interval $\Delta t$ to zero,

$$\frac{dw}{dt} = -\left(\frac{g\, c \sum_i A_i}{V} + \alpha(\omega)\right) w(t).$$

Again, we have an exponential decay, this time with characteristic decay time

$$\tau = \frac{V}{g\, c \sum_i A_i + V \alpha(\omega)}.$$

Reverberation Time as a Function of Room Size

Consider a room with volume $V$, absorbing materials with effective areas $A_i$, and characteristic dimension $l = V^{1/3}$. The reverberation time is

$$T_{60}(\omega) = -2 \ln(0.001)\, \frac{1}{g\, c \sum_i A_i / V + \alpha(\omega)} = -2 \ln(0.001)\, \frac{1}{g\, c\, \sigma(\omega)/l + \alpha(\omega)},$$

where $\sigma(\omega)$ is the sum of effective areas, normalized by $l^2$. Note that the reverberation time combines the materials and air reverberation times in the manner of a harmonic mean, as the reciprocal of the sum of their reciprocals,

$$T_{60}(\omega) = \left[\frac{1}{T_{60}^{\text{surfaces}}(\omega)} + \frac{1}{T_{60}^{\text{air}}(\omega)}\right]^{-1}.$$

This means that in any frequency band, the shorter reverberation time dominates. Also, air absorption provides an upper limit on reverberation time as a function of frequency. As the room gets larger, the contribution of the air becomes more important: larger rooms will have longer reverberation times with darker tails.
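The dependence on room size can be explored numerically. In this sketch the function names, the normalized effective area of 0.5, and the air absorption rate of 1.0 per second are hypothetical values picked for illustration, as are the speed of sound and the geometric factor defaults.

```python
import math

K60 = -2.0 * math.log(0.001)   # converts a decay time tau into a 60 dB decay time

def t60_with_air(sigma, l, alpha, c=343.0, g=0.25):
    """Reverberation time for normalized effective area sigma, room
    dimension l in meters, and air absorption rate alpha in 1/s."""
    return K60 / (g * c * sigma / l + alpha)

def parallel_t60(t60_surfaces, t60_air):
    """Reciprocal of the sum of reciprocals of the two reverberation times."""
    return 1.0 / (1.0 / t60_surfaces + 1.0 / t60_air)

# Illustrative values only.
sigma, alpha = 0.5, 1.0
sizes = [2.0, 5.0, 10.0, 50.0, 200.0]
t60s = [t60_with_air(sigma, l, alpha) for l in sizes]
air_limit = K60 / alpha
for l, t in zip(sizes, t60s):
    print(f"l = {l:6.1f} m: T60 = {t:5.2f} s (air limit {air_limit:.2f} s)")
```

As the room dimension grows, the computed reverberation time increases but never exceeds the air-only limit.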

Impulse Response Energy Envelope

[Figure: two panels. Top: impulse response. Bottom: response energy profile, 25-msec. frames. Horizontal axis: time in milliseconds.]

Reverberation impulse responses are nonstationary noise processes with exponentially decaying variances. How do you estimate the variance of a noise process? You average sample variances: for an averaging width $\beta$, the smoothed energy envelope of an impulse response $h(t)$ is

$$P(t; \beta) = \frac{1}{\beta} \sum_{n = t - \beta/2}^{t + \beta/2} h^2(n).$$

Impulse Response Energy Envelope

In computing the energy envelope,

$$P(t; \beta) = \frac{1}{\beta} \sum_{n = t - \beta/2}^{t + \beta/2} h^2(n),$$

it is important to pick the averaging window to be wide enough to suppress noise, but not so wide that the estimate is biased. For a late-field decay,

$$E\{h^2(t)\} = w_0 \exp(-t/\tau).$$

In a neighborhood of $t_0$, $t \in [t_0 - \beta/2,\ t_0 + \beta/2]$, with $\Delta t = t - t_0$, we have

$$\exp(-t/\tau) \approx \exp(-t_0/\tau)\,(1 - \Delta t/\tau),$$

provided $\beta \ll \tau$. Accordingly,

$$E\{P(t; \beta)\} \approx w_0 \exp(-t/\tau)$$

when $\beta \ll \tau$.
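The smoothed energy envelope amounts to a moving average of the squared impulse response. Below is a minimal sketch using NumPy on a synthetic late field; the sampling rate, decay constant, window width, and the function name `energy_envelope` are assumptions, and the synthetic response is exponentially decaying white Gaussian noise rather than a measured room response.

```python
import numpy as np

def energy_envelope(h, beta):
    """Centered moving average of h**2 over beta samples."""
    kernel = np.ones(beta) / beta
    return np.convolve(h ** 2, kernel, mode="same")

rng = np.random.default_rng(0)
fs = 8000                              # sampling rate in Hz (illustrative)
tau = 0.2 * fs                         # energy decay constant in samples
n = np.arange(fs)                      # one second of samples
# The amplitude decays as exp(-n / (2 tau)), so the variance decays as exp(-n / tau).
h = rng.standard_normal(n.size) * np.exp(-n / (2.0 * tau))

P = energy_envelope(h, beta=200)       # 25 ms window, much shorter than tau
expected = np.exp(-n / tau)
print(10.0 * np.log10(P[2000]), 10.0 * np.log10(expected[2000]))
```

A wider window would reduce the fluctuation of the estimate further, at the cost of bias once the width becomes comparable to the decay constant.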

Impulse Response Time-Frequency Analysis

[Figure: two panels plotted against frequency in kHz. Top: response power spectrum, 46-msec. interval between frames. Bottom: Energy Decay Relief.]

- STFT, $P(\omega, t)$.
- Energy Decay Relief (EDR),

  $$\text{EDR}(\omega, t) = \int_t^{\infty} P(\omega, \tau)\, d\tau.$$

- Loudness Spectrogram.
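In discrete time, the energy decay relief can be approximated by summing the power spectrogram backward over frames. The following is a sketch under stated assumptions: the Hann window, frame length, hop size, and function names are illustrative choices, the sum stands in for the integral up to a constant scale factor, and the test signal is synthetic decaying noise.

```python
import numpy as np

def stft_power(h, frame_len, hop):
    """Power spectrogram with rows as frequency bins and columns as frames,
    computed from Hann-windowed frames."""
    window = np.hanning(frame_len)
    starts = range(0, len(h) - frame_len + 1, hop)
    frames = np.stack([h[s:s + frame_len] * window for s in starts], axis=1)
    return np.abs(np.fft.rfft(frames, axis=0)) ** 2

def energy_decay_relief(P):
    """For each bin, the sum of P over the current and all later frames."""
    return np.cumsum(P[:, ::-1], axis=1)[:, ::-1]

rng = np.random.default_rng(0)
n = np.arange(8000)
h = rng.standard_normal(n.size) * np.exp(-n / 2000.0)   # synthetic late field

P = stft_power(h, frame_len=256, hop=128)
edr = energy_decay_relief(P)
edr_db = 10.0 * np.log10(edr / edr[:, :1])              # normalized to the first frame
print(P.shape, edr_db[:, -1].max())
```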

Impulse Response Time-Frequency Analysis

[Figure: response spectra, 25-msec. frames. Axes: power in dB, frequency in Bark.]

Impulse Response Spectrogram Model

[Figure: response power spectrum, Bark-spaced frequencies. Horizontal axis: time in milliseconds.]

The smoothed measured late-field spectrogram is modeled as a nonstationary noise process with mean

$$E\{P_H(\omega, t)\} = q(\omega)^2 \exp(-2t/\tau(\omega)) + P_N(\omega),$$

where $q(\omega)$ is the magnitude of the initial late-field equalization, $\tau(\omega)$ is the late-field decay time constant, and $P_N(\omega)$ is the measurement noise spectrum.

Equalization and Reverberation Time Estimation

[Figure: measured, modeled response energy profile. Horizontal axis: time in milliseconds.]

Estimate $q(\omega)$, $\tau(\omega)$ separately for each frequency. (These estimates form sufficient statistics if they can be accurately estimated.) Note that the dB smoothed power spectrum is linear in the dB initial equalization and in the decay rate $1/\tau(\omega)$. Write the measured spectrum as

$$P_H(\omega, t) = q(\omega)^2 \exp(-2t/\tau(\omega)) + \nu(\omega, t).$$

Denoting by $\bar{P}_H(\omega, t)$ the model $q(\omega)^2 \exp(-2t/\tau(\omega))$,

$$\eta_H(\omega, t) = 10 \log_{10} \left\{ \bar{P}_H(\omega, t) \left[ 1 + \nu(\omega, t)/\bar{P}_H(\omega, t) \right] \right\}.$$

Assuming small measurement noise,

$$\eta_H(\omega, t) \approx 10 \log_{10} \bar{P}_H(\omega, t) + \frac{10}{\ln 10}\, \frac{\nu(\omega, t)}{\bar{P}_H(\omega, t)}.$$

Equalization and Reverberation Time Estimation

Denote by $\theta$ the unknown equalization and decay rate,

$$\theta = \begin{bmatrix} \ln q(\omega) \\ 1/\tau(\omega) \end{bmatrix},$$

and by $\eta_\theta$ the stack of hypothesized band powers for a given equalization and decay rate,

$$\eta_\theta = \begin{bmatrix} \ln q(\omega) - t_0/\tau(\omega) \\ \vdots \\ \ln q(\omega) - t_N/\tau(\omega) \end{bmatrix}.$$

Given noisy measurements

$$\eta(\omega, t_i) = \eta_\theta(\omega, t_i) + \epsilon(\omega, t_i)$$

at a set of times $t_i$, choose the equalization magnitude and decay rate as the ones minimizing the sum of squared equation errors,

$$\hat{\theta} = \arg\min_\theta J(\theta), \qquad J(\theta) = [\eta_\theta - \eta]^\top [\eta_\theta - \eta].$$

Equalization and Reverberation Time Estimation

[Figure: measured, modeled response energy profile. Horizontal axis: time in milliseconds.]

Denoting by $B$ the basis

$$B = \begin{bmatrix} \mathbf{1} & -t \end{bmatrix},$$

where $\mathbf{1}$ is a column of ones and $t$ is the column of measurement times, the equalization and decay rate estimate is

$$\hat{\theta} = (B^\top B)^{-1} B^\top \eta,$$

with the estimated dB energy as a function of time being the projection onto $B$ of the measured dB band power as a function of time,

$$\hat{\eta} = B\hat{\theta} = B (B^\top B)^{-1} B^\top \eta.$$
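The least-squares fit for a single band can be sketched with NumPy as follows. This is an illustration on synthetic data, not the authors' implementation: the function name `fit_decay`, the true parameter values, and the noise level are assumptions, and the band power is expressed in natural-log units so that it matches the stacked model directly.

```python
import numpy as np

def fit_decay(t, eta):
    """Least-squares fit of eta(t) = ln q - t / tau for one band.
    Returns (ln_q, tau)."""
    B = np.column_stack([np.ones_like(t), -t])     # basis: ones and negated times
    theta, *_ = np.linalg.lstsq(B, eta, rcond=None)
    return theta[0], 1.0 / theta[1]

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)                      # measurement times in seconds
# Synthetic log band power: ln q = -1, tau = 0.3 s, plus small equation error.
eta = -1.0 - t / 0.3 + 0.05 * rng.standard_normal(t.size)

ln_q_hat, tau_hat = fit_decay(t, eta)
print(f"ln q = {ln_q_hat:.3f}, tau = {tau_hat:.3f} s")
```

Using `np.linalg.lstsq` avoids forming the inverse of the normal-equations matrix explicitly while giving the same minimizer.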

Estimate Statistics

[Figure: measured, modeled response energy profile. Horizontal axis: time in milliseconds.]

Assuming the model is valid, and the equation error zero mean, the estimate is unbiased with variance

$$\text{Var}\{\hat{\theta}\} = (B^\top B)^{-1} B^\top\, E\left\{ [\eta - \eta_{\hat{\theta}}][\eta - \eta_{\hat{\theta}}]^\top \right\} B (B^\top B)^{-1} = \sigma^2 (B^\top B)^{-1},$$

in the case of an i.i.d. equation error column,

$$E\left\{ [\eta - \eta_{\hat{\theta}}][\eta - \eta_{\hat{\theta}}]^\top \right\} = \sigma^2 I.$$
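The variance expression can be compared against a Monte Carlo experiment. The sketch below assumes i.i.d. Gaussian equation error; the true parameters, noise level, number of trials, and the function name `estimate_covariance` are hypothetical choices for illustration.

```python
import numpy as np

def estimate_covariance(t, sigma):
    """Covariance of the least-squares estimate for i.i.d. equation error
    with standard deviation sigma."""
    B = np.column_stack([np.ones_like(t), -t])
    return sigma ** 2 * np.linalg.inv(B.T @ B)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 40)
sigma = 0.1
B = np.column_stack([np.ones_like(t), -t])
theta_true = np.array([-1.0, 4.0])                 # [ln q, 1 / tau], illustrative

# Repeat the fit on many independent noisy realizations.
trials = np.array([
    np.linalg.lstsq(B, B @ theta_true + sigma * rng.standard_normal(t.size),
                    rcond=None)[0]
    for _ in range(2000)
])
cov_mc = np.cov(trials.T)
cov_theory = estimate_covariance(t, sigma)
print(cov_mc)
print(cov_theory)
```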

Impulse Response Histograms

[Figure: two panels. Top: early impulse response histogram. Bottom: late impulse response histogram. Horizontal axes: response amplitude.]

Any segment of the late field is approximately colored Gaussian noise.

The Central Limit Theorem

The sum of a sufficient number of i.i.d. random variables is approximately Gaussian.

Rationale: The probability density of the sum of two independent random variables is the convolution of their probability densities. Defining the moment generating function $\mu(t)$ of a probability density $p(x)$ as the Fourier transform of the probability density, the sum of $n$ i.i.d. random variables has moment generating function $\mu^n(t)$. Because the probability density is real, positive, and integrates to one, the moment generating function of a zero-mean density with variance $\sigma^2$ is approximately

$$\mu(t) \approx 1 - 2\pi^2 \sigma^2 t^2$$

in a small neighborhood about $t = 0$, taking the Fourier kernel to be $e^{-j 2\pi t x}$. Raising $\mu(t)$ to the $n$th power, we see that it, and therefore its generating density, is approximately Gaussian,

$$\mu(t)^n \approx \left(1 - \frac{2\pi^2 n \sigma^2 t^2}{n}\right)^n \approx \exp\{-2\pi^2 n \sigma^2 t^2\}.$$
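A quick numerical experiment illustrates the tendency toward a Gaussian. In this sketch, the choice of uniform random variables, the number of terms in the sum, and the helper name `fraction_outside_one_sigma` are assumptions. For a single uniform variable, the fraction of samples more than one standard deviation from the mean is about 0.42; for a Gaussian it is about 0.32.

```python
import numpy as np

def fraction_outside_one_sigma(x):
    """Fraction of samples lying more than one standard deviation from the mean."""
    return np.mean(np.abs(x - x.mean()) > x.std())

rng = np.random.default_rng(0)
single = rng.uniform(-1.0, 1.0, size=100_000)
summed = rng.uniform(-1.0, 1.0, size=(100_000, 12)).sum(axis=1)

frac_single = fraction_outside_one_sigma(single)
frac_summed = fraction_outside_one_sigma(summed)
print(frac_single, frac_summed)
```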

Echo Density Profile

[Figure: two panels. Top: impulse response. Bottom: echo density profile, 2-msec. frames. Horizontal axes: time in milliseconds.]

A simple measure of impulse response echo density as a function of time is the percentage of samples in a frame which are outside a standard deviation from the mean. In the late field, this measure is expected to be that of a Gaussian, about 30%. In the presence of early reflections, the measure is expected to be lower, as the energy (and therefore standard deviation) is concentrated in the reflections.
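The echo density measure described above can be sketched in a few lines. The frame length, hop size, function name `echo_density_profile`, and the two synthetic test signals (Gaussian noise standing in for a late field, a few isolated spikes standing in for early reflections) are assumptions for illustration.

```python
import numpy as np

def echo_density_profile(h, frame_len, hop):
    """Per-frame fraction of samples lying more than one standard
    deviation from the frame mean."""
    profile = []
    for start in range(0, len(h) - frame_len + 1, hop):
        frame = h[start:start + frame_len]
        profile.append(np.mean(np.abs(frame - frame.mean()) > frame.std()))
    return np.array(profile)

rng = np.random.default_rng(0)
late = rng.standard_normal(20000)                  # stand-in for a late field
early = np.zeros(1000)                             # stand-in for sparse reflections
early[[50, 300, 420, 700, 910]] = [1.0, -0.7, 0.5, 0.4, -0.3]

late_profile = echo_density_profile(late, frame_len=1000, hop=500)
early_profile = echo_density_profile(early, frame_len=1000, hop=500)
print(late_profile.mean(), early_profile)
```

The sparse signal yields a much lower value than the noise, since only the few spikes exceed a standard deviation that they themselves dominate.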