Last time: small acoustics Voice, many instruments, modeled by tubes Traveling waves in both directions yield standing waves Standing waves correspond to resonances Variations from the idealization give the variety of speech sounds, musical timbre
This time: large (room) acoustics Resonances still a factor Other points about time domain characteristics Wave propagation, energy Some properties of hearing
Environment source shaping radiation
Spherical wave equation In polar coordinates: ( 2 p/ r 2 ) + 2/r ( p/ r) = 1/c 2 ( 2 p/ t 2 ) A solution is: p(r, t) = P 0 exp[j( t kr)]/r = P 0 exp[j(2 / )( ct r)]/r Where k = 2 / and = kc = 2 c/ (k is the wavenumber, is the wavelength) So p is inversely proportional to r
Sound waves c = (331.4 + 0.6T) m/s, or 343.4 m/s at 20 C 27 ft/s A little less than ms per foot Wavelength = c/f Actual velocity also depends on humidity
Intensity Definition: sound energy flowing across a unit area surface in a second I = p 2 / ( 0 c) Where p is the average pressure, 0 is the medium density, c is the speed of sound (as before), and 0 c is the characteristic impedance. So, for spherical wave, I is inversely proportional to r 2 (think of surface area for sphere)
Sinusoidal Intensity
db sound levels We choose ref values to correspond to typical threshold of hearing at khz And the corresponding db levels are Intensity Level (IL) and Sound Pressure Level (SPL)
SPL for typical sound sources 16 inch distant, hemispherical radiation, about m 2 surface area Whispered speech: nw power, 30 db SPL Average speech: 0 W power, 70 db SPL Loud speech: 200 power, 83 db SPL Shouting: mw power, 90 db SPL Ignoring boundaries, SPL would be 6 db lower for every doubling of distance
SPL is not loudness Psychoacoustics -> cube root approximation Frequency dependencies Weighting curves for measurement
Equal Loudness curves
Room modes Same math as for strings, tubes Standing waves at resonant frequencies Mostly a significant issue at low frequencies (given the large dimensions of rooms) Essentially a continuum at high frequencies
Acoustic reverberation Reflection vs absorption at room surfaces Effects tend to be more important than room modes for speech intelligibility Also very important for musical clarity, tone
Energy growth Assume sound source (e.g., noise) with power W turns on at t = 0 Assume it is uniformly distributed and diffuse _ Let be the average absorption, _ S be the surface area in the room, and S be the total absorption _ S = s + s 2 2 + _ Intensity = (W/S )( - exp [-t/ ]) where _ = 4V / cs (V is the room volume, c is speed of sound)
Steady state intensity for large t _ Intensity = Power/S = Power / Total absorption
Reverberation time When steady state signal is turned off, overall energy decays roughly exponentially Time to decay by 60 db from steady state after turnoff = RT60 Sabine formula, RT60 = 0.049V/S in feet, RT60 = 0. 63V/S in meters Frequency dependent
Air effects Additional denominator term RT60 = 0.049V/(S + 4 mv) Air effects dominate at very high frequencies, negligible at low frequencies
Example: 237 Cory Hall Dimensions: 20 x 24 x 16 (feet) Volume: 7680 ft 3 Surface area: 2368 ft 2 Mid-freq RT60:.2 sec (empty) We infer average of. 3 Air absorption less than 10% for khz, 50% for 4 khz
Steady state, 237 Cory For a 0 W source, Intensity = Power/(S ) = 3.2 x 0-8 W/ft 2 = 3.5 x 0-7 W/m 2 This is about 55 db IL, and 20 ms after cutoff, the IL will be about 49 db. Will this effect interfere with intelligibility?
Losses at boundaries Sound energy reaching boundary can do 3 things: Reflect Transform into heat Pass through boundary The energy fraction for the last two constitutes the absorption coefficient
Absorption coefficients, common building materials 125 Hz 500 Hz 2000 Hz Material Acoustic paneling 0.16 0.50 0.80 Acoustic plaster 0.30 0.50 0.55 Brick wall, unpainted 0.02 0.03 0.05 Draperies, light 0.04 0.11 0.30 Draperies, heavy 0.10 0.50 0.82 Felt 0.13 0.56 0.65 Floor, concrete 0.01 0.02 0.02 Floor, wood 0.06 0.06 0.06 Floor, carpeted 0.11 0.37 0.27 Glass 0.04 0.05 0.05 Marble or glazed tile 0.01 0.01 0.02 Plaster 0.04 0.05 0.05 Wood paneling, pine 0.10 0.10 0.08
Effect on intelligibility Energy is larger than without reverberation Decay stretches sounds out in time Colorations due to frequency-dependent absorption Spectral measures will be different
Intensity level in a live room as produced by successively spoke syllables
The phrase two oh six convolved with impulse response from.5 second RT60 room
Initial time delay gap = t1
Measuring room responses Impulsive sounds Correlation of mic input with random signal source (since R(x,y) = R(x,x) * h(t) ) Chirp input Also includes mic, speaker responses No single room response (also not really linear)
Effects of reverberation Increases loudness Early loudness increase helps intelligibility Late loudness increase hurts intelligibility When noise is present, ill effects compounded Even worse for machine algorithms
Effects of reverberation (2) Reverb acts like a smearing of the spectrogram Spectrum of the short-term spectral components is the modulation spectrum Reverberation acts like a low-pass filter on the modulation spectrum Treatment of the modulation spectrum can help ASR
Effects of reverberation: Word error rate for ASR Close-mic ed speech Human 0.3% 0.3% ASR 5.9% 22.2% ASR w/ treatment 4.7% 3.0% RT60 =.55s, distance mic Numbers 95 corpus; treatment corresponds to modulation spectral features
Dealing with reverberation Microphone arrays - beamforming Reducing effects by subtraction/filtering Stereo mic transfer function Using robust features (for ASR especially) Statistical adaptation
Artificial reverberation Physical devices (springs, plate, etc.) Simple electronic delay with feedback FIR for early delays (think of initial time delay gap in concert halls), IIR for later decay Explicit convolution with stored response
Some factors affecting concert hall design (Beranek) Reverberation time Envelopment (diffusion) Clarity
Reverberation time 500Hz/ khz average, RT60, ~2 s for concert halls ~.4 s for opera houses want ~80-90% at 2 khz/4 khz ( brilliance ) want 0-25% at 25/250 Hz ( warmth ) Based on subjective assessments of concert masters
Envelopment Impression of sound surrounding you Requires adequate diffusion (reflections everywhere)
Clarity Ratio of energy in early sound to later sound C 80 corresponds to first 80 ms after direct; typically want 0 to -4 db Typically want a short initial time delay gap e.g., 5 ms or less
Some Large Acoustics Conclusions Frequency-domain resonances only a part of the effects; time-domain reflections important Short-time reverberation aids in intelligibility, affects spectrum, related to room geometry Long-time reverberation acts like exponential decay of signals, can yield overlapping speech Factors beyond reverberation time are important for high quality acoustic spaces
What s next: Homework due Wednesday on chap. 9, 0, 4 questions : Provide answers to problems (9.2, 9.4, 10.1, 10.5, 14.4, 14.5) and answer the question: "Describe phase locking in the auditory nerve. Over roughly what frequencies does this take place? " Wednesday talk is by Prof. Keith Johnson (linguistics) on speech sound categories