Last time: small acoustics
- Voice, many instruments, modeled by tubes
- Traveling waves in both directions yield standing waves
- Standing waves correspond to resonances
- Variations from the idealization give the variety of speech sounds, musical timbre

This time: large (room) acoustics
- Resonances still a factor
- Other points about time domain characteristics
- Wave propagation, energy
- Some properties of hearing

Environment source shaping radiation

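Spherical wave equation
- In polar coordinates: $\frac{\partial^2 p}{\partial r^2} + \frac{2}{r}\frac{\partial p}{\partial r} = \frac{1}{c^2}\frac{\partial^2 p}{\partial t^2}$
- A solution is: $p(r, t) = \frac{P_0}{r}\exp[j(\omega t - kr)] = \frac{P_0}{r}\exp[j(2\pi/\lambda)(ct - r)]$
- Where $k = 2\pi/\lambda$ and $\omega = kc = 2\pi c/\lambda$ ($k$ is the wavenumber, $\lambda$ is the wavelength)
- So the magnitude of $p$ is inversely proportional to $r$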

Sound waves
- $c = (331.4 + 0.6T)$ m/s with $T$ in degrees C, or 343.4 m/s at 20 C
- That is about 1127 ft/s: a little less than 1 ms per foot
- Wavelength $\lambda = c/f$
- Actual velocity also depends on humidity
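
As a quick check of these numbers, here is a minimal Python sketch of the temperature formula and the wavelength relation. The function names and the 0.3048 m/ft conversion constant are illustrative choices, not part of the slides.

```python
M_PER_FT = 0.3048  # metres per foot (unit conversion, not from the slides)

def speed_of_sound(temp_c):
    """Approximate speed of sound in air (m/s) at temp_c degrees Celsius."""
    return 331.4 + 0.6 * temp_c

def wavelength(freq_hz, temp_c=20.0):
    """Wavelength in metres, lambda = c / f."""
    return speed_of_sound(temp_c) / freq_hz

c = speed_of_sound(20.0)
print(f"c = {c:.1f} m/s = {c / M_PER_FT:.0f} ft/s")
print(f"travel time per foot = {1000.0 * M_PER_FT / c:.3f} ms")
print(f"wavelength at 1 kHz = {wavelength(1000.0):.3f} m")
```

At 20 C the travel time comes out just under 1 ms per foot, which is the rule of thumb used for reflection delays later in the lecture.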

Intensity
- Definition: sound energy flowing across a unit-area surface per second
- $I = p^2 / (\rho_0 c)$
- Where $p$ is the average (rms) pressure, $\rho_0$ is the medium density, $c$ is the speed of sound (as before), and $\rho_0 c$ is the characteristic impedance
- So, for a spherical wave, $I$ is inversely proportional to $r^2$ (think of the surface area of a sphere)

Sinusoidal Intensity

dB sound levels
- We choose reference values to correspond to the typical threshold of hearing at 1 kHz
- The corresponding dB levels are Intensity Level (IL) and Sound Pressure Level (SPL)
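
The level definitions themselves did not survive on this slide, so the following is a sketch using the conventional definitions rather than the lecturer's exact notation. It assumes the usual reference values of 1e-12 W/m^2 for intensity and 20 micropascals for rms pressure, which the slide does not state explicitly.

```python
import math

# Conventional reference values near the 1 kHz hearing threshold.
# These are assumptions; the slide does not give the numbers.
I_REF = 1e-12   # W/m^2
P_REF = 20e-6   # Pa (rms)

def intensity_level(intensity):
    """Intensity Level in dB: 10 log10(I / I_ref)."""
    return 10.0 * math.log10(intensity / I_REF)

def sound_pressure_level(p_rms):
    """Sound Pressure Level in dB: 20 log10(p / p_ref)."""
    return 20.0 * math.log10(p_rms / P_REF)

print(intensity_level(1e-5))              # an intensity of 1e-5 W/m^2
print(sound_pressure_level(2.0 * P_REF))  # doubling pressure adds about 6 dB
```

The factor of 20 for pressure follows from intensity being proportional to pressure squared, so both levels move together for a plane or spherical wave in air.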

SPL for typical sound sources
- 16 inches distant, hemispherical radiation, about 1 m^2 surface area
- Whispered speech: 1 nW power, 30 dB SPL
- Average speech: 10 µW power, 70 dB SPL
- Loud speech: 200 µW power, 83 dB SPL
- Shouting: 1 mW power, 90 dB SPL
- Ignoring boundaries, SPL would be 6 dB lower for every doubling of distance
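
The arithmetic behind these figures can be sketched as follows. The power values (1 nW, 10 µW, 200 µW, 1 mW) are assumptions chosen because they reproduce the stated levels over a hemisphere of radius 16 inches; the sketch also treats intensity level and SPL as equal, which is a common approximation in air, and uses the conventional 1e-12 W/m^2 reference.

```python
import math

I_REF = 1e-12        # W/m^2, conventional reference (assumed)
M_PER_INCH = 0.0254  # unit conversion

def hemisphere_area(radius_m):
    """Surface area of a hemisphere, 2 * pi * r^2."""
    return 2.0 * math.pi * radius_m ** 2

def level_from_power(power_w, radius_m):
    """Level in dB for a source radiating power_w evenly over a hemisphere."""
    intensity = power_w / hemisphere_area(radius_m)
    return 10.0 * math.log10(intensity / I_REF)

r = 16 * M_PER_INCH
print(f"hemisphere area = {hemisphere_area(r):.2f} m^2")

# Assumed source powers in watts (see lead-in).
sources = {"whispered": 1e-9, "average": 10e-6, "loud": 200e-6, "shouting": 1e-3}
for name, power in sources.items():
    print(f"{name:10s} {level_from_power(power, r):5.1f} dB")

# Doubling the distance quadruples the area, so the level falls by about 6 dB.
drop = level_from_power(10e-6, r) - level_from_power(10e-6, 2.0 * r)
print(f"drop per doubling of distance = {drop:.2f} dB")
```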

SPL is not loudness
- Psychoacoustics -> cube root approximation
- Frequency dependencies
- Weighting curves for measurement

Equal Loudness curves

Room modes
- Same math as for strings, tubes
- Standing waves at resonant frequencies
- Mostly a significant issue at low frequencies (given the large dimensions of rooms)
- Essentially a continuum at high frequencies
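
To make the low-frequency point concrete, here is a sketch that lists the lowest mode frequencies of a room. It assumes an idealized rigid-walled rectangular room, for which the standard formula is f = (c/2) sqrt((nx/Lx)^2 + (ny/Ly)^2 + (nz/Lz)^2); the slides do not give this formula, and the 20 x 24 x 16 ft dimensions are borrowed from the 237 Cory example later in the lecture.

```python
import math
from itertools import product

def mode_frequency(indices, dims, c):
    """Mode frequency of an ideal rigid-walled rectangular room (assumed model)."""
    return 0.5 * c * math.sqrt(sum((n / d) ** 2 for n, d in zip(indices, dims)))

def lowest_modes(dims, c, max_index=3, count=8):
    """Return the `count` lowest (frequency, (nx, ny, nz)) pairs."""
    modes = []
    for indices in product(range(max_index + 1), repeat=3):
        if indices != (0, 0, 0):  # skip the trivial zero-frequency case
            modes.append((mode_frequency(indices, dims, c), indices))
    return sorted(modes)[:count]

dims_ft = (20.0, 24.0, 16.0)  # room dimensions in feet
c_ft = 1127.0                 # speed of sound in ft/s at about 20 C
modes = lowest_modes(dims_ft, c_ft)
for freq, indices in modes:
    print(f"{freq:6.1f} Hz  mode {indices}")
```

The lowest modes are spaced several Hz apart, while the number of modes per Hz grows rapidly with frequency, which is why the response is effectively a continuum higher up.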

Acoustic reverberation
- Reflection vs absorption at room surfaces
- Effects tend to be more important than room modes for speech intelligibility
- Also very important for musical clarity, tone

Energy growth
- Assume a sound source (e.g., noise) with power $W$ turns on at $t = 0$
- Assume the sound energy is uniformly distributed and diffuse
- Let $\bar{\alpha}$ be the average absorption, $S$ be the surface area in the room, and $S\bar{\alpha}$ be the total absorption
- $S\bar{\alpha} = s_1\alpha_1 + s_2\alpha_2 + \dots$
- Intensity $= (W / S\bar{\alpha})\,(1 - \exp[-t/\tau])$, where $\tau = 4V / (c\,S\bar{\alpha})$ ($V$ is the room volume, $c$ is the speed of sound)
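
The growth law can be sketched numerically as below. The room values (200 m^3 volume, 30 m^2 of total absorption, 1 mW source) are hypothetical, chosen only for illustration. The sketch also shows that a 60 dB change takes tau * ln(10^6) seconds, which is where a Sabine-style constant of roughly 0.16 (metric) comes from.

```python
import math

def time_constant(volume, total_absorption, c):
    """tau = 4V / (c * S*alpha_bar), in seconds."""
    return 4.0 * volume / (c * total_absorption)

def intensity_growth(t, power, total_absorption, tau):
    """Intensity t seconds after the source turns on."""
    return (power / total_absorption) * (1.0 - math.exp(-t / tau))

def rt60_from_tau(tau):
    """Time for a 60 dB (factor 1e6) energy change: tau * ln(1e6)."""
    return tau * math.log(1e6)

# Hypothetical room in metric units (not from the slides).
volume = 200.0      # m^3
absorption = 30.0   # m^2, total absorption S*alpha_bar
c = 343.4           # m/s
power = 1e-3        # W

tau = time_constant(volume, absorption, c)
print(f"tau = {1000.0 * tau:.1f} ms, RT60 = {rt60_from_tau(tau):.2f} s")
for t in (0.0, tau, 3.0 * tau, 10.0 * tau):
    print(f"t = {t:.3f} s  I = {intensity_growth(t, power, absorption, tau):.3e} W/m^2")
```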

Steady state intensity
- For large $t$: Intensity $= \text{Power} / (S\bar{\alpha}) = \text{Power} / \text{Total absorption}$

Reverberation time
- When a steady state signal is turned off, overall energy decays roughly exponentially
- RT60 = time to decay by 60 dB from steady state after turnoff
- Sabine formula: RT60 $= 0.049\,V/(S\bar{\alpha})$ in feet, RT60 $= 0.163\,V/(S\bar{\alpha})$ in meters
- Frequency dependent

Air effects
- Additional denominator term: RT60 $= 0.049\,V/(S\bar{\alpha} + 4mV)$, where $m$ is the attenuation constant of air
- Air effects dominate at very high frequencies, negligible at low frequencies

Example: 237 Cory Hall
- Dimensions: 20 x 24 x 16 (feet)
- Volume: 7680 ft^3
- Surface area: 2368 ft^2
- Mid-freq RT60: 1.2 sec (empty)
- We infer an average $\bar{\alpha}$ of .13
- Air absorption less than 10% for 1 kHz, 50% for 4 kHz

Steady state, 237 Cory
- For a 10 µW source, Intensity = Power/($S\bar{\alpha}$) = 3.2 x 10^-8 W/ft^2 = 3.5 x 10^-7 W/m^2
- This is about 55 dB IL, and 120 ms after cutoff, the IL will be about 49 dB (a 6 dB drop, at 60 dB per 1.2 s)
- Will this effect interfere with intelligibility?
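
The 237 Cory numbers can be reproduced with a short calculation. This sketch assumes a mid-frequency RT60 of 1.2 s, a 10 µW source, and a delay of 120 ms after cutoff; these are the values that make the room dimensions and the stated 55 dB and 49 dB levels mutually consistent. The 1e-12 W/m^2 reference intensity is the conventional one.

```python
import math

FT2_PER_M2 = 10.7639  # square feet per square metre
I_REF = 1e-12         # W/m^2, conventional reference (assumed)

length, width, height = 20.0, 24.0, 16.0  # feet
volume = length * width * height          # 7680 ft^3
surface = 2.0 * (length * width + length * height + width * height)  # 2368 ft^2

rt60 = 1.2      # s, assumed mid-frequency value for the empty room
power = 10e-6   # W, assumed source power

# Invert the Sabine formula (feet units) for the average absorption.
alpha_bar = 0.049 * volume / (surface * rt60)

# Steady state intensity = power / total absorption.
intensity_ft2 = power / (surface * alpha_bar)
intensity_m2 = intensity_ft2 * FT2_PER_M2
il_steady = 10.0 * math.log10(intensity_m2 / I_REF)

# After cutoff the level falls linearly in dB at 60 dB per RT60.
delay = 0.120   # s, assumed
il_after = il_steady - (60.0 / rt60) * delay

print(f"alpha_bar = {alpha_bar:.3f}")
print(f"I = {intensity_ft2:.2e} W/ft^2 = {intensity_m2:.2e} W/m^2")
print(f"IL = {il_steady:.1f} dB, {1000 * delay:.0f} ms after cutoff: {il_after:.1f} dB")
```

A 6 dB residue lasting on the order of a syllable onset is what motivates the intelligibility question on the slide.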

Losses at boundaries
- Sound energy reaching a boundary can do 3 things:
  - Reflect
  - Transform into heat
  - Pass through the boundary
- The energy fraction for the last two constitutes the absorption coefficient

Absorption coefficients, common building materials

Material                 125 Hz   500 Hz   2000 Hz
Acoustic paneling        0.16     0.50     0.80
Acoustic plaster         0.30     0.50     0.55
Brick wall, unpainted    0.02     0.03     0.05
Draperies, light         0.04     0.11     0.30
Draperies, heavy         0.10     0.50     0.82
Felt                     0.13     0.56     0.65
Floor, concrete          0.01     0.02     0.02
Floor, wood              0.06     0.06     0.06
Floor, carpeted          0.11     0.37     0.27
Glass                    0.04     0.05     0.05
Marble or glazed tile    0.01     0.01     0.02
Plaster                  0.04     0.05     0.05
Wood paneling, pine      0.10     0.10     0.08
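
As an illustration of how these coefficients combine into a total absorption, the sketch below sums area times coefficient at 500 Hz for a hypothetical surface assignment. The areas are made up for illustration (they add up to 2368 ft^2, the surface area of a 20 x 24 x 16 ft room) and do not describe any real room; the coefficients are taken from the table.

```python
# 500 Hz absorption coefficients copied from the table.
ALPHA_500 = {
    "Floor, carpeted": 0.37,
    "Acoustic paneling": 0.50,
    "Plaster": 0.05,
    "Glass": 0.05,
}

# Hypothetical surface areas in ft^2 (illustrative only).
surfaces = [
    ("Floor, carpeted", 480.0),
    ("Acoustic paneling", 480.0),   # e.g. a treated ceiling
    ("Plaster", 1308.0),
    ("Glass", 100.0),
]

total_area = sum(area for _, area in surfaces)
total_absorption = sum(area * ALPHA_500[name] for name, area in surfaces)
alpha_bar = total_absorption / total_area

volume = 20.0 * 24.0 * 16.0                # ft^3
rt60 = 0.049 * volume / total_absorption   # Sabine formula, feet units

print(f"total absorption = {total_absorption:.1f} ft^2")
print(f"average absorption = {alpha_bar:.3f}")
print(f"Sabine RT60 at 500 Hz = {rt60:.2f} s")
```

Repeating the sum with the 125 Hz or 2000 Hz columns gives a different RT60, which is one way to see the frequency dependence of reverberation time.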

Effect on intelligibility
- Energy is larger than without reverberation
- Decay stretches sounds out in time
- Colorations due to frequency-dependent absorption
- Spectral measures will be different

Intensity level in a live room as produced by successively spoken syllables

The phrase "two oh six" convolved with the impulse response from a .5 second RT60 room

Initial time delay gap = t1

Measuring room responses
- Impulsive sounds
- Correlation of mic input with random signal source (since R(x,y) = R(x,x) * h(t), where * denotes convolution)
- Chirp input
- The measurement also includes the mic and speaker responses
- No single room response (also not really linear)
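
The correlation method can be sketched in a few lines. For a white noise source the autocorrelation R(x,x) is close to an impulse, so the cross-correlation of source and microphone signal approximates h directly. The three-tap "room" below is a hypothetical toy response, and the pure-Python loops are for clarity rather than speed.

```python
import random

def convolve(x, h):
    """Direct-form convolution of two sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def estimate_impulse_response(x, y, n_taps):
    """Cross-correlate source x with output y, normalized by the energy of x."""
    energy = sum(v * v for v in x)
    return [
        sum(x[n] * y[n + k] for n in range(len(x))) / energy
        for k in range(n_taps)
    ]

random.seed(0)
true_h = [1.0, 0.0, 0.5, 0.0, 0.0, 0.25]           # hypothetical toy response
x = [random.gauss(0.0, 1.0) for _ in range(4000)]  # white noise source
y = convolve(x, true_h)                            # simulated mic signal
h_est = estimate_impulse_response(x, y, len(true_h))

for k, (est, true) in enumerate(zip(h_est, true_h)):
    print(f"tap {k}: estimated {est:+.3f}, true {true:+.3f}")
```

The estimate is noisy because a finite noise sequence is only approximately white; longer sources or averaging over repeats reduce the error.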

Effects of reverberation
- Increases loudness
- Early loudness increase helps intelligibility
- Late loudness increase hurts intelligibility
- When noise is present, ill effects are compounded
- Even worse for machine algorithms

Effects of reverberation (2)
- Reverb acts like a smearing of the spectrogram
- Spectrum of the short-term spectral components is the modulation spectrum
- Reverberation acts like a low-pass filter on the modulation spectrum
- Treatment of the modulation spectrum can help ASR

Effects of reverberation: Word error rate for ASR

                     Close-mic'ed speech   RT60 = .55 s, distant mic
Human                0.3%                  0.3%
ASR                  5.9%                  22.2%
ASR w/ treatment     4.7%                  3.0%

Numbers 95 corpus; treatment corresponds to modulation spectral features

Dealing with reverberation
- Microphone arrays - beamforming
- Reducing effects by subtraction/filtering
- Stereo mic transfer function
- Using robust features (for ASR especially)
- Statistical adaptation

Artificial reverberation
- Physical devices (springs, plate, etc.)
- Simple electronic delay with feedback
- FIR for early delays (think of initial time delay gap in concert halls), IIR for later decay
- Explicit convolution with stored response
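
The "simple electronic delay with feedback" item can be sketched as a feedback comb filter, y[n] = x[n] + g y[n - D]. This is one common building block, not necessarily the design the lecture had in mind. The sample rate, delay, and RT60 target below are hypothetical values.

```python
def feedback_comb(x, delay, gain):
    """Feedback comb filter: y[n] = x[n] + gain * y[n - delay]."""
    y = []
    for n, xn in enumerate(x):
        fed_back = gain * y[n - delay] if n >= delay else 0.0
        y.append(xn + fed_back)
    return y

def gain_for_rt60(delay_samples, sample_rate, rt60):
    """Loop gain so that the echo train falls by 60 dB after rt60 seconds.

    Each trip around the loop multiplies amplitude by the gain, and
    rt60 * sample_rate / delay_samples trips must total a factor of 1e-3.
    """
    return 10.0 ** (-3.0 * delay_samples / (sample_rate * rt60))

sample_rate = 8000   # Hz (hypothetical)
delay = 400          # samples, i.e. 50 ms (hypothetical)
gain = gain_for_rt60(delay, sample_rate, rt60=0.5)

impulse = [1.0] + [0.0] * (2 * delay)
response = feedback_comb(impulse, delay, gain)
print(f"gain = {gain:.4f}")
print(f"echoes: {response[0]:.3f}, {response[delay]:.3f}, {response[2 * delay]:.3f}")
```

A single comb gives evenly spaced echoes and a strongly colored spectrum, which is why practical designs combine several delays or fall back on explicit convolution with a stored response.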

Some factors affecting concert hall design (Beranek)
- Reverberation time
- Envelopment (diffusion)
- Clarity

Reverberation time
- 500 Hz/1 kHz average RT60: ~2 s for concert halls, ~1.4 s for opera houses
- Want ~80-90% of that value at 2 kHz/4 kHz ("brilliance")
- Want 110-125% of that value at 125/250 Hz ("warmth")
- Based on subjective assessments of concert masters

Envelopment
- Impression of sound surrounding you
- Requires adequate diffusion (reflections everywhere)

Clarity
- Ratio of energy in early sound to later sound
- $C_{80}$ corresponds to the first 80 ms after the direct sound; typically want 0 to -4 dB
- Typically want a short initial time delay gap, e.g., 5 ms or less
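
A minimal sketch of the clarity measure, computed as 10 log10 of the energy in the first 80 ms over the energy after 80 ms. The test input is an idealized, purely exponential decay with a 2 s RT60 and no discrete reflections, which is a simplifying assumption; a measured impulse response would be used in practice.

```python
import math

def clarity_c80(h, sample_rate, direct_index=0):
    """C80 in dB: early (first 80 ms after the direct sound) over late energy."""
    split = direct_index + int(round(0.080 * sample_rate))
    early = sum(v * v for v in h[direct_index:split])
    late = sum(v * v for v in h[split:])
    return 10.0 * math.log10(early / late)

def exponential_decay(rt60, sample_rate, duration):
    """Idealized amplitude envelope that falls 60 dB in rt60 seconds."""
    n_samples = int(duration * sample_rate)
    return [10.0 ** (-3.0 * (i / sample_rate) / rt60) for i in range(n_samples)]

sample_rate = 1000   # Hz, coarse but enough for an envelope (hypothetical)
h = exponential_decay(rt60=2.0, sample_rate=sample_rate, duration=4.0)
c80 = clarity_c80(h, sample_rate)
print(f"C80 for an ideal 2 s decay = {c80:.2f} dB")
```

For this idealized decay the result falls inside the 0 to -4 dB range quoted on the slide; shorter reverberation times push C80 upward.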

Some Large Acoustics Conclusions
- Frequency-domain resonances are only a part of the effects; time-domain reflections are important
- Short-time reverberation aids in intelligibility, affects spectrum, related to room geometry
- Long-time reverberation acts like exponential decay of signals, can yield overlapping speech
- Factors beyond reverberation time are important for high quality acoustic spaces

What's next
- Homework due Wednesday on chap. 9, 10, 14 questions: provide answers to problems (9.2, 9.4, 10.1, 10.5, 14.4, 14.5) and answer the question: "Describe phase locking in the auditory nerve. Over roughly what frequencies does this take place?"
- Wednesday talk is by Prof. Keith Johnson (linguistics) on speech sound categories