# The Perception of Sound – Part 1

#### In order to understand the acoustics of any installation, it’s important to understand how sound is perceived and calculated.

A sound source oscillates and causes small pressure fluctuations in the surrounding air, gas or fluid, causing particles to start moving outwards. With the mass and compressibility of the air, the fluctuations are transmitted to the listener’s ear.

The pressure fluctuations are referred to as sound pressure *p*. The sound pressure, which depends on time and space, is superimposed on the static atmospheric pressure *p_{0}*. The sound source radiates a spatially distributed sound field with a different instantaneous sound pressure at each point and each moment. An observed sound event has two main distinguishing attributes: timbre and loudness.

The physical quantity for loudness (or amplitude) is sound pressure *p*, measured in *N/m²*. The physical quantity for timbre is frequency *f*, measured in Hertz (Hz) or cycles per second, *1Hz = 1/s*.

The human hearing range starts at about 16Hz and extends up to 16000Hz or 16kHz. Ultrasound lies above that frequency range and infrasound below it; both are of technical interest, too.

A sound event that can be described by a sine curve in the time domain is called a pure or harmonic tone. A harmonic tone can only rarely be observed in natural sound conditions.

| Sound pressure *p* (N/m², rms) | Sound pressure level *L* (dB) | Situation |
|---|---|---|
| 2 x 10^{-5} | 0 | hearing threshold |
| 2 x 10^{-4} | 20 | forest, slow winds |
| 2 x 10^{-3} | 40 | library |
| 2 x 10^{-2} | 60 | office |
| 2 x 10^{-1} | 80 | busy street |
| 2 x 10^{0} | 100 | pneumatic hammer, siren |
| 2 x 10^{1} | 120 | jet plane during take-off |
| 2 x 10^{2} | 140 | threshold of pain, hearing loss |

Even the sound of a musical instrument is a superposition of several harmonic tones typical of the instrument. However, an arbitrary sound can be represented as a sum of harmonic tones, each with its own frequency and amplitude.

The frequency components – the acoustic spectrum – of a sound can be extracted through spectral analysis similar to the spectral analysis of light.

In terms of pitch, the human ear perceives the tonal difference within two pairs of tones f_{a1}, f_{a2} and f_{b1}, f_{b2} as equal if the ratio, not the difference, of the frequencies in each pair is equal, that is, if

$$\frac{f_{a1}}{f_{a2}}=\frac{f_{b1}}{f_{b2}}$$

So, for example, we perceive the transition from 100Hz to 125Hz and the transition from 1000Hz to 1250Hz as an equal change in pitch. This relative tonal impression is reflected in the subdivision of the scale into octaves (a doubling of frequency) and other intervals such as second, third, fourth and fifth, which are commonly used in music.
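The ratio rule can be checked numerically. The following is a minimal sketch, assuming pitch intervals are measured in octaves as the base-2 logarithm of the frequency ratio; the function name `interval_in_octaves` is illustrative and not part of the original text.

```python
import math


def interval_in_octaves(f_start: float, f_end: float) -> float:
    """Return the pitch interval between two frequencies in octaves.

    One octave is a doubling of frequency, so the interval is log2 of the ratio.
    """
    if f_start <= 0 or f_end <= 0:
        raise ValueError("frequencies must be positive")
    return math.log2(f_end / f_start)


# Both transitions have the same frequency ratio (1.25), hence the same interval.
low = interval_in_octaves(100.0, 125.0)
high = interval_in_octaves(1000.0, 1250.0)

# A doubling of frequency is exactly one octave.
octave = interval_in_octaves(440.0, 880.0)

print(f"100 -> 125 Hz:   {low:.4f} octaves")
print(f"1000 -> 1250 Hz: {high:.4f} octaves")
print(f"440 -> 880 Hz:   {octave:.4f} octaves")
```

The absolute differences (25Hz and 250Hz) differ by a factor of ten, yet the computed intervals are identical, which matches the perceived equal change in pitch.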

**Weber-Fechner Law**

This behaviour is not limited to tonal impression. For other human senses as well, a stimulus *R* has to be increased by a certain percentage to be perceived as an equal change in perception.

In mathematical terms, the increment of a perception *ΔE* is proportional to the ratio of the absolute increase of the stimulus *ΔR* and the stimulus *R*, so *ΔE = kΔR/R*, where *k* is a proportionality constant.

Moving towards infinitesimally small variations *dE* and *dR*, integration yields *E = 2.3 k lg (R/R_{0})*, where lg is the logarithm to the base 10 and *R_{0}* is the threshold stimulus at which the perception starts. This relation is known as the Weber-Fechner Law.

The perception of loudness also follows the logarithmic Weber-Fechner Law, since the human ear is faced with the task of perceiving very quiet sounds, such as falling leaves in a quiet countryside, as well as very loud sounds, such as the roaring of a nearby waterfall.
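The Weber-Fechner relation can be illustrated with a few lines of code. This is a sketch under stated assumptions: the proportionality constant `k` and the threshold stimulus are set to 1 purely for illustration, and the function name `perception` is hypothetical.

```python
import math


def perception(stimulus: float, threshold: float, k: float = 1.0) -> float:
    """Weber-Fechner law: E = k * ln(R / R0), which equals 2.3 * k * lg(R / R0)."""
    if stimulus <= 0 or threshold <= 0:
        raise ValueError("stimulus and threshold must be positive")
    return k * math.log(stimulus / threshold)


# Raise the stimulus by the same 10 percent in every step (illustrative values).
stimuli = [1.0 * 1.1 ** n for n in range(5)]
levels = [perception(r, threshold=1.0) for r in stimuli]

# The perceived increments between consecutive steps are all equal.
steps = [later - earlier for earlier, later in zip(levels, levels[1:])]
print(steps)
```

Each step multiplies the stimulus by the same factor, so every perceived increment comes out as the same value, k times ln(1.1).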

Indeed, humans are able to perceive sound pressures from $20\times 10^{-6}\,\mathrm{N/m^{2}}$ to approximately *200 N/m²*, where the upper limit is the human pain threshold.

Human hearing covers about seven orders of magnitude of loudness, which is an exceedingly large physical interval. It is therefore handy to use a logarithmic measure when quantifying sound pressure technically, instead of the physical sound pressure itself.

The sound pressure level *L* is defined as:

$$L=20\,\lg\left(\frac{p}{p_{0}}\right)=10\,\lg\left(\frac{p}{p_{0}}\right)^{2}$$

where

$p_{0}=20\times 10^{-6}\,\mathrm{N/m^{2}}$

The reference value p_{0} corresponds to the hearing threshold at a frequency of *1kHz* as the hearing threshold depends on the frequency.

The specification decibel *dB* is not a unit but indicates the use of the logarithm to the base 10. The factor 20 is chosen in a way that corresponds to our perception – if two sounds differ by *1dB*, we just perceive a difference in loudness.

Assigning sound pressure levels maps the range of sound pressure covering seven powers of ten to a scale from 0 to 140dB.
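As a minimal sketch of this mapping, the definition of the sound pressure level can be written as a pair of conversion functions. The names `sound_pressure_level`, `sound_pressure` and `P_REF` are illustrative choices, not taken from the text; the reference pressure is the value of p_{0} given above.

```python
import math

P_REF = 20e-6  # reference sound pressure p0 in N/m² (hearing threshold at 1 kHz)


def sound_pressure_level(p_rms: float) -> float:
    """Convert an rms sound pressure in N/m² to a sound pressure level in dB."""
    if p_rms <= 0:
        raise ValueError("sound pressure must be positive")
    return 20 * math.log10(p_rms / P_REF)


def sound_pressure(level_db: float) -> float:
    """Convert a sound pressure level in dB back to an rms sound pressure in N/m²."""
    return P_REF * 10 ** (level_db / 20)


# Walk through the seven powers of ten between hearing threshold and pain threshold.
for exponent in range(-5, 3):
    p = 2 * 10.0 ** exponent
    print(f"{p:8.0e} N/m²  ->  {sound_pressure_level(p):5.1f} dB")
```

Each factor of ten in sound pressure adds 20dB, so the loop steps from the hearing threshold up to the pain threshold in increments of 20dB.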

It is remarkable that even the sound pressure of 200N/m², which corresponds to the sound pressure level at the pain threshold, is still only a small fraction (1/500) of the static atmospheric pressure of about 10^{5}N/m².


Often, several sound pressure levels have to be combined into one. Signals originating from sources that are independent of each other, such as different technical devices or machines or different speakers, are called incoherent signals. It is nearly always justified to assume incoherent signals.

It can be shown that the relevant squared root-mean-square value (rms-value) of the total signal is the sum of the individual squared rms-values. That is, the squared rms-value of *N* incoherent signals is given by

$$p_{eff}^{2}=\sum_{i=1}^{N}p_{eff,i}^{2}$$

We gain the formula of level summation by expressing all N squared rms-values by their sound pressure levels L_{i}:

$$L_{tot}=10\,\lg\left(\frac{p_{eff}}{p_{0}}\right)^{2}=10\,\lg\sum_{i=1}^{N}10^{\frac{L_{i}}{10}}$$

This formula means that the levels are in fact not summed. Instead, the individual levels must be transformed to the squared rms-values of the sound pressures. These then add up to give the total squared rms-value, and only then can we calculate the total sound pressure level L_{tot}.
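The level summation of incoherent signals can be sketched as follows. The function name `sum_levels` and the example levels are illustrative assumptions; the computation itself is the formula for L_{tot} given above.

```python
import math


def sum_levels(levels_db):
    """Total level of incoherent sources: 10 * lg(sum(10 ** (L_i / 10)))."""
    levels = list(levels_db)
    if not levels:
        raise ValueError("at least one level is required")
    # Convert each level to a relative squared rms pressure, add, convert back.
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels))


# Two equally loud incoherent sources: the total rises by about 3 dB, not to 120 dB.
two_equal = sum_levels([60.0, 60.0])

# A source 20 dB below another one barely changes the total.
dominated = sum_levels([80.0, 60.0])

print(f"60 dB + 60 dB = {two_equal:.2f} dB")
print(f"80 dB + 60 dB = {dominated:.2f} dB")
```

The two cases show why levels cannot simply be added: doubling the squared rms-value adds 10 lg 2, roughly 3dB, and a much quieter source contributes almost nothing to the total.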

For example, at a point of sound emission, there is already a sound pressure level of 50dB. How high a sound pressure level L_{p} can be added if the overall sound pressure level is not to exceed 55dB?

With:

$10^{\frac{L_{tot}}{10}}=10^{5.5}=10^{\frac{L_{p}}{10}}+10^{5},$

we can add

$L_{p}=10\,\lg(10^{5.5}-10^{5})=53.3\,\mathrm{dB}$
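Solving the level summation formula for one unknown level gives L_{p} = 10 lg(10^{L_{tot}/10} - 10^{L_{1}/10}), where L_{1} is the level already present. A minimal sketch of this calculation, with the hypothetical function name `allowed_additional_level`:

```python
import math


def allowed_additional_level(limit_db: float, existing_db: float) -> float:
    """Highest level that can be added to existing_db without exceeding limit_db.

    Assumes the existing and the added signal are incoherent.
    """
    if existing_db >= limit_db:
        raise ValueError("existing level already reaches the limit")
    remaining = 10 ** (limit_db / 10) - 10 ** (existing_db / 10)
    return 10 * math.log10(remaining)


l_p = allowed_additional_level(limit_db=55.0, existing_db=50.0)
print(f"Allowed additional level: {l_p:.1f} dB")

# Cross-check: summing both levels again reproduces the 55 dB limit.
total = 10 * math.log10(10 ** (l_p / 10) + 10 ** (50.0 / 10))
print(f"Resulting total level: {total:.1f} dB")
```

The cross-check at the end converts both levels back to squared rms-values and sums them, which is a simple way to validate such a subtraction.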

**Octave and Third-Octave Band Filters**

Measurements of the spectral components of time domain signals are often realised using filters where the frequency range is subdivided into intervals. The filters are electronic circuits which let a supplied voltage pass only in a specific frequency band.

The filter is characterised by its lower and upper limiting frequencies f_{l} and f_{u}, its bandwidth

$\Delta f=f_{u}-f_{l},$

and its centre frequency f_{c}.

For acoustic purposes, filters with constant relative bandwidth are used. Their bandwidth is proportional to the centre frequency of the filter, so the bandwidth increases with increasing centre frequency. The centre frequency f_{c} of filters with constant relative bandwidth is the geometric mean of f_{l} and f_{u}, determined by

$f_{c}=\sqrt{f_{l}f_{u}}$

The octave band filter and the third-octave filter are the main filters with a constant relative bandwidth. The octave bandwidth f_{u} = 2f_{l} results in

$f_{c}=\sqrt{2}\,f_{l}\text{ and }\Delta f=f_{l}$

The third-octave bandwidth

$f_{u}=\sqrt[3]{2}\,f_{l}=1.26f_{l}$

results in

$f_{c}=\sqrt[6]{2}\,f_{l}=1.12f_{l}\text{ and }\Delta f=0.26f_{l}$

Three adjacent third-octave band filters form an octave-band filter since

$\sqrt[3]{2}\sqrt[3]{2}\sqrt[3]{2}=2$
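The band limits of both filter types follow directly from the centre frequency. The sketch below assumes exact base-2 band ratios (f_{u}/f_{l} = 2 for octaves, the cube root of 2 for third-octaves); the function name `band_limits` and its parameter `bands_per_octave` are illustrative, not from the text.

```python
import math


def band_limits(f_centre: float, bands_per_octave: int = 1):
    """Return (f_l, f_u) of a constant-relative-bandwidth filter.

    The band ratio is f_u / f_l = 2 ** (1 / bands_per_octave), and the centre
    frequency is the geometric mean of the limits, f_c = sqrt(f_l * f_u).
    """
    if f_centre <= 0 or bands_per_octave < 1:
        raise ValueError("f_centre must be positive and bands_per_octave >= 1")
    half_step = 2 ** (1 / (2 * bands_per_octave))
    return f_centre / half_step, f_centre * half_step


octave_l, octave_u = band_limits(1000.0, bands_per_octave=1)
third_l, third_u = band_limits(1000.0, bands_per_octave=3)

print(f"octave band at 1 kHz:       {octave_l:.1f} Hz to {octave_u:.1f} Hz")
print(f"third-octave band at 1 kHz: {third_l:.1f} Hz to {third_u:.1f} Hz")
print(f"relative bandwidth (third): {(third_u - third_l) / third_l:.3f}")
```

Dividing and multiplying the centre frequency by the same factor keeps it at the geometric mean of the two limits, whatever the band ratio.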

The limiting frequencies are standardised in EN 60651 and EN 60652. The centre frequencies of the third-octave filters are:

f_{c} = (1, 1.25, 1.6, 2, 2.5, 3.15, 4, 5, 6.3, 8) x 10^{i}

The centre frequencies of the octave filters within the hearing range are 16Hz, 31.5Hz, 63Hz, 125Hz, 250Hz, 500Hz, 1kHz, 2kHz, 4kHz, 8kHz and 16kHz.

When measuring sound levels one must state which filters were used during the measurement. By using level summation, the levels of broader frequency bands may be calculated. The linear level is often given because it contains all attributes of the frequency range between 16Hz and 20kHz and can be either measured directly or determined by level summation.

### **Hearing Levels and A-weighting**

The sensitivity of the human ear depends on the tonal pitch. For example, a 100Hz tone of 70dB and a 1000Hz tone of 60dB are perceived with equal loudness. The ear is more sensitive in the middle frequency range than at very high or very low frequencies.

To account for this, a frequency-weighted sound pressure level is used: the so-called A-weighted sound pressure level, expressed in dB(A). It captures the basic aspects of the human ear’s sensitivity and can be realised with reasonable effort. It is measured using the A-filter.

The A-filter roughly represents the inverse of the hearing level curve with 30dB at 1kHz.

The A-weighting function is standardised in EN 60651. For certain noise problems, especially for vehicle and aircraft noise, there are also the less common weighting functions B, C and D. The A-weighted sound pressure level of a broader interval can again be determined by adding up appropriate third-octave or octave A-weighted levels using level summation:

$$L(A)=10\,\lg\left(\sum_{i=1}^{n}10^{\frac{L_{i}+\Delta_{i}}{10}}\right)$$

where L_{i} is the unweighted level in band *i* and Δ_{i} is the A-weighting correction for that band.
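A sketch of the A-weighted level summation follows. Two things here are assumptions that do not come from the text: the band corrections are computed with the commonly published closed-form approximation of the A-weighting curve (constants 20.6, 107.7, 737.9 and 12194Hz, normalised to 0dB at 1kHz), and the octave-band levels in `spectrum` are hypothetical example values. In practice, tabulated corrections from the relevant standard would normally be used.

```python
import math


def a_weighting(frequency_hz: float) -> float:
    """Approximate A-weighting correction in dB (closed-form curve, assumed here)."""
    f2 = frequency_hz ** 2
    numerator = 12194.0 ** 2 * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalises the curve to 0 dB at 1 kHz.
    return 20 * math.log10(numerator / denominator) + 2.00


def a_weighted_level(band_levels: dict) -> float:
    """Sum band levels (centre frequency in Hz -> level in dB) into one dB(A) value."""
    return 10 * math.log10(
        sum(10 ** ((level + a_weighting(f)) / 10) for f, level in band_levels.items())
    )


# Hypothetical unweighted octave-band levels in dB, for illustration only.
spectrum = {125: 70.0, 250: 68.0, 500: 65.0, 1000: 62.0, 2000: 58.0, 4000: 52.0}

total_a = a_weighted_level(spectrum)
print(f"A-weighted total level: {total_a:.1f} dB(A)")
```

Because the low-frequency bands carry most of the energy in this example and are attenuated strongly by the A-weighting, the A-weighted total ends up several decibels below the unweighted sum.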

**In Summary**

Tests show that our perception is governed by relativity, where changes are perceived to be equal when the respective stimulus increases by the same percentage. The conclusion drawn is known as Weber-Fechner Law, according to which perception is proportional to the logarithm of the stimulus.

Therefore, physical sound pressures are expressed through their logarithmic counterparts using pressure levels of a pseudo-unit, the decibel dB, thus mapping the sound pressure range of seven powers of ten relevant to the human ear to a scale of 0 to 140dB.

A law of level summation allows adding up sound pressure levels of incoherent signals. Filters with constant relative bandwidth, mainly octave and third-octave filters, are used to measure spectral components of a signal. A-weighting roughly captures the frequency response function of human hearing. A-weighted sound pressure levels are expressed in the pseudo-unit dB(A).