Auditory masking is when the perception of one sound is affected by the presence of another sound (Gelfand 2004). The term masking is not confined to auditory perception as it can also be used in visual perception tasks.

Simultaneous masking

Simultaneous masking is when a signal, the sound that is desired to be heard, is made inaudible by a masker, noise or unwanted sound that is present throughout the signal (Moore 2004).

The unmasked threshold is the quietest level of the signal that can be perceived in quiet. The masked threshold is the quietest level of the signal that can be perceived when it is presented in noise. The amount of masking is the difference between the masked and unmasked thresholds (Gelfand 2004).
For example, if the masked threshold is 35 dB and the unmasked threshold is 20 dB, the amount of masking is 15 dB. This is illustrated in figure A.
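
The arithmetic in this example can be written as a short Python sketch. The function name is chosen here for illustration and does not come from the cited sources.

```python
def amount_of_masking(masked_threshold_db: float, unmasked_threshold_db: float) -> float:
    """Amount of masking in dB: masked threshold minus unmasked threshold."""
    return masked_threshold_db - unmasked_threshold_db


# Values from the example in the text: 35 dB masked, 20 dB unmasked.
print(amount_of_masking(35.0, 20.0))  # 15.0
```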

In the basic masking test, the subject's unmasked threshold is measured first. The masking noise is then introduced at a fixed sound level and the signal is presented at the same time. The level of the signal is varied until the new threshold is found. This is the masked threshold (Gelfand 2004).
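
The procedure can be sketched in Python as below. This is only an illustration under simplifying assumptions: the listener is simulated and responds perfectly consistently, the signal level is simply raised in fixed steps until it is heard, and the thresholds of 20 dB and 35 dB are borrowed from the earlier example. Real threshold measurements use more elaborate procedures, and all names here are hypothetical.

```python
from typing import Callable


def find_threshold(hears: Callable[[float], bool], start_db: float = 0.0,
                   step_db: float = 1.0, max_db: float = 100.0) -> float:
    """Raise the signal level in fixed steps until the listener reports hearing it."""
    level = start_db
    while level <= max_db:
        if hears(level):
            return level
        level += step_db
    raise ValueError("signal not detected up to max_db")


def simulated_listener(true_threshold_db: float) -> Callable[[float], bool]:
    """Hypothetical listener who hears the signal at or above a fixed threshold."""
    return lambda level_db: level_db >= true_threshold_db


# Step 1: measure the unmasked threshold in quiet (assumed 20 dB).
unmasked = find_threshold(simulated_listener(20.0))
# Step 2: repeat with the masker present; here the masker is assumed to
# raise the listener's threshold to 35 dB.
masked = find_threshold(simulated_listener(35.0))
print(masked - unmasked)  # 15.0
```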

The phenomenon of masking is often used to investigate the auditory system’s ability to separate the components of a complex sound. For example, if two sounds of different frequencies (pitches) are played at the same time, two separate sounds can often be heard rather than a combination tone. This ability is known as frequency resolution or frequency selectivity. Frequency resolution is thought to occur due to filtering within the cochlea, the hearing organ in the inner ear. A complex sound is split into different frequency components, and each component causes a peak in the pattern of vibration at a specific place on the basilar membrane within the cochlea. These components are then coded independently on the auditory nerve, which transmits sound information to the brain. This individual coding only occurs if the components are different enough in frequency; otherwise they are coded at the same place (Moore 1986).

These filters are called auditory filters or listening channels, and it is thought that they line up along the basilar membrane, overlapping one another. Frequency resolution is said to occur on the basilar membrane because the listener chooses a filter which is centred over the frequency they want to hear, the signal frequency. A sharply tuned filter has good frequency resolution as it allows the centre frequencies through but not other frequencies (Pickles 1982). Damage to the cochlea and to the outer hair cells in the cochlea reduces the sharpness of tuning (Moore 1986). This explains why someone with a hearing loss due to cochlear damage would have more problems with frequency selectivity than a person with normal hearing. It would cause them, for example, to have difficulty distinguishing between different consonants in speech (Moore 1995).

Masking illustrates the limits of frequency selectivity even in a person with normal hearing. If a signal is masked by a masker with a different frequency from the signal, then the auditory system is unable to distinguish between the two frequencies. Therefore, by carrying out an experiment to find the conditions which are necessary for one sound to mask a previously heard signal, the frequency selectivity of the auditory system can be investigated (Moore 1998).

Effect of frequency on masking patterns

How effective the masker is at raising the threshold of the signal depends on the frequency of the signal and the frequency of the masker.
The graphs in figure B are a series of masking patterns, otherwise known as masking audiograms, adapted from findings by Ehmer (Gelfand 2004). Each graph shows the amount of masking produced by a masker at the frequency shown in its top corner: 250, 500, 1000 or 2000 Hz. For example, in the first graph the masker is presented at a frequency of 250 Hz at the same time as the signal. The amount by which the masker increases the threshold of the signal is plotted, and this is repeated for different signal frequencies, shown on the X axis. The frequency of the masker is kept constant. The masking effect is shown in each graph at various masker sound levels.

The Y axis of figure B shows the amount of masking, that is, how much the unmasked threshold in quiet is raised to reach the masked threshold in noise. The X axis shows the frequency of the signal. You can see that the greatest masking occurs at the centre frequency, where the masker and the signal have the same frequency, and that it decreases as the signal moves further away from the masker frequency (Gelfand 2004). This phenomenon is called on-frequency masking and occurs because the masker and signal are within the same auditory filter (figure C). The auditory system therefore cannot distinguish between them, and so the signal is masked (Gelfand 2004).

Off-frequency masking is when the signal and masker are at different frequencies (figure D).

The amount by which the masker raises the threshold of the signal is much smaller in off-frequency masking. You can see from figure E, however, that it does have some masking effect because some of the masker overlaps into the auditory filter of the signal (Moore 1998).

You can also see from figure B that the masking pattern changes with both the frequency and the intensity of the masker. In the 1000 Hz graph, the curve is relatively symmetrical at low levels (e.g. 20-40 dB). As the masker intensity increases, the curves become wider, with greater masking particularly of signals at frequencies higher than the masker (Gelfand 2004). This shows that the masking effect spreads upward in frequency as the intensity of the masker is increased. The curve is much shallower on the high-frequency side than on the low-frequency side, and this is termed upward spread of masking. It means that a masker masks high-frequency signals much better than low-frequency signals (Gelfand 2004).

You can also observe from figure B that as the masker frequency increases, the masking patterns become increasingly compressed. This demonstrates that high frequency maskers are only effective over a narrow range of frequencies, close to the masker frequency. Low frequency maskers on the other hand are effective over a wide frequency range (Gelfand 2004). This is due to particular patterns of activity on the basilar membrane.

As mentioned before, masking experiments reveal information about the frequency selectivity of the ear and about the listening channels, or auditory filters, which are used to distinguish one frequency from another. Fletcher carried out an experiment to discover how much of a band of noise contributes to the masking of a tone. In this masking experiment, a fixed tone signal had bands of noise of various bandwidths centred on it, and the masked threshold was recorded for each bandwidth. His research showed that there is a critical bandwidth of noise which causes the maximum masking effect, and that energy outside this critical band does not influence the masking. This can be explained by the auditory system having an auditory filter which is centred over the frequency of the tone. The part of the masker which falls within this auditory filter effectively masks the tone, but the rest of the masker, which is outside the filter, has no effect (figure G).
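
The band-widening result can be illustrated with a small Python sketch. It rests on idealising assumptions that go beyond the description above: the auditory filter is treated as a perfect rectangle, the noise has a flat spectrum, and the tone is taken to be at threshold when its level equals the level of the noise passing through the filter. The critical bandwidth of 160 Hz and the spectrum level of 40 dB are hypothetical values chosen only for the example.

```python
import math


def masked_threshold_db(noise_bandwidth_hz: float,
                        critical_bandwidth_hz: float,
                        noise_spectrum_level_db: float) -> float:
    """Masked threshold of a tone at the centre of a flat noise band.

    Only the noise inside the (assumed rectangular) auditory filter counts,
    so widening the noise beyond the critical bandwidth changes nothing.
    """
    effective_bandwidth_hz = min(noise_bandwidth_hz, critical_bandwidth_hz)
    # Level of a flat noise band = spectrum level + 10*log10(bandwidth).
    return noise_spectrum_level_db + 10.0 * math.log10(effective_bandwidth_hz)


# Hypothetical values: 160 Hz critical bandwidth, 40 dB spectrum level.
for bandwidth_hz in (50.0, 100.0, 160.0, 400.0, 1000.0):
    threshold = masked_threshold_db(bandwidth_hz, 160.0, 40.0)
    print(f"{bandwidth_hz:6.0f} Hz noise -> masked threshold {threshold:.1f} dB")
```

Under these assumptions the threshold rises as the noise band widens up to the critical bandwidth and stays constant beyond it.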

This principle is used in MP3 files to reduce the size of audio files. Parts of the signal which fall outside the critical bandwidth are cut out, leaving only the parts which are perceived by the listener (Sellars 2000).
Another application of auditory masking in everyday situations is the cocktail party effect.

Effects of intensity

Varying intensity levels can also have an effect on masking. Many researchers have found that the lower skirt of the auditory filter becomes less sharp with increasing level, whereas the higher skirt becomes slightly steeper (Moore 1998). Changes in the slope of the high-frequency side of the filter with level are less consistent than those on the low-frequency side. At medium frequencies (1-4 kHz) the slope increases with increasing level, at low frequencies there is no clear trend with level, and filters at high centre frequencies show a small decrease in slope with increasing level (Moore 1998). The sharpness of the filter depends on the level at the input to the filter rather than at its output, and the lower side of the auditory filter broadens with increasing level (Moore 1998). These observations are illustrated in figure H.

Effects of different stimulus types

Experiments have been carried out to see the different masking effects when using a masker which is either in the form of a narrowband noise or a sinusoidal tone.

It has been found that when a sinusoidal signal and a sinusoidal masker (tone) are presented simultaneously, the envelope of the combined stimulus fluctuates in a regular pattern, described as beats. The rate of the fluctuations equals the difference between the frequencies of the two sounds. If the frequency difference is small, the sound is perceived as a periodic change in the loudness of a single tone; if the beats are fast, they can be described as a sensation of roughness. When there is a large frequency separation, the two components are heard as separate tones without roughness or beats. Beats can be a cue to the presence of a signal even when the signal itself is not audible. The influence of beats can be reduced by using a narrowband noise rather than a sinusoidal tone for either the signal or the masker (Moore 1986).
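
The relation between the frequency difference and the beat rate can be checked numerically. The sketch below assumes two tones of equal amplitude, for which the sum can be rewritten as a fast tone multiplied by a slowly varying envelope. The function names and the example frequencies of 1000 Hz and 1004 Hz are illustrative only.

```python
import math


def beat_rate_hz(f_signal_hz: float, f_masker_hz: float) -> float:
    """Number of envelope fluctuations per second for two simultaneous tones."""
    return abs(f_signal_hz - f_masker_hz)


def two_tone_sum(f1_hz: float, f2_hz: float, t_s: float) -> float:
    """Instantaneous value of two equal-amplitude sine tones added together."""
    return math.sin(2 * math.pi * f1_hz * t_s) + math.sin(2 * math.pi * f2_hz * t_s)


def envelope(f1_hz: float, f2_hz: float, t_s: float) -> float:
    """Slowly varying envelope of the sum, from the sum-to-product identity."""
    return abs(2 * math.cos(math.pi * (f1_hz - f2_hz) * t_s))


print(beat_rate_hz(1000.0, 1004.0))  # 4.0
# The envelope dips to (almost) zero halfway through each beat period,
# i.e. at t = 1 / (2 * 4 Hz) = 0.125 s for this pair of tones.
print(envelope(1000.0, 1004.0, 0.125) < 1e-9)  # True
```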

Ipsilateral, contralateral and central masking

Masking can be carried out under several different conditions. One of these is ipsilateral simultaneous masking, in which the masker and the maskee (the signal) are both delivered to the test ear at the same time. This can be either on-frequency or off-frequency (Gelfand 2004).

Another condition of masking is contralateral simultaneous masking. This condition of masking refers to the instance where the signal might be audible in the non-test ear (through transcranial conduction) but is deliberately obliterated by applying a masker to the non-test ear. The last condition of masking is central masking. This refers to the case where a masker causes a threshold elevation (makes a previously heard signal inaudible) in the absence of, or additional to, any ipsilateral, contralateral or cross-masking effect. It is due to interactions within the central nervous system between the separate neural inputs derived from the masker and the signal (Gelfand 2004).

Mechanisms of masking

There are many different mechanisms of masking, one being suppression. This is when there is a reduction of a response to a signal due to the presence of another. This happens because the original neural activity caused by the first signal is reduced by the neural activity of the other sound (Oxenham et al 1998).

Addition is the combining of several maskers, which results in a final masked threshold that is higher than the threshold produced by each of the original maskers alone (Lincoln 1998).

Combination tones are products of one or more signals and one or more maskers. They arise when the two sounds interact to create a new sound, which can be more audible than the original signal. This is caused by the non-linear distortion that happens in the ear (Moore 1986).

For example, the combination tone of two maskers can be a better masker than the two original maskers alone (Moore 1986).

The sounds interact in many ways depending on the difference in frequency between the two sounds. The most important two are cubic difference tones and quadratic difference tones (Moore 1986).

Cubic difference tones are given by the expression

2F1 – F2

(F1 being the first frequency and F2 the second).
These are audible most of the time, and especially when the level of the original tone is low. Hence they have a greater effect on psychoacoustic tuning curves than quadratic difference tones.

Quadratic difference tones are the result of

F2 – F1

These occur at relatively high levels and hence have a lesser effect on psychoacoustic tuning curves (Moore 1986).

Because combination tones behave like stimulus tones, they can interact with the primary tones to produce secondary combination tones. An example of this is

3F1 – 2F2

Secondary combination tones are in turn similar in nature to the combination tones produced by the primary tones (Moore 1986).
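
The three expressions above can be evaluated for a pair of primary tones with the following Python sketch. The function name and the example frequencies of 1000 Hz and 1200 Hz are illustrative, and the lower primary is assumed to be F1.

```python
def combination_tones(f1_hz: float, f2_hz: float) -> dict[str, float]:
    """Frequencies of the combination tones named in the text (assumes f1 < f2)."""
    return {
        "cubic_difference": 2 * f1_hz - f2_hz,           # 2F1 - F2
        "quadratic_difference": f2_hz - f1_hz,           # F2 - F1
        "secondary_combination": 3 * f1_hz - 2 * f2_hz,  # 3F1 - 2F2
    }


tones = combination_tones(1000.0, 1200.0)
print(tones["cubic_difference"])       # 800.0
print(tones["quadratic_difference"])   # 200.0
print(tones["secondary_combination"])  # 600.0
```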

Off-frequency listening

Off-frequency listening is when a listener chooses a filter centred just below the signal frequency to improve their auditory performance. This off-frequency filter reduces the level of the masker more than the level of the signal at the output of the filter, so the listener can hear the signal more clearly (Moore 2004).
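
A rough numerical illustration of this idea is sketched below in Python. The filter shape is an assumption made purely for the example: attenuation grows with the square of the distance from the centre frequency, which gives a rounded top. The steepness constant, the tone frequencies and the levels are all hypothetical, and the masker is assumed to lie above the signal in frequency.

```python
def filter_attenuation_db(freq_hz: float, centre_hz: float,
                          db_per_hz_squared: float = 0.0005) -> float:
    """Attenuation of a hypothetical rounded filter (parabolic on a dB scale)."""
    return db_per_hz_squared * (freq_hz - centre_hz) ** 2


def output_signal_to_masker_db(centre_hz: float,
                               signal_hz: float, signal_db: float,
                               masker_hz: float, masker_db: float) -> float:
    """Signal level minus masker level at the output of the chosen filter."""
    signal_out = signal_db - filter_attenuation_db(signal_hz, centre_hz)
    masker_out = masker_db - filter_attenuation_db(masker_hz, centre_hz)
    return signal_out - masker_out


# Hypothetical stimulus: 1000 Hz signal at 40 dB, 1200 Hz masker at 60 dB.
on_frequency = output_signal_to_masker_db(1000.0, 1000.0, 40.0, 1200.0, 60.0)
off_frequency = output_signal_to_masker_db(950.0, 1000.0, 40.0, 1200.0, 60.0)
# With these assumptions, listening through the filter centred at 950 Hz
# gives the larger signal-to-masker ratio.
print(off_frequency > on_frequency)  # True
```

The benefit depends on the assumed filter shape. The sketch only shows that moving the filter away from the masker can cost the signal less than it costs the masker.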

Non-simultaneous masking

Non-simultaneous masking is when the signal and masker are not presented at the same time. It can be split into forward masking and backward masking. Forward masking is when the masker is presented first and the signal follows it. Backward masking is when the signal precedes the masker (Moore 1998).
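
These definitions can be expressed as a simple classification by timing, sketched in Python below. The function name and the use of start and end times in seconds are assumptions made for illustration. Any overlap in time between masker and signal is treated here as simultaneous masking.

```python
def classify_masking(masker_start_s: float, masker_end_s: float,
                     signal_start_s: float, signal_end_s: float) -> str:
    """Classify a masker/signal pair by their relative timing."""
    if masker_end_s <= signal_start_s:
        return "forward"       # masker first, signal follows
    if signal_end_s <= masker_start_s:
        return "backward"      # signal first, masker follows
    return "simultaneous"      # the two overlap in time


print(classify_masking(0.0, 0.2, 0.21, 0.23))   # forward
print(classify_masking(0.05, 0.25, 0.0, 0.02))  # backward
print(classify_masking(0.0, 0.5, 0.1, 0.2))     # simultaneous
```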

Sound masking systems

The effect of auditory masking is used in sound masking systems. These are audio systems that broadcast white noise in order to hide an unwanted sound. The unwanted sound may be intermittent noise from machinery, people or other sources. Usually, the broadcast noise is filtered to provide the best masking of the unwanted sound.
