Abstract:
High-performance hearing aids require flat insertion gain (before processing) and low distortion, so that when signal processing is introduced the pre-processed response of the aid does not bias the processing. High-performance hearing aids need to process the signal to compensate for the patient's perceptive distortions (primarily accentuated high-frequency hearing loss and recruitment), and this processing must be adjusted to the hearing of the individual patient. The goal of the processing (compression combined with frequency-response shaping) is to fit the speech band into the patient's residual dynamic range of hearing, amplifying each element of speech to the same relative level in the patient's dynamic range of hearing as the unamplified element has in the normal dynamic range of hearing.
Introduction:
Effective performance in sound-reproducing equipment refers to the accuracy with which the reproduced sound imitates the original live sound. Effective performance in hearing aids involves more than fidelity to the original sound: it implies a restoration of normal hearing -- to the extent possible -- to a patient with impaired hearing.
To the hearing-impaired listener the fidelity of a hearing aid is not fidelity to the input sound but fidelity to the normal perception of that sound.
Restoration toward normal hearing involves three elements in the design and adjustment of the hearing aid: 1) The hearing aid must not introduce unwanted changes to the signal reaching the eardrum; 2) the signal needs to be processed to compensate for perceptive distortions created by the hearing impairment; and 3) the processing needs to be adjusted to the hearing of the individual patient.
1- Real-ear Frequency Response of the Hearing Aid:
To measure the frequency response of an amplifier, the input of the amplifier is swept over the frequency spectrum with a sine-wave signal, and with the amplifier output properly loaded, the output signal is recorded. If the graph of output amplitude versus frequency is identical or almost identical to the graph of input amplitude versus frequency, the amplifier is said to have flat frequency response.
Determining the effective frequency response of a hearing aid, which is to say the frequency response of the aid when it is in the ear, is a more complex procedure. The common way to measure hearing-aid response is with the output of the aid connected to a 2-cc coupler. This is a legitimate method, but the desired result is not flat response, for two reasons. First, the acoustic load presented to the hearing aid by the 2-cc coupler is not an accurate simulation of the load presented by a real ear: the coupler does not match the increase of canal impedance at higher frequencies (and consequent increase of hearing-aid response) created by the increased stiffness of the eardrum as the frequency goes up. Second, the physical presence of a hearing aid or earmold in the ear introduces unwanted changes to the transmission path of the ear; these changes affect the signal reaching the eardrum and must be counted as part of the response of the hearing aid. Neutral frequency response in a hearing aid is not flat frequency response but flat insertion gain.
Insertion gain is the difference the operating hearing aid makes to the gain of the open ear. It is the aided minus the unaided gain of the ear. When a hearing aid has flat insertion gain over its frequency range, the sound pressures it produces at the eardrum for a given sound stimulus are no different, except in amplitude, from the pressures that would be produced at the eardrum by the same stimulus at the open, unaided ear.
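The subtraction that defines insertion gain can be sketched in a few lines. The function names and all dB values below are hypothetical illustrations, not measured data; the lists stand for gains at the eardrum at four frequencies.

```python
def insertion_gain_db(aided_db, unaided_db):
    """Insertion gain per frequency: aided minus unaided gain at the eardrum (dB)."""
    return [aided - unaided for aided, unaided in zip(aided_db, unaided_db)]


def is_flat(gains_db, tolerance_db=1.0):
    """True if the gains vary by no more than tolerance_db across frequency."""
    return max(gains_db) - min(gains_db) <= tolerance_db


# Hypothetical eardrum gains (dB) at 500, 1000, 2700 and 4000 Hz.
unaided = [2.0, 3.0, 15.0, 12.0]             # open ear, including a canal-resonance peak
aided_neutral = [22.0, 23.0, 35.0, 32.0]     # aid that preserves the open-ear shape
aided_peak_lost = [22.0, 23.0, 20.0, 32.0]   # aid that does not make up for the lost peak

print(insertion_gain_db(aided_neutral, unaided))             # same gain at every frequency
print(is_flat(insertion_gain_db(aided_peak_lost, unaided)))  # dip at 2700 Hz, so not flat
```

In this sketch the first aid changes only the amplitude of what the open ear would have delivered, which is the sense of "flat insertion gain" used here.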
The sound-transmission path of the unaided ear has anything but flat frequency response. The most important influence on the frequency response of this path is the air-column resonance of the ear canal. The average ear has a primary canal resonance at about 2700 Hz, which creates a broad peak in the frequency response of the ear's transmission path of approximately 15 dB. The physical presence of a hearing aid or earmold that closes off the ear canal reduces the working length of the canal, typically by about half, which doubles the frequency of canal resonance. Closing off the canal also changes the mode of resonance from quarter-wave to half-wave, doubling the resonance frequency a second time. A third effect of the physical presence of the hearing aid is that the edge correction, which increases the effective length of an open canal, is lost. The net effect of these changes is to raise the resonance frequency of the canal to the 13-kHz region, well beyond the range of the typical hearing aid.
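The three effects can be checked with a rough calculation that treats the canal as an idealized uniform tube. This is only a sketch: the speed of sound (343 m/s) and especially the 5-mm edge correction are assumptions chosen for illustration, and a real ear canal is neither uniform nor rigidly terminated.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def quarter_wave_hz(effective_length_m):
    """Fundamental resonance of a tube open at one end and closed at the other."""
    return SPEED_OF_SOUND_M_S / (4.0 * effective_length_m)


def half_wave_hz(length_m):
    """Fundamental resonance of a tube closed at both ends."""
    return SPEED_OF_SOUND_M_S / (2.0 * length_m)


# Effective length that gives the 2700-Hz open-canal resonance cited in the text.
open_resonance_hz = 2700.0
effective_length_m = SPEED_OF_SOUND_M_S / (4.0 * open_resonance_hz)

# Two doublings only: half the length, and half-wave instead of quarter-wave.
doubled_twice_hz = half_wave_hz(effective_length_m / 2.0)

# Third effect: the edge correction is lost. The 5-mm figure is an ASSUMPTION.
edge_correction_m = 0.005
physical_length_m = effective_length_m - edge_correction_m
occluded_resonance_hz = half_wave_hz(physical_length_m / 2.0)

print(round(doubled_twice_hz))        # four times the open-canal resonance
print(round(occluded_resonance_hz))   # higher still once the edge correction is removed
```

With these assumed values the two doublings alone give 10.8 kHz, and removing the edge correction moves the result close to the 13-kHz region mentioned above.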
When a new component is inserted into an electronic or acoustical system it is necessary to take into account not only the frequency response of the component itself, but the possible effect of that component on the frequency response of the rest of the system. The physical presence of a hearing aid in the ear removes an acoustical 2700-Hz, 15-dB response peak from the signal reaching the eardrum. Removing this peak has the same effect as introducing an electronic 15-dB response dip into the hearing-aid circuit. That original 15-dB peak is part of the natural sound that we hear in speech and music, and the hearing aid needs to make up for the loss.
The 2-cc coupler frequency response that corresponds to flat insertion gain of a hearing aid in an average ear has been measured and plotted by Killion and Revit (1993). A family of these "CORFIG" curves (Coupler Response for Flat Insertion Gain) is shown in Fig. 1. Each curve represents the coupler response that predicts flat insertion gain for a particular type of hearing aid -- BTE, ITE, etc.
Fig. 1. CORFIG curves (coupler response for flat insertion gain). Solid line is for a BTE hearing aid; dashed line for a full-concha ITE; dotted line for a small ITC; and bottom dashed line for a deep canal CIC (from Killion & Revit, 1993).
The 2700-Hz peak in the CORFIG curve that makes up for the loss of the 15-dB ear canal-resonance peak is only about 5 dB, because the 2-cc coupler underestimates the real-ear high-frequency response of the hearing aid, as described above and as shown in Fig. 2. At 2700 Hz the 2-cc coupler underestimates real-ear response by about 10 dB.
Fig. 2. Comparison between eardrum and 2-cc sound pressure produced by a hearing aid (from Killion & Revit, 1993).
To illustrate what serious errors can be made when the CORFIG curves are not taken into account, Fig. 3 shows the 2-cc coupler frequency-response curves published by a hearing-aid manufacturer for two of its former models, an ITE and a BTE. The ITE curve was a good approximation of its CORFIG curve; the BTE curve was not. Yet in spite of the large differences in insertion gain between these two models, the manufacturer prescribed the same fitting for each. Whatever processing was used to program the BTE aid, the processing was biased by the inadvertent changes in BTE insertion gain. The difference between the 2-cc coupler responses of the two models is shown at the bottom of Fig. 3.
Fig. 3. Comparison between 2-cc coupler response curves of ITE and BTE models of the same hearing-aid manufacturer (not current models). The same fitting program was prescribed for each.
Hearing aids with flat insertion gain and low distortion at all useful levels could be described as "high fidelity" for normal listeners. Sounds of the outside world would hardly be changed by such aids. Hearing aids are not designed for normal listeners, but experimental data (Killion, 2003) demonstrate that hearing-impaired listeners show the same preference for flat insertion gain and low distortion (aside from processing) that normal listeners do.
2- Compensation for Perceptive Distortions:
The second design requirement of a high-performance hearing aid is that the hearing aid compensate for the perceptive distortions of the hearing-impaired patient.
There are many kinds of perceptive distortion. Some patients, for example, particularly those with severe-to-profound hearing impairment, lose hearing sensitivity to changes of frequency, either for sequential sounds or for simultaneous sounds such as occur in speech formants. But the great majority of persons with non-conductive hearing impairment suffer primarily from two perceptive distortions that affect speech understanding: the change of hearing loss with frequency -- usually increased hearing loss at higher frequencies -- and the change of hearing loss with sound level, a phenomenon called recruitment. Fortunately these distortions can be compensated, in varying degree, by the gain characteristics of the hearing aid.
Steinberg and Gardner (1937), who first analyzed recruitment, called it "variable hearing loss," referring to the variation of hearing loss with sound intensity. A person with recruitment -- and this would be almost anyone with a non-conductive impairment -- hears intense sounds at near-normal or even normal loudness, which means that little or no amplification is needed at high sound levels. As sound intensity is decreased, however, the loss gets progressively worse, and more and more gain is required from the hearing aid. The degree of recruitment usually varies with frequency, typically increasing at higher frequencies. Recruitment makes weak elements of speech inaudible, and has a direct effect on both aided and unaided speech intelligibility.
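The idea of a loss that shrinks as the sound level rises can be illustrated with a deliberately simple straight-line model in dB. The model, the function names and every number below are assumptions made for illustration; they are not taken from Steinberg and Gardner or from any measured loudness-growth data.

```python
def equivalent_normal_level_db(level_db, impaired_threshold_db,
                               normal_threshold_db, convergence_db):
    """Level a normal ear would need for the same loudness (straight-line model).

    convergence_db is the assumed level at which impaired loudness catches up
    with normal loudness; above it no loss remains.
    """
    if level_db < impaired_threshold_db:
        raise ValueError("sound is below the impaired threshold (inaudible)")
    if level_db >= convergence_db:
        return level_db
    slope = (convergence_db - normal_threshold_db) / (convergence_db - impaired_threshold_db)
    return normal_threshold_db + slope * (level_db - impaired_threshold_db)


def hearing_loss_db(level_db, impaired_threshold_db, normal_threshold_db, convergence_db):
    """Effective loss at a given level: input level minus equivalent normal level."""
    return level_db - equivalent_normal_level_db(
        level_db, impaired_threshold_db, normal_threshold_db, convergence_db)


# Hypothetical ear: threshold 60 dB (normal 10 dB), loudness normal again at 100 dB.
levels_db = [60.0, 80.0, 100.0]
losses_db = [hearing_loss_db(level, 60.0, 10.0, 100.0) for level in levels_db]
print(losses_db)  # [50.0, 25.0, 0.0]
```

In this hypothetical ear the loss is 50 dB at threshold and disappears at 100 dB, so the gain needed from the hearing aid falls steadily as the input level rises.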
If the proper amount of hearing-aid gain is provided to bring the higher-level elements of speech to comfortable loudness for a hearing-impaired person with recruitment, the weaker consonants (like the fricatives, which make a major contribution to speech understanding) will be underamplified, and will be either inaudible or too soft to identify. If the gain is then increased to bring the consonants to a comfortable loudness, the high-level elements of speech become uncomfortably loud, and the sound of a slammed door can be painful.
This last phenomenon has given rise to one of the common misconceptions about recruitment, that recruitment creates an abnormal intolerance to intense sound. Intolerance to intense sound occurs only when the intense sound is overamplified (incidental to the high gain needed for weak sounds). Hearing-impaired persons with recruitment typically have normal or near-normal discomfort levels. But the amplitude range between their threshold levels and discomfort levels, and the range between their threshold levels and the levels at which they prefer to listen to speech, is reduced, which is to say they have a reduced dynamic range of hearing.
Figure 4 shows the normal dynamic range of hearing for conversational speech, between hearing threshold and an equal-loudness contour anchored at 1 kHz to the highest level of the speech at that frequency. The frequency/amplitude band of conversational speech, shown in dashed line, was measured by Dunn and White (1940). Dunn and White didn't record the lowest 20% or so of speech levels, so the bottom line of the speech band needs to be extended downward a bit.
Fig. 4. Proportionate positions of conversational speech levels relative to the amplitude range between the normal threshold of hearing and the ISO 74-phon equal-loudness contour (from Villchur, 1973).
Figure 4 shows the levels of the conversational speech band relative to the dynamic range of hearing of normal listeners. It can be seen that the lowest levels of conversational speech lie well above hearing threshold, and since normal discomfort levels are in the neighborhood of 105 to 115 dB SPL, it can also be seen that the highest levels of this speech lie well below discomfort.
Figure 5 presents the same information that appeared in Fig. 4, but for a hearing-impaired listener. The threshold curve and the equal-loudness contour represent the average of the hearing thresholds and equal-loudness contours of six hearing-impaired subjects from a study by Villchur (1973). Equal-loudness contours were measured for each subject, using a 1-kHz reference level pegged to the preferred amplified speech level of that subject. The area between the threshold and equal-loudness curves is thus the average residual dynamic range of hearing for conversational speech for these subjects.
The speech band in Fig. 5 is the same as the one in Fig. 4, except that it has been amplified to the subjects' preferred level for listening to speech. The speech band will not fit into the subject's residual dynamic range of hearing: recruitment has made this dynamic range too narrow to accept the entire speech band. The parts of the speech that lie below threshold are inaudible, and the parts that lie just above threshold are barely audible.
Fig. 5. Proportionate positions of amplified speech levels relative to the average residual dynamic range of hearing for speech for six hearing-impaired subjects (from Villchur, 1974).
It is possible to simulate for normal listeners, with a good degree of confidence in the validity of the simulation, how a person with the hearing characteristics illustrated in Fig. 5 hears intensity relationships in speech. The recruitment is simulated by a bank of electronic expanders, each expander covering part of the frequency range (an expander imitates recruitment because it reduces gain as the signal level is reduced). The accentuated high-frequency loss is simulated by a post-expansion high-frequency attenuation circuit.
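The level behavior of such a simulator can be sketched as a static input/output calculation per frequency band. This is not a description of the circuits used in the cited studies: the two-band layout, knee levels, expansion ratios and attenuation values are hypothetical, and a real expander operates on the waveform with attack and release times that are ignored here.

```python
def expander_gain_db(input_db, knee_db, ratio):
    """Gain of a downward expander: 0 dB at or above the knee, falling below it."""
    if input_db >= knee_db:
        return 0.0
    return -(ratio - 1.0) * (knee_db - input_db)


def simulated_level_db(input_db, knee_db, ratio, post_attenuation_db):
    """Band level after expansion followed by a fixed post-expansion attenuation."""
    return input_db + expander_gain_db(input_db, knee_db, ratio) - post_attenuation_db


# Hypothetical settings: more expansion and extra attenuation in the high band.
bands = {
    "low": {"knee_db": 90.0, "ratio": 1.5, "post_attenuation_db": 0.0},
    "high": {"knee_db": 90.0, "ratio": 2.5, "post_attenuation_db": 10.0},
}

for input_db in (90.0, 70.0, 50.0):
    outputs = {name: simulated_level_db(input_db, **cfg) for name, cfg in bands.items()}
    print(input_db, outputs)
```

As the input level drops, the output of the high band drops much faster than that of the low band, which is how the sketch imitates recruitment that increases with frequency.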
This type of electronic simulation was validated in a study by Villchur (1974), in which unilaterally impaired subjects compared the simulation in their good ears with the real thing, presented at the same volume to their impaired ears. The validity of this type of simulation was later confirmed in a study by Duchnowski and Zurek (1995). The simulation also has a certain face validity: the same portions of the speech that are inaudible or weak to the hearing-impaired listener are made inaudible or weak to a normal listener by the electronic simulator. The simulations of both Villchur and of Duchnowski and Zurek demonstrated that recruitment combined with accentuated high-frequency loss can seriously erode speech intelligibility.
Speech can be made more intelligible to the subject represented in Fig. 5 by processing the signal -- manipulating the relative gain of the hearing aid at different levels and frequencies -- so that the entire speech band fits into the residual dynamic range of hearing of the subject and more speech cues are audible. The speech band of Fig. 5 needs to be squeezed vertically: this is done by amplitude compression, which amplifies the low-level elements of speech more than the high-level elements. More than 65 years ago Steinberg and Gardner (1937) wrote: "Owing to the expanding action of this type of [variable] loss it would be necessary to introduce a corresponding compression in the amplifier..." The speech band needs to be squeezed more at high frequencies than at low frequencies because the subject's dynamic range of hearing in the high-frequency range is narrower: this variable compression is achieved by making the compression ratio in the high-frequency region greater than in the low-frequency region.
The speech band must then be bent upward -- high-frequency emphasis must be applied -- to lift the speech to its proper position within the subject's dynamic range of hearing in the high-frequency region. The aim is to compress and position the amplified speech band so that each speech element will have the same relative level in the subject's residual dynamic range of hearing as the unamplified speech element has in a normal dynamic range of hearing. Achievement of this goal would imply that the subject hears speech elements with a relative loudness that is the same as or similar to the loudness heard by normal listeners for those elements. This has not been proven, but whether or not it is true, previously inaudible speech elements are made audible.
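The stated aim amounts to a proportional mapping from the normal dynamic range onto the residual one, band by band. The sketch below is one way to write that mapping under stated assumptions: levels are static dB values, each band is bounded by a threshold and an equal-loudness contour, and all names and numbers are hypothetical rather than taken from Figs. 4 to 6.

```python
def fit_band(normal_threshold_db, normal_contour_db,
             impaired_threshold_db, impaired_contour_db):
    """Return (compression ratio, mapping function) for one frequency band."""
    normal_range_db = normal_contour_db - normal_threshold_db
    residual_range_db = impaired_contour_db - impaired_threshold_db
    ratio = normal_range_db / residual_range_db

    def processed_level_db(input_db):
        # Keep the element at the same relative position within the dynamic range.
        relative = (input_db - normal_threshold_db) / normal_range_db
        return impaired_threshold_db + relative * residual_range_db

    return ratio, processed_level_db


# Hypothetical bands: the residual range is narrower at high frequencies.
low_ratio, low_map = fit_band(10.0, 70.0, 40.0, 80.0)
high_ratio, high_map = fit_band(10.0, 70.0, 60.0, 80.0)

for speech_db in (40.0, 55.0):
    print(speech_db,
          low_map(speech_db) - speech_db,    # gain needed in the low band
          high_map(speech_db) - speech_db)   # gain needed in the high band
print(low_ratio, high_ratio)
```

With these numbers the high band needs both a higher compression ratio (3 against 1.5) and more gain than the low band, and in both bands the gain falls as the input level rises. These are the compression and high-frequency emphasis described above.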
High-frequency emphasis alone won't do the job: it will lift the high-frequency portion of speech, but the dynamic range of the speech band won't fit into the residual dynamic range of hearing of the subject. Compression alone won't do the job either: it will compress the speech band so that it can fit into the residual dynamic range of hearing, but it will not position the band so it does fit.
Figure 6 shows the result of applying both compression and high-frequency emphasis to the speech, each adjusted (in the all-important fitting process) to fit the residual dynamic range of hearing of this subject. The proportionate position of the speech band within the subject's residual dynamic range of hearing is close to the proportionate position of the unamplified speech band in the normal dynamic range of hearing in Fig. 4.
Fig. 6. Effect of compression/frequency-response shaping (heavy dashed lines) on the speech band of Fig. 4. The processed speech has approximately the same relative position in the residual dynamic range of hearing as that of the unamplified speech of Fig. 4 in the normal dynamic range of hearing.
Today the use of compression in hearing aids is commonly accepted -- most hearing aids have some version of it -- but earlier it was attacked vigorously. Opponents of fast multichannel compression such as Plomp (1988) said it degraded intelligibility because it reduced the intensity contrasts (see Note 1) between different elements of speech, and these contrasts were claimed to be important cues for speech understanding. Compression does indeed reduce intensity contrasts -- that is what it is intended to do -- but it compensates for and restores to or toward normal the exaggerated loudness contrasts with which a patient with recruitment hears intensity contrasts. For the hearing-impaired subject represented in Fig. 5, some of the intensity contrasts in speech are between audible and inaudible elements. Reducing these intensity contrasts by compression amplifies the inaudible elements to audibility.
Speech in Noise:
Coker (1974) pointed out that speech contains many redundant cues, which provide a reserve against the loss of cues to masking and make it possible for normal listeners to understand speech in noisy environments. But hearing-impaired listeners don't hear many of the cues, even after undistorted but unprocessed amplification. When the impoverished set of cues available to them suffers a further loss from masking, there are insufficient redundant cues left for them to fall back on. Restoring speech cues by compression/frequency response processing increases the reserve of available cues and increases the impaired listener's ability to understand speech in noise.
Noise is usually at a lower level than the target speech, and therefore compression increases the relative level of the noise. For patients with recruitment, compression degrades the signal-to-noise ratio to what it would have been without the recruitment. But the speech cues restored to audibility by processing typically increase speech intelligibility in noise more than that intelligibility is decreased by the higher noise level created by compression. This is so even though hearing-impaired listeners almost always require a better signal-to-noise ratio for understanding speech than do normal listeners. Many patients don't notice the increased noise level created by compression because they are distracted by the increased clarity of the target speech.
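The effect of compression on the relative level of lower-level noise can be illustrated with a static sketch. The compressor model, the reference level and the speech and noise levels are all hypothetical. The sketch treats speech and noise as if each were present on its own (for example, noise filling the pauses in speech); when both are present at the same instant a compressor applies one gain to both, so this is a simplification.

```python
def compressed_level_db(input_db, reference_db, ratio):
    """Static compressor: level differences from reference_db are divided by ratio."""
    return reference_db + (input_db - reference_db) / ratio


speech_db, noise_db = 70.0, 60.0          # hypothetical input levels
reference_db, ratio = 80.0, 2.0           # hypothetical compressor settings

snr_in_db = speech_db - noise_db
snr_out_db = (compressed_level_db(speech_db, reference_db, ratio)
              - compressed_level_db(noise_db, reference_db, ratio))
print(snr_in_db, snr_out_db)
```

Under these assumptions the 10-dB level difference at the input becomes a 5-dB difference at the output. That is the cost against which the restored speech cues have to be weighed.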
Binaural fittings significantly improve speech understanding in noise. The phenomenon is called binaural release from masking. Binaural summation of loudness also reduces the SPLs required for a given loudness (usually by about 4 dB for hearing-impaired patients), and thus alleviates the feedback problem.
Attempts to improve signal-to-noise ratios with electronic circuits that discriminate between speech and noise have not been very successful, largely because the offending noise typically consists of competing speech. Unlike the human brain, an electronic circuit does not know how to tell the difference between target and interfering speech. However, acoustical noise-reduction systems -- directional microphones or microphones placed close to the talker -- have done well.
Summary:
An effective hearing aid should compensate for the perceptive distortions of the patient; it should provide a neutral base (flat insertion gain and low distortion) that doesn't bias the compensation processing; and it should be fitted to the hearing characteristics of the individual patient. Fulfilling two out of three of these requirements is not enough.
Note 1: The reduction of intensity contrasts by compression is sometimes referred to in different terms: a loss of temporal modulation, a distortion of the temporal envelope, a violation of spectral integrity (when the contrasts are between sounds of different frequency), or a reduction of the modulation transfer function. These terms all mean the same thing.
References:
Coker, C. H. (1974). Speech as an error-resistant digital code. Journal of the Acoustical Society of America 55, 476(A).
Duchnowski, P. & Zurek, P. M. (1995). Villchur revisited: another look at automatic gain control simulation of recruiting hearing loss. Journal of the Acoustical Society of America 98, 3170-3181.
Dunn, H. K. & White, S. D. (1940). Statistical measurements on conversational speech. Journal of the Acoustical Society of America 11, 278-288.
Killion, M. C. & Revit, L. J. (1993). CORFIG and GIFROC: Real ear to coupler and back. In Studebaker & Hochberg (Eds.) Acoustical factors affecting hearing-aid performance. University Park Press, Baltimore.
Killion, M. C. (2003). Myths that discourage improvements in hearing aid design. The Hearing Review, Jan. (in press).
Plomp, R. (1988). The negative effect of amplitude compression in multichannel hearing aids in the light of the modulation transfer function. Journal of the Acoustical Society of America 83, 2322-2327.
Steinberg, J. C. & Gardner, M. B. (1937). The dependence of hearing impairment on sound intensity. Journal of the Acoustical Society of America 9, 11-23.
Villchur, E. (1973). Signal processing to improve speech intelligibility in perceptive deafness. Journal of the Acoustical Society of America 53, 1646-1657.
Villchur, E. (1974). Simulation of the effect of recruitment on loudness relationships in speech (demonstration disk bound in with article). Journal of the Acoustical Society of America 56, 1601-1611.