Introduction:
Since the introduction of digital signal processing (DSP) hearing instruments, the potential advantages of DSP devices over analog or digitally programmable devices have been the subject of much debate. Indeed, if the signal processing schemes incorporated into a DSP device are based on amplitude manipulation only (the "tool box" of analog technology), there is little reason to expect superior performance from digital systems. However, mere amplitude manipulation is a gross under-utilization of the processing power of DSP.
Rather, once a signal has been digitized, it can be manipulated almost without limit: not only its amplitude, but also its spectral content and temporal characteristics can be adjusted to maximize speech audibility in a variety of acoustic environments. Once a signal has been digitally coded, analytical functions can be invoked, creating "interactive" signal processing possibilities.
These signal processing avenues, including but not limited to frequency, intensity and temporal manipulation, coupled with analytical processing capabilities, represent the true "tool box" of DSP instruments. It is only now that these dramatic resources are being used in DSP-based hearing instrument designs.
Several of the DSP-based signal processing functions designed into the Canta product from GN ReSound take full advantage of this DSP "tool box" and represent a dramatic departure from more traditional uses of DSP technology.
Some of these innovations and applications will be described in this article.
Adaptive Directionality
Hearing instruments incorporating directional microphones have been available since the early 1970s. The polar response of these classic directional microphones typically approximates a cardioid pattern (maximal suppression to the rear, at around 180°). Well-designed classic directional microphones generally offer a 3-4 dB enhancement in signal-to-noise ratio (SNR). Despite the SNR advantage associated with early directional microphones, some practical disadvantages in everyday listening environments were apparent, most notably the lack of an omni-directional option when desired or required by the patient.
An important advancement in directional microphone technology took place with the introduction of "multi-microphones." Multi-microphone technology employs two separate omni-directional microphones designed to electronically switch between omni and directional modes. While multi-microphone technology has helped overcome some of the disadvantages of traditional directional microphones, these microphone arrays still have a fixed directional pattern.
In real-life use, however, noise interference is not always static, nor does it come from the same direction. Typical real-world listening environments contain multiple and rapidly changing sound sources. Specifically, the desired speech signal is not always in front of the patient, and the noise does not always originate from the rear!
Digital signal processing can be utilized to enhance the effectiveness of multi-microphone technology. By independently digitizing the signals that come from each microphone, DSP algorithms can be constructed to introduce a programmed electronic delay between the two microphone signals. Changing this delay changes the polar pattern, thus providing an opportunity to program various polar patterns into a single instrument.
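To make the delay-to-pattern relationship concrete, the sketch below computes the first-order response of a generic two-microphone delay-and-subtract array. This is an illustration of the general principle only; the 12 mm port spacing and the specific delay values are assumptions, not Canta specifications.

```python
import numpy as np

C = 343.0   # speed of sound (m/s)
D = 0.012   # assumed microphone port spacing of 12 mm (illustrative)

def polar_response(tau, theta_deg, freq=1000.0):
    """First-order delay-and-subtract response of a two-mic array.

    output = front_mic - delay(rear_mic, tau). A wave arriving from
    angle theta (0 deg = straight ahead) reaches the rear mic
    (D / C) * cos(theta) seconds after the front mic.
    """
    theta = np.radians(theta_deg)
    total_delay = tau + (D / C) * np.cos(theta)
    w = 2 * np.pi * freq
    # |1 - exp(-j*w*delay)| = 2 * |sin(w * delay / 2)|
    return 2.0 * np.abs(np.sin(w * total_delay / 2.0))

# Changing the internal delay tau moves the null of the polar pattern
angles = np.arange(0, 360, 5)
for name, tau in [("bi-directional", 0.0),
                  ("hyper-cardioid", (D / C) / 3.0),
                  ("cardioid", D / C)]:
    response = polar_response(tau, angles)
    null_angle = angles[np.argmin(response)]
    print(f"{name:14s} tau = {tau * 1e6:5.1f} us, null near {null_angle} deg")
```

With zero internal delay the null sits at the sides (a bi-directional pattern); with the delay equal to the acoustic travel time between the ports it sits directly to the rear (cardioid); intermediate delays place it anywhere in between.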
An additional "DSP driven" directional feature available in the Canta 7 product line from GN ReSound is Adaptive Directionality. The Adaptive Directionality algorithm incorporates a unique analytical function designed to determine the spatial location of the most intense background noise source. Once the noise source location has been identified, the directional microphone's digital delay is automatically adjusted so the polar plot null points directly toward the identified noise source location.
One part of the adaptive directional algorithm measures the sound coming from the front direction only, based on differences in inputs from both microphones. Simultaneously, another part of the algorithm uses both microphones to constantly measure the sound originating from the sides, from behind or somewhere in between, thus essentially picking up the noise contribution and signature from the environment.
This noise contribution is passed through an adaptive filter and subtracted from the main signal, thus minimizing the noise content in the output signal. The effect is to create an instantaneous polar pattern (hyper-cardioid, cardioid, bi-directional, or anything in between) optimized for the situation, with the null of the combined polar pattern always pointed in the direction of the dominant noise source.
Since this analytical process runs every 4 milliseconds (250 times per second), the polar plot is reconfigured in real time, reacting virtually instantaneously as the noise source moves or as the patient moves.
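The following is a minimal sketch of the underlying principle described above: a normalized LMS (NLMS) adaptive filter estimating the noise contribution from a rear-facing reference and subtracting it from the main signal. It is a generic textbook illustration under assumed parameters; the actual Canta algorithm and its settings are proprietary.

```python
import numpy as np

def nlms_cancel(main, noise_ref, n_taps=16, mu=0.1, eps=1e-8):
    """Adaptive noise cancellation: estimate the noise component of
    `main` from the rear-facing reference and subtract it (NLMS)."""
    w = np.zeros(n_taps)      # adaptive filter weights
    buf = np.zeros(n_taps)    # most recent noise-reference samples
    out = np.zeros(len(main))
    for n in range(len(main)):
        buf = np.roll(buf, 1)
        buf[0] = noise_ref[n]
        estimate = w @ buf            # estimated noise in the main signal
        e = main[n] - estimate        # error = noise-reduced output
        w += mu * e * buf / (buf @ buf + eps)   # normalized LMS update
        out[n] = e
    return out

# Illustrative use: a tone (the "speech") plus filtered noise in the
# main signal, with the raw noise available as the reference
fs = 16000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
noise = rng.standard_normal(fs)
speech = np.sin(2 * np.pi * 440 * t)
main = speech + 0.5 * np.convolve(noise, [0.6, 0.3], mode="same")
cleaned = nlms_cancel(main, noise)
print(f"residual noise power: before {np.var(main - speech):.3f}, "
      f"after {np.var(cleaned - speech):.3f}")
```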
Automatic Microphone Matching
In order for a dual omni-directional microphone system to work effectively, it is important for the microphones to be identical in terms of phase and sensitivity.
Achieving microphone match at the manufacturing level is often accomplished by purchasing "twin" microphone pairs from the microphone manufacturer. Unfortunately, twin microphones are relatively expensive, and both microphones must be replaced in the event one fails.
In digital systems, it is possible to feed the input signal from each of these microphones to its own separate analog-to-digital (A/D) converter. With each microphone signal independently digitized, it is possible to measure and digitally compensate for any spectral and intensity differences that are identified.
Obtaining, or digitally creating, perfectly matched microphones in fresh-from-the-factory dual-microphone hearing instruments offers no guarantee that they will remain matched over time. All hearing instrument microphones are susceptible to changes in performance because they operate under demanding conditions. This gradual change in microphone performance over time is referred to as "drift": the microphone's characteristics slowly move away from their original factory specifications.
Thompsen (1999) noted that it might become possible to remedy microphone drift if the DSP circuitry could be made to "automatically detect the sensitivity difference and to continually correct it as a standard part of the signal processing".
Automatically detecting changes in the microphones and digitally compensating for them on an ongoing basis is now incorporated into the Canta digital hearing instrument design. Every time the Canta 7 instrument is switched on, a microphone sensitivity analysis sequence is activated. This sequence measures the current sensitivity of each microphone and compares the measurements to identify differences. Identified differences are "corrected" by digital compensation, ensuring that the two microphones remain matched while the instrument is in use.
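As a rough illustration of the sensitivity comparison step, the sketch below estimates a broadband level difference between two microphone signals and derives a correction gain. The actual Canta start-up sequence is not publicly specified and presumably also addresses spectral and phase differences; this simplified version and all its values are assumptions.

```python
import numpy as np

def match_gain(front, rear, eps=1e-12):
    """Estimate the broadband sensitivity difference between two
    microphone signals and return a correction gain for the rear mic.

    A real matching scheme would presumably also correct per-band
    (spectral) differences; this sketch handles overall level only."""
    rms_front = np.sqrt(np.mean(front ** 2))
    rms_rear = np.sqrt(np.mean(rear ** 2))
    return rms_front / (rms_rear + eps)

# Illustrative use: the rear microphone has drifted 2 dB low
rng = np.random.default_rng(1)
sound = rng.standard_normal(8000)
front = sound
rear = sound * 10 ** (-2.0 / 20.0)     # simulate 2 dB sensitivity drift
g = match_gain(front, rear)
print(f"applied correction: {20 * np.log10(g):+.2f} dB")   # ~ +2.00 dB
```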
Spectral Enhancement
There are a multitude of acoustic cues used to decode the speech signal. One cue is the spectral shape of the signal measured over a brief time period. Vowel sounds are characterized by their formants, which are a series of signature peaks within their spectrum. Vowel identification is highly dependent on the first three formants. Formants also encode information regarding the consonants adjacent to vowels. This phenomenon is called co-articulation.
Consonants usually have much lower amplitude than vowels. Therefore, consonants are more likely to be "drowned out", or "masked" by noise. Additionally, consonants may be rendered inaudible due to the elevated thresholds associated with the patient's hearing impairment.
Co-articulation allows consonants to be "understood" when the formants of the adjacent vowel are audible, because the vowels carry information about those consonants. In short, being able to identify the first three formants (i.e., the spectral peaks) of a speech sound is very important for reliable word recognition.
Complex sounds within our environment, such as speech and music, contain a variety of different frequencies. Just as a prism separates different colors from white light, the basilar membrane (BM) separates spectral components from a complex sound and refers them to different BM locations. Every spot on the BM represents a specific frequency. Hair cells respond best to tones with frequencies that correspond to the place on the BM at which they are located. However, hair cells are also stimulated by tones at higher and lower frequencies. As a result, the representation of a sound spectrum in the ear is a very complex pattern of hair cell excitation across the basilar membrane.
If spectral peaks in the acoustic signal, such as the formants of speech, do not rise high enough above the "valleys," they may be dulled to the point where they are no longer identifiable by the ear. If hair cell absence occurs in key areas of the basilar membrane, the formants of the input signal may not be coded by the impaired ear. If these formants are not identifiable, the cues encoded by them will be useless for speech understanding and the hearing impairment becomes a communication handicap that is not overcome by simple amplification.
One way to overcome these formant identification barriers is to enhance the spectral features in the acoustic signal. That is, to exaggerate the spectral contrast (the difference between the spectral peaks and valleys) of the input signal to the point where the internal representation of this spectrally enhanced sound in the impaired ear resembles the internal representation of the natural sound in a normal-hearing ear.
Spectral Enhancement (SE) enhances the difference between the peaks and valleys by "stretching" the spectrum along the vertical axis. At the same time, the signal is scaled to ensure that the loudness of the enhanced signal equals the loudness of the original sound. These operations lower the spectral level in the valleys, which consist mainly of noise, while leaving the peaks, which are dominated by speech, largely unchanged.
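A minimal sketch of this "stretching" operation on a single short-time frame is shown below, assuming an FFT-based analysis. The stretch factor, frame length, and loudness approximation (frame energy) are illustrative choices, not Canta's actual parameters.

```python
import numpy as np

def enhance_frame(frame, stretch=1.5, eps=1e-12):
    """Spectral contrast enhancement of one short-time frame: stretch
    the log-magnitude spectrum about its mean (deepening valleys
    relative to peaks), then rescale to the original frame energy."""
    spec = np.fft.rfft(frame)
    mag_db = 20 * np.log10(np.abs(spec) + eps)
    mean_db = np.mean(mag_db)
    enhanced_db = mean_db + stretch * (mag_db - mean_db)  # vertical stretch
    out_spec = 10 ** (enhanced_db / 20) * np.exp(1j * np.angle(spec))
    out = np.fft.irfft(out_spec, n=len(frame))
    # Rescale so overall loudness (approximated by energy) is preserved
    return out * np.sqrt(np.sum(frame ** 2) / (np.sum(out ** 2) + eps))

# Illustrative use on one 4 ms frame at a 16 kHz sampling rate
fs = 16000
t = np.arange(64) / fs
rng = np.random.default_rng(2)
frame = np.sin(2 * np.pi * 500 * t) + 0.1 * rng.standard_normal(64)
enhanced = enhance_frame(frame)
print(enhanced.shape)    # (64,) - same length as the input frame
```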
Reducing the noise level in the spectral valleys is a difficult task, and it cannot be achieved by "classic" Noise Reduction (NR) algorithms. Classic NR algorithms analyze temporal fluctuations of the signal envelope in a limited number of spectral bands. If the envelope is steady, the signal in the band is likely to be dominated by noise. Such a band generally does not contain information that will enhance speech intelligibility, and therefore the hearing aid will not amplify it. However, if the temporal fluctuations in a band are significant, the signal in that band is likely to be dominated by meaningful speech, and the hearing aid amplifies that band. This approach is very efficient for strong maskers that dominate the signal in certain frequency bands over a relatively long time.
While very efficient for environments with strong, steady background noises that have most of their energy concentrated in one frequency region, these "classic" noise-reduction algorithms are unable to selectively attenuate noise in the spectral valleys between the peaks of formants that change over time. For such situations, SE is an alternative.
Because speech is a rapid succession of individual sounds, the formants (i.e., the location of the spectral peaks) change over time and so do the spectral valleys between them. These movements are very fast and the algorithm must be able to track them. SE acts almost instantaneously. In fact, the SE adjusts itself 250 times per second. Such a fast reaction is needed to follow the rapid movement of formants in fluent speech.
The SE algorithm must not only be quick; it must also work with laser-sharp precision. The SE algorithm works in 64 bands, each 125 Hz wide. This high resolution is needed to increase the depth of the spectral valleys without reducing the tips of the peaks. The SE algorithm's resolution is much higher than that of "classical" NR algorithms.
Feedback Suppression versus Feedback Management:
Acoustic feedback occurs when amplified sound from the hearing instrument receiver returns to the microphone. This is typically the result of sound escaping through instrument vents, earmold vents or slit leaks. If the attenuation in the feedback path is less than the gain in the amplifier path, the feedback signal will be greater than the original signal. The feedback signal will be re-amplified, thereby forming an acoustic feedback loop in which the signal will grow to the maximum output of the hearing aid within a tiny fraction of a second. Approaches to managing feedback typically involve attempts to increase the feedback path attenuation. These include vent reduction, modification or remake of earmold/shell, changing the style of the hearing aid, or reducing gain in the frequency region where feedback is occurring.
In contrast, active feedback suppression can monitor the input of the hearing aid for the presence of feedback and eliminate it while the instrument is in use. To accomplish this, the instrument can create a signal which is opposite in phase to the feedback signal and add it to the input. In this way, the feedback signal is effectively cancelled. Active feedback suppression is not possible without digital technology. Moreover, active feedback suppression requires such hefty computations that most digital hearing instruments lack the signal processing power to carry it out.
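The sketch below illustrates the cancellation principle just described: an adaptive filter models the feedback path from receiver to microphone, and its output is subtracted from (i.e., added in opposite phase to) the microphone input. It is a generic simulation with assumed gain, path, and step-size values, not the DFS implementation.

```python
import numpy as np

def feedback_canceller(x, fb_path, gain=4.0, n_taps=8, mu=0.05, eps=1e-8):
    """Simulated hearing aid with an adaptive feedback canceller.

    mic input = external sound + feedback (receiver output filtered
    through fb_path). An LMS filter models the feedback path from the
    receiver signal; its estimate is subtracted from the mic input,
    i.e., added in opposite phase, before amplification."""
    w = np.zeros(n_taps)                          # adaptive path model
    rx_buf = np.zeros(max(n_taps, len(fb_path)))  # recent receiver samples
    out = np.zeros(len(x))
    for n in range(len(x)):
        feedback = fb_path @ rx_buf[:len(fb_path)]
        mic = x[n] + feedback
        estimate = w @ rx_buf[:n_taps]        # estimated feedback signal
        e = mic - estimate                    # feedback-cancelled input
        rx = np.clip(gain * e, -1.0, 1.0)     # amplify, limit at max output
        w += mu * e * rx_buf[:n_taps] / (rx_buf[:n_taps] @ rx_buf[:n_taps] + eps)
        rx_buf = np.roll(rx_buf, 1)
        rx_buf[0] = rx
        out[n] = rx
    return out

# Loop gain without cancellation is 4.0 * 0.3 = 1.2 (> 1: the aid would
# howl); with the canceller adapting, the output stays bounded
rng = np.random.default_rng(3)
x = 0.05 * rng.standard_normal(4000)
y = feedback_canceller(x, fb_path=np.array([0.0, 0.3]))
print(f"max |output| = {np.max(np.abs(y)):.2f}")
```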
To test the maximum attainable gain with and without digital feedback suppression (DFS), a KEMAR manikin was set up in an anechoic chamber. The gain of the Canta was set to the maximum allowable for stability without the DFS system active. The measurement was then repeated with the DFS on. The maximum attainable gain for stability was found to be 10-15 dB greater with the DFS system active. Objective and clinical results with the DFS system indicate that it can be expected to yield about 10 dB of additional usable gain.
Another way to utilize the DFS system is to provide greater comfort without having to reduce gain. For example, larger vents and looser earmolds/shells may provide some physical comfort relief and may reduce the occlusion effect.
Fast Acting Noise Reduction in 14 Bands:
Noise reduction schemes have become popular in digital signal processors. Most noise reduction strategies monitor modulation differences in a signal to determine if the signal resembles speech versus noise.
For example, "clean" speech in quiet has a modulation dynamic range of approximately 30 dB while many noise sources, i.e. a fan or engine may have a modulation dynamic range of less than 5 dB. Speech babble may have a modulation dynamic range of about 8 dB.
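These figures can be made concrete with a simple measurement: compute a short-term envelope in dB and take the spread between its quiet and loud moments. The window length and percentile choices below are illustrative assumptions, not a published definition.

```python
import numpy as np

def modulation_dynamic_range(x, fs, win_ms=10, lo=5, hi=95):
    """Spread (in dB) between the quiet and loud moments of the
    short-term envelope, a simple modulation dynamic range measure."""
    win = int(fs * win_ms / 1000)
    n = len(x) // win
    frames = x[:n * win].reshape(n, win)
    env_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    return np.percentile(env_db, hi) - np.percentile(env_db, lo)

# Illustrative comparison: steady noise vs. on/off "speech-like" bursts
fs = 16000
rng = np.random.default_rng(4)
steady = rng.standard_normal(fs)              # fan-like: nearly constant envelope
gate = np.sin(2 * np.pi * 4 * np.arange(fs) / fs) > 0   # 4 Hz on/off pattern
bursty = steady * gate + 0.03 * rng.standard_normal(fs)
print(f"steady noise : {modulation_dynamic_range(steady, fs):4.1f} dB")
print(f"gated bursts : {modulation_dynamic_range(bursty, fs):4.1f} dB")
```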
Canta's powerful noise reduction capabilities are based on 14-band, fast-acting noise reduction, which reduces background noise without sacrificing speech understanding. "LASER" (Laser Accurate Speech Enhancement and Recognition) continually monitors the sounds in a user's environment and determines whether the signals are speech-like, noise-like, or a combination of both, based on their modulation characteristics.
When the processor determines that a signal resembles a "non-speech" signal, i.e., one with minimal modulation, that signal is attenuated. Signals with highly variable modulation (probable speech sounds) are preserved, so that no degradation in intelligibility occurs.
Canta 7 is unique regarding noise reduction strategies in two key areas. First, the Canta 7 uses a 14-band scheme to monitor the modulation of the incoming signal. These discrete frequency bands allow for finer analysis of the incoming signal. Bands which contain noise can be attenuated while bands which contain important speech cues are not sacrificed. The multi-band noise reduction algorithm takes advantage of spectral mismatches between the noise and the speech. This is particularly significant when the competing background noise has a narrow-band response. In this situation, the algorithm reduces the gain only in the frequency regions where noise is present. In contrast, broad-band systems, or systems with fewer bands, reduce the gain not only in the frequency region where noise is present but also in other (potentially important) frequency regions where noise is not present.
Second, through advanced design and FFT processing, the Canta 7 uses fast-acting time constants that allow the 14-band processor to reduce the level of noise in discrete bands on the order of milliseconds. The processing time frame is so small that the system can identify noise segments that occur between spoken words, or even spoken syllables, within a given frequency band. The Canta 7 noise reduction algorithm works so quickly that during those instants when there is no speech signal, the noise in the band is identified and attenuated until the speech signal returns. This attenuation of noise between syllables gives the listener the perception that the speech signals "pop" out of the background noise.
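The sketch below schematically combines the two ideas above: per-band analysis plus fast-acting attenuation of frames near each band's noise floor. The 14-band split matches the article's description, but the STFT parameters, thresholds, and attenuation depth are illustrative assumptions, not Canta's design values.

```python
import numpy as np
from scipy.signal import stft, istft

def multiband_nr(x, fs, n_bands=14, max_atten_db=12.0):
    """Multi-band, fast-acting noise reduction sketch: within each band,
    attenuate the time frames whose envelope sits near the band's noise
    floor (the gaps between words and syllables)."""
    f, t, Z = stft(x, fs, nperseg=128)          # 8 ms frames at 16 kHz
    edges = np.linspace(0, len(f), n_bands + 1).astype(int)
    gains = np.ones(Z.shape)
    for b in range(n_bands):
        band = slice(edges[b], edges[b + 1])
        env_db = 10 * np.log10(np.mean(np.abs(Z[band]) ** 2, axis=0) + 1e-12)
        floor_db = np.percentile(env_db, 10)    # per-band noise floor estimate
        quiet = env_db < floor_db + 3.0         # frames dominated by noise
        gains[band, quiet] = 10 ** (-max_atten_db / 20)
    _, y = istft(Z * gains, fs, nperseg=128)
    return y[:len(x)]

# Illustrative use: a gated tone (the "speech") in steady background noise
fs = 16000
t = np.arange(2 * fs) / fs
rng = np.random.default_rng(5)
speechy = np.sin(2 * np.pi * 500 * t) * (np.sin(2 * np.pi * 3 * t) > 0)
noisy = speechy + 0.2 * rng.standard_normal(2 * fs)
cleaned = multiband_nr(noisy, fs)
print(cleaned.shape)   # same length as the input
```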
SUMMARY:
The true advantages of DSP are not realized by merely manipulating amplitude characteristics. Rather, DSP designs that take advantage of the entire digital tool box (manipulation of amplitude, frequency and temporal characteristics, combined with analytical processing) hold the key to better word recognition and improved speech-in-noise performance.
For the first time, these sophisticated DSP schemes are finding their way into hearing instrument designs, moving them further away from amplifiers, and into the more beneficial category of signal processors.
As hearing instruments and technology become increasingly sophisticated, GN ReSound has taken a proactive stance, integrating these new technologies into its hearing instruments and providing hearing healthcare professionals with educational tools.
This article represents an overview of the advances available in the new CANTA line of hearing instruments. These advances characterize the new frontier in DSP strategy.
ACKNOWLEDGEMENT:
Portions of this article were previously published in the Hearing Review. We thank the Hearing Review for allowing us to re-edit, reformat and re-publish these articles in this current form.
References and Suggested Readings:
Fabry D: Do we really need digital hearing aids? Hear J 1998; 51(11): 30-33.
Sweetow RW: Selection considerations for digital signal processing hearing aids. Hear J 1998; 51(11): 35-42.
Jansen JHM, Bokhorst AMv: Telephone hybrid with an automatic dual DSP feedback canceller. J Audio Eng Soc 1990; 38: 355-63.
Bisgaard N: Digital feedback suppression - Clinical experiences with profoundly hearing impaired. In Beilin J & Jensen GR, ed. Recent Developments in Hearing Instrument Technology, 15th Danavox Symposium, 1993.
Delta Acoustics & Vibration, Technical-Audiological Laboratory - TAL. Technical Report, Danalogic 163D - DFS function, 1998.
Villchur E: Signal processing to improve speech intelligibility in perceptive deafness. J Acoust Soc Am, 1973; 53: 1646-1657.
Pluvinage V: Rationale and development of the ReSound system. In RE Sandlin, ed. Understanding Digitally Programmable Hearing Aids, Allyn & Bacon, Boston, 1994; 15-40.
Hickson LMH: Compression amplification in hearing aids. Am J Audiol, 1994; 11: 51-65
Dillon H: Compression? Yes, but for low or high frequencies, and with what response times? Ear Hear, 1996; 17: 287-307
Killion M. The SIN report: Circuits haven't solved the hearing-in-noise problem. The Hearing Journal 1997; 50(10): 28-32.
Kochkin S. Customer satisfaction and subjective benefit with high performance hearing aids. The Hearing Review 1996; 3(12): 16-26.
Ricketts T, Dhar S. Comparison of Performance across Three Directional Hearing Aids. J Am Acad Audiol 1999; 10(4): 180-189.
Valente M. The bright promise of microphone technology. The Hearing Journal 1998; 51:7; 10-19.
Wolf R. P., Hohn W., Martin R. and Powers T. A. Directional Microphone Hearing Instruments: How and why they work. High Performance Hearing Solutions 1999; 3:1; 14-25.
Stevens, K.N. and Blumstein, S.E. (1981) "The search for invariant acoustic correlates of phonetic features", in Perspective on the study of speech, edited by Peter. D. Eimas and Joanne L. Miller, Lawrence Erlbaum Associates, Hillsdale, New Jersey.
Zwicker, E., and Feldtkeller, R. (1967). Das Ohr als Nachrichtenempfänger. (Hirzel-Verlag, Stuttgart, Germany). Available in English translation by H. Müsch, S. Buus, and M. Florentine as The Ear as a Communication Receiver (Acoust. Soc. Am., Woodbury, NY, 1999).
Glasberg, B.R., and Moore, B.C.J. (1986) "Auditory filter shapes in subjects with unilateral and bilateral cochlear impairments", J. Acoust. Soc. Am. (79), 1020-33.
Tyler, R.S. (1986) "Frequency resolution in hearing-impaired listeners" in Moore, B.C.J., editor. Frequency selectivity in hearing. London, Academic Press, p.309-71.
Leek, M.R., Dorman, M.F., Summerfield, Q. (1987) "Minimum spectral contrast in vowel identification by normal-hearing and hearing-impaired listeners" J. Acoust. Soc. Am. (81), 148-54.