Interview with Michael Nilsson Ph.D., Sr. Research Associate, Sonic Innovations, Inc.
AO/Beck: Thanks for agreeing to this interview, Dr. Nilsson. I'm honored to spend a few moments with you. If you don't mind, I'd like to start by learning a bit about you. Where did you get your doctorate?
SI/Nilsson: I got my Ph.D. in Psychology from the Cognitive Science Department at the University of California at Irvine.
AO/Beck: Do you have any formal degrees or credentials in audiology?
SI/Nilsson: No, I do not. My formal background is in experimental psychology. Of course I have spent many years working with audiologists and auditory scientists not only here at Sonic Innovations, but also for eight years at the House Ear Institute in Los Angeles, from 1990 to 1998.
AO/Beck: Tell me a little about your work at House?
SI/Nilsson: My principal project was working with an excellent group of scientists and clinicians to develop the HINT (Hearing In Noise Test, Nilsson et al., 1994, J. Acoust. Soc. Am., 95(2), 1085-1099). The HINT is a set of recordings used to measure speech intelligibility in quiet or noise. The test consists of recordings of 25 lists of 10 sentences which are presented either in quiet or with a spectrally matched steady-state noise to measure an adaptive threshold, similar to an SRT. We have tried to coin the phrase RTS (Reception Threshold for Sentences) to distinguish HINT thresholds from clinical SRT. In essence, the RTS measures the presentation level where the sentences can be accurately repeated half the time. My co-authors on that project were Sigfrid Soli and Jean Sullivan. The HINT was based on a need to develop speech measures which would help show binaural advantages for a hearing aid development project. The HINT is one of only a few speech tests with sentences that have been carefully equated for difficulty and that have been normed for speech recognition in noise, although I recommend comparing the published norms to your own soundfield norms.
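To illustrate what an adaptive threshold of this kind looks like, here is a minimal sketch of a generic one-down, one-up track: the level drops after a correct repetition and rises after an incorrect one, so it settles around the level repeated correctly about half the time. The function names, starting level, step size, trial count and the idealised listener are all assumptions made for illustration; this is not the published HINT protocol, which should be consulted for the actual step sizes and scoring rules.

```python
def adaptive_threshold(respond, start_level_db=65.0, step_db=2.0,
                       n_trials=20, n_discard=4):
    """Generic one-down/one-up adaptive track (illustrative only).

    respond(level_db) -> True if the sentence was repeated correctly.
    Returns the mean presentation level after discarding the first
    n_discard trials, as a rough 50%-correct threshold estimate.
    """
    level = start_level_db
    levels = []
    for _ in range(n_trials):
        levels.append(level)
        correct = respond(level)
        # Make the task harder after a correct response, easier after an error.
        level += -step_db if correct else step_db
    tail = levels[n_discard:]
    return sum(tail) / len(tail)


def ideal_listener(threshold_db):
    """Hypothetical listener who is always correct at or above threshold."""
    return lambda level_db: level_db >= threshold_db


rts_estimate = adaptive_threshold(ideal_listener(50.0))
print(f"Estimated threshold: {rts_estimate:.1f} dB")
```

With a real listener the responses are probabilistic, so the track wanders around the threshold rather than alternating cleanly, and more trials give a steadier estimate.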
AO/Beck: If the readers want to learn more about the HINT, where can they find information regarding the test?
SI/Nilsson: It was published on CD by Starkey; I'm not sure if it is still available. Nonetheless, the best thing is to contact Dr. Soli at HEI. His email address is ssoli@hei.org.
AO/Beck: Thanks for the notes on the HINT. At this time, I'd like to change the discussion towards your work at Sonic Innovations, and in particular, I'd like to find out about expansion and compression.
SI/Nilsson: The basic definition is that expansion is the inverse of compression. Compression is a reduction of gain as you increase input level. Expansion is an increase in gain as you increase input level.
AO/Beck: Can you give me a specific example of how and when expansion would be used? Let's take for example a typical WDRC compression circuit with a kneepoint of 45 dB. What would happen if we had expansion in this circuit?
SI/Nilsson: In that example, the circuit is probably fine for conversational speech for a listener with a mild to moderate hearing loss. However, the person wearing that circuit may report a 'seashell' or 'ocean' sound quality for soft sounds in quiet environments because maximum gain is applied to the quietest inputs with a WDRC circuit. But, when you use expansion to turn down the gain below the kneepoint (say 45 dB), then any sound coming in at 30 dB would actually get less gain than the louder sound coming in at 45 dB. This works to improve the signal-to-noise ratio for soft speech because speech rarely, if ever, comes through at levels below 35 to 40 dB. So, if you reduce gain for sounds below the kneepoint, those sounds will not be competing with soft speech sounds, which are typically at or above the kneepoint. However, please keep in mind that when we talk about improving the signal-to-noise ratio, that really only occurs for very soft speech in a reasonably quiet room.
AO/Beck: Can expansion and compression work together in the same circuit?
SI/Nilsson: Yes, in our circuit, you can have expansion working on the very quiet speech sounds while compression works on the louder sounds. The point at which the processing changes from expansion to compression would be the kneepoint, and that can be varied by the audiologist.
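The relationship between expansion, compression and the kneepoint can be sketched as a static input/gain rule. This is a minimal illustration, not the Sonic Innovations algorithm: the 45 dB kneepoint comes from the example in the interview, while the function name, the 30 dB gain at the kneepoint, the 2:1 compression ratio and the 1:2 expansion ratio are assumed values chosen only to make the arithmetic easy to follow.

```python
def gain_db(input_db, kneepoint_db=45.0, gain_at_knee_db=30.0,
            compression_ratio=2.0, expansion_ratio=0.5):
    """Static gain for one channel (illustrative values only).

    Ratios are expressed as (change in input) / (change in output):
      ratio > 1 -> compression: gain falls as input level rises
      ratio < 1 -> expansion:   gain rises as input level rises
      ratio = 1 -> linear:      gain is constant
    """
    output_at_knee_db = kneepoint_db + gain_at_knee_db
    ratio = compression_ratio if input_db >= kneepoint_db else expansion_ratio
    output_db = output_at_knee_db + (input_db - kneepoint_db) / ratio
    return output_db - input_db


for level in (30.0, 45.0, 65.0):
    with_expansion = gain_db(level)
    # expansion_ratio=1.0 models a circuit that is linear below the kneepoint
    without_expansion = gain_db(level, expansion_ratio=1.0)
    print(f"{level:4.0f} dB in: {with_expansion:4.1f} dB gain with expansion, "
          f"{without_expansion:4.1f} dB without")
```

Under these assumed settings a 30 dB input receives less gain than a 45 dB input when expansion is active, whereas the circuit without expansion applies its maximum gain to both.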
AO/Beck: Compression seems somewhat intuitive at this time while expansion seems to be difficult to grasp. Is expansion a new technology?
SI/Nilsson: Expansion is used in several current DSP-based hearing aids, though it is different from squelch, which uses a very steep slope to turn the system on and off as you cross your kneepoint. If we look back, the Nicolet Phoenix (the first attempt at a digital hearing aid) used an expansion system to try to improve the 'peak-to-valley' ratio to enhance consonant perception while at the same time improving the signal-to-noise ratio. Expansion allows us to reduce the gain for the softest sounds, those below the kneepoint, without impacting gain above the kneepoint. The goal is to maximize the signal-to-noise ratio for soft speech.
AO/Beck: What are typical attack and release times for expansion circuits?
SI/Nilsson: The attack and release times for expansion are the same as those for compression.
AO/Beck: What is 'Speech Weighted Expansion'? (SWE)
SI/Nilsson: The kneepoint for SWE varies by frequency, so the kneepoint and the slope of the expansion function attempt to track the long-term spectrum of speech. So in essence, we have higher kneepoints in the lower frequencies and lower kneepoints in the higher frequencies, to exploit the idea of enhancing soft speech in low-noise environments.
AO/Beck: How would expansion circuits handle the 60-cycle hum which comes from the refrigerator and drives many patients out of their minds?
SI/Nilsson: Depending on the level of the hum, the low-frequency kneepoint could potentially be adjusted to reduce the 60-cycle hum from the fridge. Of course this will vary based on the room acoustics and other factors. Nonetheless, that's a good example of one situation in which expansion circuits help reduce annoying quiet sounds which a WDRC circuit may potentially overamplify.
AO/Beck: What are the lowest kneepoints for the Sonic Innovations device?
SI/Nilsson: It varies. In the low frequencies the kneepoint ranges from 40 to 50 dB SPL; in the high frequencies the kneepoint varies from 20 to 30 dB SPL.
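A frequency-dependent kneepoint of this kind can be sketched as a simple interpolation. The sketch below assumes a straight-line fall on a log-frequency axis from 45 dB SPL at 250 Hz to 25 dB SPL at 8000 Hz; those two anchor points sit inside the ranges quoted above, but the anchor frequencies, the interpolation rule and the function name are illustrative assumptions rather than the device's actual settings.

```python
import math


def swe_kneepoint_db(freq_hz, low_hz=250.0, high_hz=8000.0,
                     low_knee_db=45.0, high_knee_db=25.0):
    """Kneepoint that falls with frequency (assumed anchors, illustrative)."""
    # Hold the end values outside the assumed frequency range.
    f = min(max(freq_hz, low_hz), high_hz)
    # Fraction of the way from low_hz to high_hz, measured in octaves.
    frac = math.log2(f / low_hz) / math.log2(high_hz / low_hz)
    return low_knee_db + frac * (high_knee_db - low_knee_db)


for f in (250, 500, 1000, 2000, 4000, 8000):
    print(f"{f:5d} Hz: kneepoint {swe_kneepoint_db(f):.0f} dB SPL")
```

A low-frequency hum sitting below its band's kneepoint would then fall in the expansion region and receive reduced gain, while soft high-frequency consonants above their lower kneepoint would not.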
AO/Beck: When the nine channel SI device is operating, what does it cue on in order to make decisions about which sounds to pass and which sounds to attenuate? Does it look at temporal, spectral and amplitude characteristics?
SI/Nilsson: The SI device mainly cues on amplitude characteristics. The SI device calculates the envelope of the signal and operates on that information. The SI device does not perform an FFT. Instead, the device performs cascaded band-pass filtering of the signal and uses the input envelope in each band to determine the appropriate gain and output for that band.
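For readers unfamiliar with envelope-based processing, the following is a minimal sketch of a time-domain envelope follower for a single band: the signal is rectified and smoothed, with a faster time constant when the level rises than when it falls. It does not implement the band-pass filter bank, and the function name and the 5 ms and 50 ms time constants are assumptions for illustration, not values from the device.

```python
import math


def envelope(samples, sample_rate_hz, attack_ms=5.0, release_ms=50.0):
    """One-pole envelope follower for a single band (illustrative only)."""
    attack_coeff = math.exp(-1.0 / (sample_rate_hz * attack_ms / 1000.0))
    release_coeff = math.exp(-1.0 / (sample_rate_hz * release_ms / 1000.0))
    env = 0.0
    out = []
    for x in samples:
        magnitude = abs(x)  # full-wave rectification
        # Track rises quickly (attack) and falls slowly (release).
        coeff = attack_coeff if magnitude > env else release_coeff
        env = coeff * env + (1.0 - coeff) * magnitude
        out.append(env)
    return out


# A 1 kHz-sampled burst: 200 ms on at full scale, then 50 ms of silence.
burst = [1.0] * 200 + [0.0] * 50
env = envelope(burst, sample_rate_hz=1000.0)
print(f"end of burst: {env[199]:.3f}, 50 ms later: {env[-1]:.3f}")
```

The gain for the band would then be looked up from this envelope level rather than from the instantaneous waveform.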
AO/Beck: In your experience and in your opinion, can you cite research which definitively proves that DSP (digital signal processing) technology is clearly superior to digitally controlled analog (DCA) technology?
SI/Nilsson: The digital advantage is still difficult to prove. There are no conclusive papers out there which categorically support the digital advantage. In many respects I think the issue is likely related to the tools and methods we use to measure success, as well as our expectation of why 'digital' is better. For instance, many of the studies look at DSP versus DCA, but the DSP instrument is in fact using the same (or very similar) processing protocol as the DCA. That is, the DSP instrument is essentially a digital version of an analog process, and of course we expect it should perform about the same. The technology is certainly more advanced in DSP, but both processors are very good and they produce similar results. Only when we begin to compare DCA to devices with signal processing strategies that are only possible with DSP (like nine independent channels with expansion and compression, noise reduction and other features) can we expect to show a digital advantage. That is, as we move forward, we'll break away from the traditional fitting protocols and only the DSP instruments will be able to control and offer the various features. So, I think the digital advantage will soon be readily apparent.
AO/Beck: To wrap all of this up - What is the best way to measure the digital advantage of DSP instruments in a clinically meaningful manner?
SI/Nilsson: Probably among the better tools is the HINT. We should expect a better clinical result in difficult signal-to-noise ratios with novel DSP algorithms, and that's where the HINT allows us to go. The HINT allows us to define a range of performance (from most difficult to easiest noise conditions) which is more useful and more externally valid. Remember, one of the potential advantages of the new DSP algorithms is better speech understanding in noise.
AO/Beck: Michael, can you tell me anything regarding how the noise reduction system in your product works?
SI/Nilsson: In most models, a programming button is available to the user, allowing noise reduction in one program and no noise reduction in the other. Or, you could set one program to have low noise reduction and the other to have maximum noise reduction. We offer three levels of noise reduction, with three processes impacting the gain and temporal characteristics of the circuit. The noise detector takes 2.5 seconds to determine if a signal is noise or not, the level of noise reduction is adjusted based upon the calculated signal-to-noise ratio, and the AGC characteristics are maintained as when no noise reduction is present. This combination of processing leads to a measured attack and release time slightly longer than without noise reduction (approximately 40 ms at 2000 Hz). Our processor measures the loudness of the noise and calculates a signal-to-noise ratio based upon the noise level relative to the overall level, and it automatically adjusts the amount of noise reduction based on this signal-to-noise ratio. In other words, for a high signal-to-noise ratio there is very little noise reduction, while in a difficult signal-to-noise ratio, maximum noise reduction is applied. This variation in amount of noise reduction tracks the peaks and valleys of the signal envelope, thereby reducing the level of the noise between syllables.
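The idea of scaling noise reduction with the estimated signal-to-noise ratio can be sketched as a simple mapping. Everything numeric here is hypothetical: the maximum attenuation for each of the three levels, the SNR limits and the linear ramp between them are assumptions for illustration, since the interview does not give the actual values or the shape of the rule.

```python
# Hypothetical maximum attenuation per user-selectable level (not from the source).
MAX_REDUCTION_DB = {"low": 6.0, "medium": 12.0, "high": 18.0}


def noise_reduction_db(snr_db, level="medium",
                       snr_full_db=0.0, snr_none_db=20.0):
    """Attenuation to apply in a band, given its estimated SNR (illustrative).

    At or above snr_none_db no reduction is applied; at or below
    snr_full_db the maximum for the chosen level is applied; in between
    the amount ramps linearly.
    """
    max_reduction = MAX_REDUCTION_DB[level]
    if snr_db >= snr_none_db:
        return 0.0
    if snr_db <= snr_full_db:
        return max_reduction
    fraction = (snr_none_db - snr_db) / (snr_none_db - snr_full_db)
    return max_reduction * fraction


for snr in (25.0, 10.0, -5.0):
    print(f"SNR {snr:5.1f} dB -> {noise_reduction_db(snr):4.1f} dB reduction")
```

Because the estimated signal-to-noise ratio rises during syllables and falls in the gaps between them, a rule of this shape applies most of its attenuation between syllables.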
AO/Beck: What is the system cueing on to determine what is noise and what is not noise?
SI/Nilsson: In essence, the system is looking at the modulation rate of the signal envelope. You might say it tries to answer the questions 'What is the noise?' and 'What is the signal?' and determine the signal-to-noise ratio. If the modulation rate of the envelope is lower, it is more likely that the sound is unwanted noise, and it is targeted as noise. So the noise reduction system attempts to target steady-state noises based on amplitude characteristics.
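One crude way to picture a modulation-rate cue is to count how often the envelope swings from well below its average to well above it. The sketch below does that with a dead band around the mean so that small ripples on a steady sound are ignored. It assumes a non-negative envelope sampled at a known rate; the function names, the 20 percent dead band and the 2 Hz decision threshold are hypothetical choices for illustration, not details of the device's detector.

```python
import math


def modulation_rate_hz(env, sample_rate_hz, band=0.2):
    """Count large upward envelope swings per second (illustrative only)."""
    mean = sum(env) / len(env)
    low, high = mean * (1.0 - band), mean * (1.0 + band)
    below = False
    swings = 0
    for value in env:
        if value < low:
            below = True  # envelope has dipped into a valley
        elif value > high and below:
            swings += 1   # and has now risen to a peak
            below = False
    return swings * sample_rate_hz / len(env)


def looks_like_noise(env, sample_rate_hz, min_speech_rate_hz=2.0):
    """Flag envelopes that modulate more slowly than the assumed threshold."""
    return modulation_rate_hz(env, sample_rate_hz) < min_speech_rate_hz


rate = 1000.0  # envelope samples per second
n = 2000       # two seconds
speech_like = [1.0 + 0.8 * math.sin(2 * math.pi * 4.0 * i / rate) for i in range(n)]
steady = [1.0 + 0.05 * math.sin(2 * math.pi * 4.0 * i / rate) for i in range(n)]
print(looks_like_noise(speech_like, rate), looks_like_noise(steady, rate))
```

A strongly modulated envelope, like speech with its syllable-rate peaks and valleys, produces several swings per second, while a steady-state sound such as a fan produces almost none.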
AO/Beck: Thanks Michael, it has been a pleasure. You have brought up many issues and I think we may need to do a follow-up interview to further investigate these issues.
SI/Nilsson: Thanks Doug, I'd be happy to do that.