
Reasons Why Current Speech-enhancement Algorithms Do Not Improve Speech Intelligibility and Suggested Solutions

Overview
Publisher IEEE
Date 2011 Sep 13
PMID 21909285
Citations 22
Abstract

Existing speech enhancement algorithms can improve speech quality but not speech intelligibility, and the reasons for this are unclear. This paper presents a theoretical framework for analyzing the factors that can influence the intelligibility of processed speech. More specifically, the framework focuses on a fine-grained analysis of the distortions introduced by speech enhancement algorithms. It is hypothesized that if these distortions are properly controlled, then large gains in intelligibility can be achieved. To test this hypothesis, intelligibility tests were conducted in which human listeners were presented with processed speech containing controlled speech distortions. The aim of these tests was to assess the perceptual effect on intelligibility of the various distortions that speech enhancement algorithms can introduce. Results with three different enhancement algorithms indicated that certain distortions are more detrimental to speech intelligibility than others. When these distortions were properly controlled, however, large gains in intelligibility were obtained by human listeners, even with spectral-subtractive algorithms, which are known to degrade speech quality and intelligibility.
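The abstract does not spell out how distortions are categorized or controlled, so the following is only a minimal sketch of one way the idea could be made concrete. It assumes that distortion is measured per time-frequency bin by comparing the enhanced magnitude with the clean magnitude (which is available in a controlled listening experiment), and that bins are grouped into attenuation, mild amplification, and strong amplification. The function names, the label names, and the `amp_limit` threshold of 2.0 are all illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def classify_bins(
    clean_mag: Sequence[float],
    enhanced_mag: Sequence[float],
    amp_limit: float = 2.0,  # assumed threshold, not taken from the abstract
) -> List[str]:
    """Label each time-frequency bin by its (assumed) distortion type."""
    if len(clean_mag) != len(enhanced_mag):
        raise ValueError("clean and enhanced spectra must have the same length")
    labels = []
    for clean, enhanced in zip(clean_mag, enhanced_mag):
        if enhanced <= clean:
            # Enhanced magnitude underestimates (or matches) the clean magnitude.
            labels.append("attenuation")
        elif enhanced <= amp_limit * clean:
            # Overestimate, but within the assumed limit.
            labels.append("mild_amplification")
        else:
            # Overestimate beyond the assumed limit.
            labels.append("strong_amplification")
    return labels


def constrain(
    enhanced_mag: Sequence[float],
    labels: Sequence[str],
    allowed: Sequence[str] = ("attenuation", "mild_amplification"),
) -> List[float]:
    """Keep bins with an allowed distortion type and zero out the rest."""
    return [m if lab in allowed else 0.0 for m, lab in zip(enhanced_mag, labels)]


# Toy magnitudes for four bins (made-up numbers for illustration only).
clean = [1.0, 1.0, 1.0, 0.5]
enhanced = [0.8, 1.5, 2.5, 0.5]

labels = classify_bins(clean, enhanced)
controlled = constrain(enhanced, labels)
print(labels)
print(controlled)
```

Because the classification needs the clean magnitudes, a constraint of this kind acts as an oracle: it is useful for probing which distortions hurt intelligibility, but it is not a deployable enhancement algorithm on its own.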

Citing Articles

Influences of noise reduction on speech intelligibility, listening effort, and sound quality among adults with severe to profound hearing loss.

Dong R, Liu P, Tian X, Wang Y, Chen Y, Zhang J Front Neurosci. 2024; 18:1407775.

PMID: 39108313 PMC: 11301946. DOI: 10.3389/fnins.2024.1407775.


Sixty Years of Frequency-Domain Monaural Speech Enhancement: From Traditional to Deep Learning Methods.

Zheng C, Zhang H, Liu W, Luo X, Li A, Li X Trends Hear. 2023; 27:23312165231209913.

PMID: 37956661 PMC: 10658184. DOI: 10.1177/23312165231209913.


Individual Listener Preference for Strength of Single-Microphone Noise-Reduction; Trade-off Between Noise Tolerance and Signal Distortion Tolerance.

Reinten I, de Ronde-Brons I, Houben R, Dreschler W Trends Hear. 2023; 27:23312165231192304.

PMID: 37525630 PMC: 10395179. DOI: 10.1177/23312165231192304.


Enhancement of speech-in-noise comprehension through vibrotactile stimulation at the syllabic rate.

Guilleminot P, Reichenbach T Proc Natl Acad Sci U S A. 2022; 119(13):e2117000119.

PMID: 35312362 PMC: 9060510. DOI: 10.1073/pnas.2117000119.


Deep Learning-Based Speech Enhancement With a Loss Trading Off the Speech Distortion and the Noise Residue for Cochlear Implants.

Kang Y, Zheng N, Meng Q Front Med (Lausanne). 2021; 8:740123.

PMID: 34820392 PMC: 8606413. DOI: 10.3389/fmed.2021.740123.

