Sensitivity of Acoustic Voice Quality Measures in Simulated Reverberation Conditions

Overview
Date 2025 Jan 8
PMID 39768071
Abstract

Room reverberation can affect oral/aural communication and is especially critical in computer analysis of voice. High levels of reverberation can distort voice recordings, reducing the accuracy of acoustic measures of voice production quality and of vocal health evaluations. This study quantifies the impact of simulated reverberation added to otherwise clean voice recordings, as reflected in voice metrics commonly used for voice quality evaluation. Voice samples were drawn from a larger database of recordings collected in a low-noise, low-reverberation environment. The samples were sustained [a:] vowels produced with two different speaker intents (comfortable and clear) by five vocally healthy, college-age, female native English speakers. Using the reverb effect in Audacity, eight reverberation conditions spanning a range of reverberation times (T20 between 0.004 and 1.82 s) were simulated and convolved with the original recordings. All voice samples, both original and reverberation-affected, were analyzed using the freely available PRAAT software (version 6.0.13) to calculate five common voice parameters: jitter, shimmer, harmonics-to-noise ratio (HNR), alpha ratio, and smoothed cepstral peak prominence (CPPs). Statistical analyses assessed how sensitive each voice metric was to the range of simulated room reverberation conditions and how much it varied across them. Results showed that jitter, HNR, and alpha ratio were stable at simulated reverberation times below a T20 of 1 s, with HNR and jitter more stable in the clear vocal style. Shimmer was highly sensitive even at a T20 of 0.53 s, a value that would reflect a common room, while CPPs remained stable across all simulated reverberation conditions. Understanding how sensitive or stable these voice metrics are across a range of room acoustic effects allows certain metrics to be used in a targeted way even in less controlled environments: stable measures such as CPPs can be applied selectively, and shimmer should be interpreted with caution, which supports more reliable and accurate voice assessments.
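
The abstract describes convolving clean recordings with simulated reverberation characterized by its T20. As a rough illustration only, the Python sketch below builds a synthetic impulse response, convolves it with a signal, and estimates T20 by Schroeder backward integration (a line fitted to the decay curve between -5 and -25 dB and extrapolated to a 60 dB decay). It does not reproduce the Audacity reverb effect used in the study: the exponentially decaying noise model, the function names, and all parameter values are assumptions made for this example.

```python
import math
import random


def synthetic_impulse_response(rt60, fs, duration=None, seed=0):
    """Hypothetical room model: Gaussian noise whose energy decays
    by 60 dB in rt60 seconds (not the Audacity reverb algorithm)."""
    rng = random.Random(seed)
    if duration is None:
        duration = 1.5 * rt60
    n = int(duration * fs)
    # A 60 dB energy decay equals an amplitude factor of 10**-3 at t = rt60.
    decay = 3.0 * math.log(10.0) / rt60
    return [rng.gauss(0.0, 1.0) * math.exp(-decay * i / fs) for i in range(n)]


def convolve(signal, ir):
    """Direct-form linear convolution (slow, but dependency-free)."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out


def estimate_t20(ir, fs):
    """Estimate T20 in seconds from an impulse response."""
    # Schroeder backward integration: energy remaining after each sample.
    energy = [0.0] * len(ir)
    total = 0.0
    for i in range(len(ir) - 1, -1, -1):
        total += ir[i] * ir[i]
        energy[i] = total
    # Energy decay curve in dB relative to the total energy.
    edc_db = [10.0 * math.log10(e / total) for e in energy if e > 0.0]
    # Least-squares line through the -5 dB to -25 dB portion of the curve.
    points = [(i / fs, db) for i, db in enumerate(edc_db) if -25.0 <= db <= -5.0]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_db = sum(db for _, db in points) / n
    slope = (sum((t - mean_t) * (db - mean_db) for t, db in points)
             / sum((t - mean_t) ** 2 for t, _ in points))
    # Extrapolate the fitted slope (dB per second) to a full 60 dB decay.
    return -60.0 / slope


if __name__ == "__main__":
    fs = 8000  # assumed sample rate for the illustration
    ir = synthetic_impulse_response(rt60=0.5, fs=fs, seed=1)
    # Stand-in for a clean recording: 50 ms of a 200 Hz sine tone.
    tone = [math.sin(2.0 * math.pi * 200.0 * i / fs) for i in range(400)]
    reverberant = convolve(tone, ir)
    print(len(reverberant), round(estimate_t20(ir, fs), 2))
```

Because the impulse response is random noise under an exponential envelope, the estimated T20 only approximates the requested decay time. For recordings of realistic length, an FFT-based convolution would be a more practical choice than the direct loop shown here.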
