Vocal Quality Factors: Analysis, Synthesis, and Perception
This study examined several factors of vocal quality that may be affected by changes in vocal fold vibratory patterns. Four voice types were examined: modal, vocal fry, falsetto, and breathy. Three categories of analysis techniques were developed to extract source-related features from speech and electroglottographic (EGG) signals. Four factors were found to be important for characterizing the glottal excitation of the four voice types: the glottal pulse width, the glottal pulse skewness, the abruptness of glottal closure, and the turbulent noise component. The significance of these factors for voice synthesis was studied, and a new voice source model that accounts for certain physiological aspects of vocal fold motion was developed and tested using speech synthesis. Perceptual listening tests were conducted to evaluate the auditory effects of the source model parameters on synthesized speech. The effects of the spectral slope of the source excitation, the shape of the glottal excitation pulse, and the characteristics of the turbulent noise source were considered. Applications of these results include synthesis of natural-sounding speech, synthesis and modeling of vocal disorders, and the development of speaker-independent (or adaptive) speech recognition systems.
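To make the four source factors concrete, the sketch below shows one way they could be exposed as parameters of a simple parametric glottal source. It is not the voice source model developed in the study, whose details are not given here. It assumes a generic Rosenberg-style pulse shape, a one-pole low-pass filter as a stand-in for closure abruptness, and flow-modulated Gaussian noise as a stand-in for turbulence. All function names, parameter names, and numeric values are hypothetical.

```python
import math
import random


def glottal_pulse(n_samples, open_quotient=0.6, speed_quotient=2.0):
    """One period of a Rosenberg-style glottal flow pulse (illustrative only).

    open_quotient  : fraction of the period the glottis is open (pulse width)
    speed_quotient : opening time / closing time (pulse skewness)
    """
    n_open = int(round(open_quotient * n_samples))
    n_rise = int(round(n_open * speed_quotient / (1.0 + speed_quotient)))
    n_fall = n_open - n_rise
    pulse = [0.0] * n_samples
    # Opening phase: raised-cosine rise from 0 towards 1.
    for i in range(n_rise):
        pulse[i] = 0.5 * (1.0 - math.cos(math.pi * i / n_rise))
    # Closing phase: quarter-cosine fall from 1 towards 0.
    for i in range(n_fall):
        pulse[n_rise + i] = math.cos(0.5 * math.pi * i / n_fall)
    # Remaining samples stay at 0 (closed phase).
    return pulse


def soften_closure(pulse, smoothing=0.0):
    """One-pole low-pass filter (assumed stand-in for closure abruptness).

    smoothing = 0 leaves the pulse unchanged; values closer to 1 round off
    the closure and steepen the spectral slope of the excitation.
    """
    out, prev = [], 0.0
    for x in pulse:
        prev = (1.0 - smoothing) * x + smoothing * prev
        out.append(prev)
    return out


def add_turbulence(pulse, noise_level=0.0, seed=0):
    """Add Gaussian noise scaled by the instantaneous flow (assumption).

    Noise is therefore present only while the glottis is open.
    """
    rng = random.Random(seed)
    return [x + noise_level * x * rng.gauss(0.0, 1.0) for x in pulse]


# Hypothetical settings, chosen only to exercise every parameter.
period = glottal_pulse(100, open_quotient=0.8, speed_quotient=1.5)
period = soften_closure(period, smoothing=0.5)
period = add_turbulence(period, noise_level=0.1, seed=1)
```

Repeating `period` at the desired fundamental frequency gives an excitation signal that can drive a vocal tract filter. The parameter values above are placeholders, not measurements for any of the four voice types.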