
Toward the Simulation of Emotion in Synthetic Speech: a Review of the Literature on Human Vocal Emotion

Overview
Journal: J Acoust Soc Am
Date: 1993 Feb 1
PMID: 8445120
Citations: 66
Abstract

There has been considerable research into perceptible correlates of emotional state, but a very limited amount of the literature examines the acoustic correlates and other relevant aspects of emotion effects in human speech; in addition, the vocal emotion literature is almost totally separate from the main body of speech analysis literature. The literature describing human vocal emotion is discussed, and its principal findings are presented. The voice parameters affected by emotion are found to be of three main types: voice quality, utterance timing, and utterance pitch contour. These parameters are described both in general and in detail for a range of specific emotions. Current speech synthesizer technology is such that many of the parameters of human speech affected by emotion could be manipulated systematically in synthetic speech to produce a simulation of vocal emotion; the application of the literature to the construction of a system capable of producing synthetic speech with emotion is discussed.

Citing Articles

Continuous monitoring of temporal skills during long-term in-home training by cochlear implant users.

Szymanski K, Gawryluk K, Brancewicz M. Heliyon. 2025; 11(3):e41817.

PMID: 39975818 PMC: 11835561. DOI: 10.1016/j.heliyon.2025.e41817.


Effects of age and hearing loss on speech emotion discrimination.

Irino T, Hanatani Y, Kishida K, Naito S, Kawahara H. Sci Rep. 2024; 14(1):18328.

PMID: 39112612 PMC: 11306396. DOI: 10.1038/s41598-024-69216-7.


The speech neuroprosthesis.

Silva A, Littlejohn K, Liu J, Moses D, Chang E. Nat Rev Neurosci. 2024; 25(7):473-492.

PMID: 38745103 PMC: 11540306. DOI: 10.1038/s41583-024-00819-9.


Acoustic analysis of clients' expression of self-compassion, self-criticism, and self-protection within emotion focused therapy video sessions.

Bailey G, Halamova J, Vrablova V. Front Psychol. 2024; 15:1363988.

PMID: 38716282 PMC: 11074451. DOI: 10.3389/fpsyg.2024.1363988.


A Survey of Deep Learning-Based Multimodal Emotion Recognition: Speech, Text, and Face.

Lian H, Lu C, Li S, Zhao Y, Tang C, Zong Y. Entropy (Basel). 2023; 25(10).

PMID: 37895561 PMC: 10606253. DOI: 10.3390/e25101440.