Multimodal Deep Learning for Dementia Classification Using Text and Audio

Overview

Journal: Sci Rep

Specialty: Science

Date: 2024 Jun 16

PMID: 38880810

Authors

Kaiying Lin

Peter Y Washington

Abstract

Dementia is a progressive neurological disorder that affects the daily lives of older adults, impacting their verbal communication and cognitive function. Early diagnosis is important to enhance the lifespan and quality of life for affected individuals. Despite its importance, diagnosing dementia is a complex process. Automated machine learning solutions involving multiple types of data have the potential to improve the process of automated dementia screening. In this study, we build deep learning models to classify dementia cases from controls using the Pitt Cookie Theft dataset from DementiaBank, a database of short participant responses to the structured task of describing a picture of a cookie theft. We fine-tune Wav2vec and Word2vec baseline models to make binary predictions of dementia from audio recordings and text transcripts, respectively. We conduct experiments with four versions of the dataset: (1) the original data, (2) the data with short sentences removed, (3) text-based augmentation of the original data, and (4) text-based augmentation of the data with short sentences removed. Our results indicate that synonym-based text data augmentation generally enhances the performance of models that incorporate the text modality. Without data augmentation, models using the text modality achieve around 60% accuracy and 70% AUROC scores, and with data augmentation, the models achieve around 80% accuracy and 90% AUROC scores. We do not observe significant improvements in performance with the addition of audio or timestamp information into the model. We include a qualitative error analysis of the sentences that are misclassified under each study condition. This study provides preliminary insights into the effects of both text-based data augmentation and multimodal deep learning for automated dementia classification.
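The four dataset versions described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the synonym table, the three-word threshold for "short" sentences, and the function names (`remove_short`, `synonym_augment`, `build_variants`) are all hypothetical, since the abstract does not state the synonym source or the length cutoff.

```python
import random

# Hypothetical synonym table for illustration only; the abstract does not
# say which synonym source the study used.
SYNONYMS = {
    "boy": ["lad"],
    "cookie": ["biscuit"],
    "taking": ["grabbing"],
    "mother": ["mom"],
}


def remove_short(sentences, min_words=3):
    """Drop sentences with fewer than min_words whitespace-separated tokens.

    The threshold of 3 is an assumption, not a value from the paper.
    """
    return [s for s in sentences if len(s.split()) >= min_words]


def synonym_augment(sentence, rng, p=1.0):
    """Replace each word that has a known synonym with probability p."""
    out = []
    for word in sentence.split():
        options = SYNONYMS.get(word.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)


def augment_all(sentences, rng):
    """Return the originals plus every augmented copy that actually changed."""
    extra = []
    for s in sentences:
        candidate = synonym_augment(s, rng)
        if candidate != s:  # skip copies where no synonym applied
            extra.append(candidate)
    return list(sentences) + extra


def build_variants(sentences, seed=0):
    """Build the four dataset versions listed in the abstract."""
    rng = random.Random(seed)
    filtered = remove_short(sentences)
    return {
        "original": list(sentences),
        "short_removed": filtered,
        "augmented": augment_all(sentences, rng),
        "augmented_short_removed": augment_all(filtered, rng),
    }


if __name__ == "__main__":
    demo = [
        "the boy is taking a cookie",
        "uh huh",
        "the mother is washing dishes",
    ]
    for name, data in build_variants(demo).items():
        print(name, len(data), data)
```

In this sketch each variant keeps the original sentences and appends only the augmented copies that differ from their source, so a sentence with no replaceable word is not duplicated.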
