"Application and Accuracy of Artificial Intelligence-derived Large Language Models in Patients with Age Related Macular Degeneration"
Introduction: Age-related macular degeneration (AMD) affects millions of people globally, leading to a surge in online searches about putative diagnoses and potentially causing misinformation and anxiety in patients and their relatives. This study explores the efficacy of artificial intelligence-derived large language models (LLMs) in addressing AMD patients' questions.
Methods: ChatGPT 3.5 (2023), Bing AI (2023), and Google Bard (2023) were used as the LLMs. Patients' questions were divided into two categories, (a) general medical advice and (b) pre- and post-intravitreal injection advice, and the responses were classified as (1) accurate and sufficient, (2) partially accurate but sufficient, or (3) inaccurate and not sufficient. A non-parametric test was used to compare the mean scores of the 3 LLMs, and an analysis of variance and reliability tests were also performed among the 3 groups.
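As an illustration of the kind of non-parametric comparison described in the Methods, the sketch below computes a Kruskal-Wallis H test for three groups of ordinal scores. The abstract does not name the specific test used, so the choice of Kruskal-Wallis is an assumption, as are the function name kruskal_wallis_3 and the example scores, which are hypothetical and are not the study data.

```python
import math
from collections import Counter


def kruskal_wallis_3(groups):
    """Kruskal-Wallis H test with tie correction for exactly three groups.

    Assumption: with three groups the statistic is compared against a
    chi-square distribution with 2 degrees of freedom, whose upper tail
    probability is exp(-H / 2).
    """
    if len(groups) != 3 or any(len(g) == 0 for g in groups):
        raise ValueError("expected three non-empty groups")

    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction, needed because 1-3 scores produce many ties.
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all scores are identical; H is undefined")
    h /= correction

    p_value = math.exp(-h / 2)  # chi-square survival function for df = 2
    return h, p_value


if __name__ == "__main__":
    # Hypothetical scores (1 = accurate, 3 = inaccurate), NOT the study data.
    model_a = [1, 1, 1, 1]
    model_b = [2, 2, 2, 2]
    model_c = [3, 3, 3, 3]
    h, p = kruskal_wallis_3([model_a, model_b, model_c])
    print(f"H = {h:.3f}, p = {p:.4f}")
```

With these made-up, fully separated scores, H is 11 (up to floating-point rounding) and p is about 0.004. The chi-square approximation is rough for very small samples, so an established statistics package would be preferable for real analyses.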
Results: For category (a) questions, the average score was 1.20 (± 0.41) with ChatGPT 3.5, 1.60 (± 0.63) with Bing AI and 1.60 (± 0.73) with Google Bard, showing no significant difference among the 3 groups (p = 0.129). For category (b) questions, the average score was 1.07 (± 0.27) with ChatGPT 3.5, 1.69 (± 0.63) with Bing AI and 1.38 (± 0.63) with Google Bard, showing a significant difference among the 3 groups (p = 0.0042). Reliability statistics showed a Cronbach's α of 0.237 (range 0.448, 0.096-0.544).
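For readers unfamiliar with the reliability statistic reported here, the following is a minimal sketch of how Cronbach's α is commonly computed: α = k/(k-1) × (1 - Σ item variances / variance of the summed scores). Treating each LLM as an "item" and each question as a "subject" is an assumption about the study design, and the function name cronbach_alpha and the ratings are hypothetical, not the study data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores.

    items[j][i] is the score of subject i on item j. Sample variances
    (n - 1 denominator) are used throughout.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("each item needs the same number (>= 2) of scores")

    # Summed score per subject across all items.
    totals = [sum(col[i] for col in items) for i in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("summed scores have zero variance; alpha is undefined")

    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Hypothetical ratings for four questions (rows) by three models (items).
    ratings = [
        [1, 2, 3, 4],
        [1, 2, 3, 4],
        [1, 2, 3, 4],
    ]
    print(cronbach_alpha(ratings))
```

Perfectly consistent items, as in this toy input, give an α of 1 (up to rounding). Values near 0, such as the 0.237 reported above, indicate that the items agree only weakly with one another.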
Conclusion: ChatGPT 3.5 consistently offered the most accurate and satisfactory responses, particularly to technical queries. While LLMs displayed promise in providing precise information about AMD, further improvements are needed, especially for more technical questions.