Artificial Intelligence Chatbot Performance in Triage of Ophthalmic Conditions

Overview
Publisher: Elsevier
Specialty: Ophthalmology
Date: 2023 Aug 12
PMID: 37572695
Abstract

Background: Timely access to human expertise for affordable and efficient triage of ophthalmic conditions is inconsistent. With recent advancements in publicly available artificial intelligence (AI) chatbots, the lay public may turn to these tools for triage of ophthalmic complaints. Validation studies are necessary to evaluate the performance of AI chatbots as triage tools and inform the public regarding their safety.

Objective: To evaluate the triage performance of AI chatbots for ophthalmic conditions.

Design: Cross-sectional study.

Setting: Single centre.

Participants: Ophthalmology trainees, OpenAI ChatGPT (GPT-4), Bing Chat, and WebMD Symptom Checker.

Methods: Forty-four clinical vignettes representing common ophthalmic complaints were developed, and a standardized pathway of prompts was presented to each tool in March 2023. Primary outcomes were proportion of responses with the correct diagnosis listed in the top 3 possible diagnoses and proportion with correct triage urgency. Ancillary outcomes included presence of grossly inaccurate statements, mean reading grade level, mean response word count, proportion with attribution, and most common sources cited.
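
As a rough illustration of how the two primary outcomes described above could be tabulated from graded vignette responses, a minimal Python sketch follows. This is not code from the study; the data structure, field names, and grading representation are hypothetical, and only the toy counts at the end mirror figures reported in the Results.

```python
from dataclasses import dataclass


# Hypothetical record of one graded vignette response; the fields are
# illustrative, not taken from the study's grading protocol.
@dataclass
class GradedResponse:
    correct_dx_in_top3: bool      # correct diagnosis listed among the top 3 suggestions
    correct_triage_urgency: bool  # triage urgency matched the reference standard


def primary_outcomes(responses: list[GradedResponse]) -> dict[str, float]:
    """Proportion of vignettes with a correct top-3 diagnosis and correct triage urgency."""
    n = len(responses)
    return {
        "top3_diagnostic_accuracy": sum(r.correct_dx_in_top3 for r in responses) / n,
        "triage_accuracy": sum(r.correct_triage_urgency for r in responses) / n,
    }


# Toy example with 44 vignettes, constructed so the counts (41 correct top-3
# diagnoses, 43 correct triage calls) match those reported for ChatGPT below.
graded = (
    [GradedResponse(True, True)] * 41
    + [GradedResponse(False, True)] * 2
    + [GradedResponse(False, False)] * 1
)
print(primary_outcomes(graded))
# ≈ {'top3_diagnostic_accuracy': 0.932, 'triage_accuracy': 0.977}
```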

Results: Ophthalmology trainees, ChatGPT, Bing Chat, and the WebMD Symptom Checker listed the appropriate diagnosis among the top 3 suggestions in 42 (95%), 41 (93%), 34 (77%), and 8 (33%) cases, respectively. Triage urgency was appropriate in 38 (86%), 43 (98%), and 37 (84%) cases for ophthalmology trainees, ChatGPT, and Bing Chat, respectively.

Conclusions: ChatGPT using the GPT-4 model offered high diagnostic and triage accuracy that was comparable with that of ophthalmology trainees, with no grossly inaccurate statements. Bing Chat had lower accuracy and a tendency to overestimate triage urgency.

Citing Articles

Large Language Models for Chatbot Health Advice Studies: A Systematic Review.

Huo B, Boyle A, Marfo N, Tangamornsuksan W, Steen J, McKechnie T. JAMA Netw Open. 2025; 8(2):e2457879.

PMID: 39903463. PMC: 11795331. DOI: 10.1001/jamanetworkopen.2024.57879.


Multimodal machine learning enables AI chatbot to diagnose ophthalmic diseases and provide high-quality medical responses.

Ma R, Cheng Q, Yao J, Peng Z, Yan M, Lu J. NPJ Digit Med. 2025; 8(1):64.

PMID: 39870855. PMC: 11772878. DOI: 10.1038/s41746-025-01461-0.


Current applications and challenges in large language models for patient care: a systematic review.

Busch F, Hoffmann L, Rueger C, van Dijk E, Kader R, Ortiz-Prado E. Commun Med (Lond). 2025; 5(1):26.

PMID: 39838160. PMC: 11751060. DOI: 10.1038/s43856-024-00717-2.


Evaluation of the ability of large language models to self-diagnose oral diseases.

Zhuang S, Zeng Y, Lin S, Chen X, Xin Y, Li H. iScience. 2025; 27(12):111495.

PMID: 39758998. PMC: 11699252. DOI: 10.1016/j.isci.2024.111495.


Opportunities and Challenges of Chatbots in Ophthalmology: A Narrative Review.

Sabaner M, Anguita R, Antaki F, Balas M, Boberg-Ans L, Desideri L. J Pers Med. 2024; 14(12).

PMID: 39728077. PMC: 11678018. DOI: 10.3390/jpm14121165.