
Assessing ChatGPT4 with and Without Retrieval-augmented Generation in Anticoagulation Management for Gastrointestinal Procedures

Overview
Specialty: Gastroenterology
Date: 2024 Sep 6
PMID: 39238788
Abstract

Background: Given the growing complexity of managing anticoagulation for patients undergoing gastrointestinal (GI) procedures, this study evaluated ChatGPT-4's ability to provide accurate medical guidance, comparing it with an earlier artificial intelligence (AI) model (ChatGPT-3.5) and a retrieval-augmented generation (RAG)-supported model (ChatGPT4-RAG).

Methods: Thirty-six anticoagulation-related questions, based on professional guidelines, were answered by ChatGPT-4. Nine gastroenterologists assessed these responses for accuracy and relevance. ChatGPT-4's performance was also compared to that of ChatGPT-3.5 and ChatGPT4-RAG. Additionally, a survey was conducted to understand gastroenterologists' perceptions of ChatGPT-4.

Results: ChatGPT-4's responses were significantly more accurate and coherent than those of ChatGPT-3.5, with 30.5% of responses fully accurate and 47.2% generally accurate. ChatGPT4-RAG showed a greater ability to integrate current information, achieving 75% full accuracy. Notably, for diagnostic and therapeutic esophagogastroduodenoscopy, 51.8% of responses were fully accurate; for endoscopic retrograde cholangiopancreatography with and without stent placement, 42.8% were fully accurate; and for diagnostic and therapeutic colonoscopy, 50% were fully accurate.

Conclusions: ChatGPT4-RAG significantly advances anticoagulation management in endoscopic procedures, offering reliable and precise medical guidance. However, medicolegal considerations mean that a 75% full accuracy rate remains inadequate for independent clinical decision-making. AI may be more appropriately utilized to support and confirm clinicians' decisions, rather than replace them. Further evaluation is essential to maintain patient confidentiality and the integrity of the physician-patient relationship.
