
Evaluating GPT As an Adjunct for Radiologic Decision Making: GPT-4 Versus GPT-3.5 in a Breast Imaging Pilot

Overview
Publisher Elsevier
Specialty Radiology
Date 2023 Jun 25
PMID 37356806
Abstract

Objective: Despite their rising popularity and performance, large language models have rarely been evaluated for clinical decision support. Here, we evaluate the capacity of ChatGPT-3.5 and GPT-4 (Generative Pre-trained Transformer; OpenAI, San Francisco, California) for clinical decision support in radiology by assessing how well they identify appropriate imaging services for two important clinical presentations: breast cancer screening and breast pain.

Methods: We compared ChatGPT's responses to the American College of Radiology (ACR) Appropriateness Criteria for breast pain and breast cancer screening. Our prompt formats included an open-ended (OE) format and a select all that apply (SATA) format. Scoring criteria evaluated whether proposed imaging modalities were in accordance with ACR guidelines. Each prompt was entered three times, and the average of the three replicates was used to determine final scores.
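The averaging step described in the Methods can be illustrated with a minimal sketch. The abstract does not give the scoring rubric, so everything below is an assumption made for illustration: the function names `average_oe_score` and `sata_percent_correct`, the rule that a SATA option counts as correct when the model's choice matches the guideline, and all of the example data are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_oe_score(replicate_scores):
    """Mean of the replicate open-ended (OE) scores for one prompt (each 0 to 2)."""
    if any(not 0 <= s <= 2 for s in replicate_scores):
        raise ValueError("OE scores are assumed to lie between 0 and 2")
    return mean(replicate_scores)


def sata_percent_correct(selected, appropriate, options):
    """Percentage of answer options handled correctly in one SATA reply.

    Assumed rubric: an option is correct when the model selects it and the
    guideline lists it as appropriate, or when both leave it out.
    """
    selected, appropriate = set(selected), set(appropriate)
    correct = sum((opt in selected) == (opt in appropriate) for opt in options)
    return 100.0 * correct / len(options)


# Hypothetical data for a single prompt, three replicates each.
oe_replicates = [2, 2, 1]
options = ["mammography", "ultrasound", "MRI", "no imaging"]
appropriate = ["mammography", "ultrasound"]
sata_replicates = [
    ["mammography", "ultrasound"],
    ["mammography"],
    ["mammography", "ultrasound", "MRI"],
]

# Final scores are the averages over the three replicates.
oe_final = average_oe_score(oe_replicates)
sata_final = mean(
    sata_percent_correct(r, appropriate, options) for r in sata_replicates
)
print(f"OE: {oe_final:.3f} / 2, SATA: {sata_final:.1f}%")
```

Averaging over replicates smooths out run-to-run variation in the model's answers, which is presumably why each prompt was entered more than once.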

Results: Both ChatGPT-3.5 and ChatGPT-4 achieved an average OE score of 1.830 (out of 2) for breast cancer screening prompts. For the same prompts, ChatGPT-3.5 achieved a SATA average percentage correct of 88.9%, compared with 98.4% for ChatGPT-4. For breast pain, ChatGPT-3.5 achieved an average OE score of 1.125 (out of 2) and a SATA average percentage correct of 58.3%, compared with ChatGPT-4's average OE score of 1.666 (out of 2) and SATA average percentage correct of 77.7%.

Discussion: Our results demonstrate the eventual feasibility of using large language models like ChatGPT for radiologic decision making, with the potential to improve clinical workflow and responsible use of radiology services. More use cases and greater accuracy are necessary to evaluate and implement such tools.

Citing Articles

Using mathematical modelling and AI to improve delivery and efficacy of therapies in cancer.

Harkos C, Hadjigeorgiou A, Voutouri C, Kumar A, Stylianopoulos T, Jain R Nat Rev Cancer. 2025; .

PMID: 39972158 DOI: 10.1038/s41568-025-00796-w.


Radiology Report Annotation Using Generative Large Language Models: Comparative Analysis.

Altalla B, Ahmad A, Bitar L, Al-Bssol M, Omari A, Sultan I Int J Biomed Imaging. 2025; 2025:5019035.

PMID: 39968311 PMC: 11835477. DOI: 10.1155/ijbi/5019035.


Can ChatGPT and Gemini justify brain CT referrals? A comparative study with human experts and a custom prediction model.

Potocnik J, Thomas E, Kearney D, Killeen R, Heffernan E, Foley S Eur Radiol Exp. 2025; 9(1):24.

PMID: 39966263 PMC: 11836243. DOI: 10.1186/s41747-025-00569-y.


Transforming dental diagnostics with artificial intelligence: advanced integration of ChatGPT and large language models for patient care.

Farhadi Nia M, Ahmadi M, Irankhah E Front Dent Med. 2025; 5:1456208.

PMID: 39917691 PMC: 11797834. DOI: 10.3389/fdmed.2024.1456208.


Large Language Models for Chatbot Health Advice Studies: A Systematic Review.

Huo B, Boyle A, Marfo N, Tangamornsuksan W, Steen J, McKechnie T JAMA Netw Open. 2025; 8(2):e2457879.

PMID: 39903463 PMC: 11795331. DOI: 10.1001/jamanetworkopen.2024.57879.

