
Evaluating the Quality and Readability of Information Provided by Generative Artificial Intelligence Chatbots on Clavicle Fracture Treatment Options

Overview
Journal: Cureus
Date: 2025 Feb 10
PMID: 39925539
Abstract

Introduction: Generative artificial intelligence (AI) chatbots such as ChatGPT have become more capable and prevalent, making their role in patient education increasingly salient. This study compared the educational utility of six AI chatbots by quantifying the readability and quality of their answers to common patient questions about clavicle fracture management.

Methods: ChatGPT 4, ChatGPT 4o, Gemini 1.0, Gemini 1.5 Pro, Microsoft Copilot, and Perplexity were used with no prior training. Ten representative patient questions about clavicle fractures were posed to each model. The readability of the AI responses was measured using the Flesch-Kincaid Reading Grade Level, the Gunning Fog index, and the Simple Measure of Gobbledygook (SMOG). Six orthopedists, blinded to the source model, graded the quality of each response using the DISCERN criteria. Both metrics were analyzed with the Kruskal-Wallis test.

Results: No statistically significant difference in readability was found among the six models. Microsoft Copilot (70.33±7.74) and Perplexity (71.83±7.57) had significantly higher DISCERN scores than ChatGPT 4 (56.67±7.15) and Gemini 1.5 Pro (51.00±8.94), with a similar difference between Gemini 1.0 (68.00±6.42) and Gemini 1.5 Pro. The mean overall quality (DISCERN question 16) of each model was rated at or above average (range, 3-4.4).

Conclusion: These findings suggest that generative AI models can serve as supplementary patient education materials. With comparable readability and high overall quality, Microsoft Copilot and Perplexity may offer the greatest educational utility regarding surgical intervention for clavicle fractures.
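The three readability formulas named in the Methods are published, fixed-coefficient formulas. A minimal Python sketch of how such scores are computed is below; it uses a naive vowel-group heuristic for syllable counting (an assumption for illustration; production readability tools use pronunciation dictionaries and give somewhat different numbers):

```python
import re

def count_syllables(word):
    # Naive heuristic: one syllable per contiguous vowel group, minimum 1.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text):
    """Return Flesch-Kincaid Grade Level, Gunning Fog, and SMOG scores."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # "Complex"/polysyllabic words: three or more syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    W, S = len(words), len(sentences)
    fk = 0.39 * (W / S) + 11.8 * (syllables / W) - 15.59
    fog = 0.4 * ((W / S) + 100 * (complex_words / W))
    smog = 1.0430 * (complex_words * (30 / S)) ** 0.5 + 3.1291
    return {"flesch_kincaid": fk, "gunning_fog": fog, "smog": smog}
```

All three scores approximate the U.S. school grade level needed to understand the text, which is why the study can compare chatbot responses against the commonly recommended sixth-to-eighth-grade target for patient education materials.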
