
Evaluating the Quality and Readability of Information Provided by Generative Artificial Intelligence Chatbots on Clavicle Fracture Treatment Options

Overview
Journal: Cureus
Date: 2025 Feb 10
PMID: 39925539
Abstract

Introduction: Generative artificial intelligence (AI) chatbots such as ChatGPT have become more capable and prevalent, making their role in patient education increasingly salient. This study compared the educational utility of six AI chatbots by quantifying the readability and quality of their answers to common patient questions about clavicle fracture management.

Methods: ChatGPT 4, ChatGPT 4o, Gemini 1.0, Gemini 1.5 Pro, Microsoft Copilot, and Perplexity were used with no prior training. Ten representative patient questions about clavicle fractures were posed to each model. The readability of the AI responses was measured using the Flesch-Kincaid Reading Grade Level, the Gunning Fog index, and the Simple Measure of Gobbledygook (SMOG). Six orthopedists, blinded to the source model, graded the quality of each response using the DISCERN criteria. Both metrics were analyzed with the Kruskal-Wallis test.

Results: No statistically significant difference in readability was found among the six models. Microsoft Copilot (70.33±7.74) and Perplexity (71.83±7.57) had significantly higher DISCERN scores than ChatGPT 4 (56.67±7.15) and Gemini 1.5 Pro (51.00±8.94), with a similar difference between Gemini 1.0 (68.00±6.42) and Gemini 1.5 Pro. The mean overall quality (DISCERN question 16) of each model was rated at or above average (range, 3-4.4).

Conclusion: These findings suggest that generative AI models can serve as supplementary patient education materials. With comparable readability and high overall quality, Microsoft Copilot and Perplexity may offer the greatest educational utility regarding surgical intervention for clavicle fractures.
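The three readability formulas named in the Methods are published, fixed-coefficient formulas. A minimal Python sketch of how such scores are computed is below; it uses a naive vowel-group heuristic for syllable counting (an assumption for illustration; production readability tools use pronunciation dictionaries and give somewhat different numbers):

```python
import re

def count_syllables(word):
    # Naive heuristic: one syllable per contiguous vowel group, minimum 1.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text):
    """Return Flesch-Kincaid Grade Level, Gunning Fog, and SMOG scores."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # "Complex"/polysyllabic words: three or more syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    W, S = len(words), len(sentences)
    fk = 0.39 * (W / S) + 11.8 * (syllables / W) - 15.59
    fog = 0.4 * ((W / S) + 100 * (complex_words / W))
    smog = 1.0430 * (complex_words * (30 / S)) ** 0.5 + 3.1291
    return {"flesch_kincaid": fk, "gunning_fog": fog, "smog": smog}
```

All three scores approximate the U.S. school grade level needed to understand the text, which is why the study can compare chatbot responses against the commonly recommended sixth-to-eighth-grade target for patient education materials.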
