
Evaluating if ChatGPT Can Answer Common Patient Questions Compared With OrthoInfo Regarding Rotator Cuff Tears

Overview
Specialty: Orthopedics
Date: 2025 Mar 13
PMID: 40080671
Abstract

Purpose: To evaluate ChatGPT's (OpenAI) ability to provide accurate, appropriate, and readable responses to common patient questions about rotator cuff tears.

Methods: Eight questions from the OrthoInfo rotator cuff tear web page were input into ChatGPT twice: once as standard prompts and once with a request for responses at a sixth-grade reading level. Five orthopaedic surgeons rated the accuracy and appropriateness of each response on a Likert scale, and readability was measured with the Flesch-Kincaid Grade Level. Results were analyzed with a paired Student t-test.
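For context, the Flesch-Kincaid Grade Level is a standard formula over sentence length and syllables per word (0.39 × words/sentences + 11.8 × syllables/words − 15.59), and a paired comparison of per-question ratings can be reproduced with an off-the-shelf t-test. The sketch below is illustrative only: the vowel-group syllable heuristic and the Likert scores are assumptions for demonstration, not the tool or data the authors used.

    import re
    from scipy import stats

    def flesch_kincaid_grade(text: str) -> float:
        """Flesch-Kincaid Grade Level with a crude syllable heuristic."""
        sentences = max(1, len(re.findall(r"[.!?]+", text)))
        words = re.findall(r"[A-Za-z']+", text)
        # Approximate syllables as vowel groups, minimum one per word.
        syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower())))
                        for w in words)
        n_words = max(1, len(words))
        return 0.39 * (n_words / sentences) + 11.8 * (syllables / n_words) - 15.59

    # Paired t-test on per-question Likert ratings (hypothetical values,
    # not the study's data): standard vs. sixth-grade ChatGPT responses.
    standard = [5, 5, 4, 5, 4, 5, 5, 4]
    sixth    = [4, 3, 4, 4, 3, 4, 3, 4]
    t_stat, p_value = stats.ttest_rel(standard, sixth)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")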

Results: Standard ChatGPT responses scored higher in accuracy (4.7 ± 0.47 vs. 3.6 ± 0.76; P < 0.001) and appropriateness (4.5 ± 0.57 vs. 3.7 ± 0.98; P < 0.001) compared with sixth-grade responses. However, standard ChatGPT responses were less accurate (4.7 ± 0.47 vs. 5.0 ± 0.0; P = 0.004) and less appropriate (4.5 ± 0.57 vs. 5.0 ± 0.0; P = 0.016) than OrthoInfo responses. OrthoInfo responses were also notably better than sixth-grade responses in both accuracy and appropriateness (P < 0.001). Standard responses had a higher Flesch-Kincaid Grade Level than both OrthoInfo and sixth-grade responses (P < 0.001).

Conclusion: Standard ChatGPT responses were less accurate and appropriate, and less readable, than OrthoInfo responses. Although easier to read, sixth-grade-level ChatGPT responses compromised accuracy and appropriateness. At this time, ChatGPT is not recommended as a standalone source of patient information on rotator cuff tears, but it may supplement information provided by orthopaedic surgeons.