Examining the Readability of AtlasGPT, the Premiere Resource for Neurosurgical Education

Overview

Journal World Neurosurg

Publisher Elsevier

Date 2024 Nov 22

PMID 39577655

Authors

Raj Swaroop Lavadi

Ben Carnovale

Zayaan Tirmizi

Avi A Gajjar

Rohit Prem Kumar

Manan J Shah

D Kojo Hamilton

Nitin Agarwal


Abstract

Background: AtlasGPT is a generative pretrained transformer trained on neurosurgery literature. Its ability to tailor responses to the user's training level is unique; however, whether those responses can be comprehended at each user's training level remains unknown. This study aimed to analyze the readability of responses provided by AtlasGPT.

Methods: Ten queries were presented to AtlasGPT under each of its 4 user profiles (surgeon, resident, medical student, and patient). Readability was analyzed with multiple instruments in Readability Studio. Readability scores of user-specific responses were compared using one-way analysis of variance and post hoc pairwise t-tests with Bonferroni correction. A P value <0.05 was considered significant.
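The statistical comparison described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names (`one_way_anova_f`, `bonferroni_adjust`) are hypothetical, the grade-level scores are invented placeholders rather than study data, and P values for the F statistic and the pairwise t-tests would in practice come from a statistics package. The sketch only shows how the F statistic is formed from between-group and within-group variance, and how Bonferroni correction scales each pairwise P value by the number of comparisons (6 pairs for 4 profiles).

```python
from itertools import combinations


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_adjust(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Placeholder grade-level scores (NOT study data), one list per user profile.
scores = {
    "surgeon": [14.0, 15.0, 16.0],
    "resident": [14.5, 15.5, 16.5],
    "medical student": [13.0, 14.0, 15.0],
    "patient": [9.0, 10.0, 11.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")

# Four profiles give six pairwise comparisons to correct for.
pairs = list(combinations(scores, 2))
print(f"{len(pairs)} pairwise comparisons")
```

With 6 comparisons, an unadjusted pairwise P value must fall below 0.05 / 6 (about 0.0083) to remain significant at the 0.05 level after Bonferroni correction.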

Results: Across the readability instruments used, responses for the patient profile differed significantly in reading ease from those for every other user profile (P < 0.005). Readability scores for the medical student profile tended to show greater reading ease than those for the surgeon and resident profiles; these differences, however, were not significant. The mean grade levels for patient responses across multiple instruments ranged from 8.8 to 11.51. Only one output, as scored by the New Dale-Chall assessment, was written at a fifth- to sixth-grade level.

Conclusions: AtlasGPT-generated content demonstrates readability variations according to the user profile selected; however, the readability of patient content still exceeds recommendations set by United States departmental agencies, necessitating a call to action.