
Digital Health Tools in Nephrology: A Comparative Analysis of AI and Professional Opinions Via Online Polls

Overview
Journal Digit Health
Date 2024 Sep 2
PMID 39221085
Abstract

Background: Professional opinion polling has become a popular means of seeking advice for complex nephrology questions in the #AskRenal community on X. ChatGPT is a large language model with remarkable problem-solving capabilities, but its ability to provide solutions for real-world clinical scenarios remains unproven. This study seeks to evaluate how closely ChatGPT's responses align with current prevailing medical opinions in nephrology.

Methods: Nephrology polls from X were submitted to ChatGPT-4, which generated answers without prior knowledge of the poll outcomes. Its responses were compared to the poll results (inter-rater) and a second set of responses given after a one-week interval (intra-rater) using Cohen's kappa statistic (κ). Subgroup analysis was performed based on question subject matter.
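The agreement statistic used above, Cohen's kappa, corrects raw percent agreement for the agreement expected by chance. A minimal sketch of the computation (a generic illustration, not the authors' analysis code):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    and p_e is chance agreement from each rater's marginal label rates.
    """
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items where the raters match.
    p_o = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement: product of marginal probabilities, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1:
        return 1.0  # degenerate case: both raters always give the same label
    return (p_o - p_e) / (1 - p_e)
```

For example, two raters who agree on every item yield kappa = 1.0, while agreement at exactly the chance rate yields kappa = 0.0; the study's inter-rater values around 0.42-0.46 fall in the conventionally "moderate" range.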

Results: Our analysis comprised two rounds of testing ChatGPT on 271 nephrology-related questions. In the first round, ChatGPT's responses agreed with poll results for 163 of the 271 questions (60.2%; κ = 0.42, 95% CI: 0.38-0.46). In the second round, conducted to assess reproducibility, agreement improved slightly to 171 out of 271 questions (63.1%; κ = 0.46, 95% CI: 0.42-0.50). Comparison of ChatGPT's responses between the two rounds demonstrated high internal consistency, with agreement in 245 out of 271 responses (90.4%; κ = 0.86, 95% CI: 0.82-0.90). Subgroup analysis revealed stronger performance in the combined areas of homeostasis, nephrolithiasis, and pharmacology (κ = 0.53, 95% CI: 0.47-0.59 in both rounds), compared to other nephrology subfields.

Conclusion: ChatGPT-4 demonstrates modest capability in replicating prevailing professional opinion in nephrology polls, with performance varying across question topics and excellent internal consistency. This study provides insight into the potential and limitations of using ChatGPT in medical decision-making.
