Ethical and Professional Decision-Making Capabilities of Artificial Intelligence Chatbots: Evaluating ChatGPT's Professional Competencies in Medicine
Purpose: We examined the performance of artificial intelligence chatbots (GPT-3.5 and GPT-4) on the PREview Practice Exam, an online situational judgment test of professionalism and ethics for medical school applicants.
Methods: We used validated methodologies to calculate scores, and descriptive statistics and Fisher's exact tests to compare scores by model and competency.
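To illustrate the comparison described above, the following is a minimal sketch of how a Fisher's exact test could compare the two models' question-level outcomes. This is not the authors' code, and the contingency counts are hypothetical placeholders rather than the study's data.

```python
from scipy.stats import fisher_exact

# 2x2 contingency table: rows = model, columns = (correct, incorrect).
# Hypothetical counts for illustration only.
table = [
    [57, 3],  # GPT-3.5: hypothetical correct/incorrect tallies
    [59, 1],  # GPT-4: hypothetical correct/incorrect tallies
]

# Two-sided Fisher's exact test of whether correctness differs by model
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, Fisher's exact p = {p_value:.3f}")
```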
Results: GPT-3.5 and GPT-4 scored 6/9 (76th percentile) and 7/9 (92nd percentile), respectively, exceeding the medical school applicant average of 5/9 (56th percentile). Both models answered more than 95% of questions correctly.
Conclusions: Both chatbots outperformed the average medical school applicant on the PREview exam, suggesting potential applications in healthcare training and decision-making while also highlighting the risks such tools pose to online assessment delivery.