Comparative Accuracy of Diagnosis by Collective Intelligence of Multiple Physicians Vs Individual Physicians

Overview
Journal JAMA Netw Open
Specialty General Medicine
Date 2019 Mar 2
PMID 30821822
Citations 42
Abstract

Importance: The traditional approach of diagnosis by individual physicians has a high rate of misdiagnosis. Pooling multiple physicians' diagnoses (collective intelligence) is a promising approach to reducing misdiagnoses, but its accuracy in clinical cases is unknown to date.

Objective: To assess how the diagnostic accuracy of groups of physicians and trainees compares with the diagnostic accuracy of individual physicians.

Design, Setting, And Participants: Cross-sectional study using data from the Human Diagnosis Project (Human Dx), a multicountry data set of ranked differential diagnoses by individual physicians, graduate trainees, and medical students (users) solving user-submitted, structured clinical cases. From May 7, 2014, to October 5, 2016, groups of 2 to 9 randomly selected physicians solved individual cases. Data analysis was performed from March 16, 2017, to July 30, 2018.

Main Outcomes And Measures: The primary outcome was diagnostic accuracy, assessed as a correct diagnosis in the top 3 ranked diagnoses for an individual; for groups, the top 3 diagnoses were a collective differential generated using a weighted combination of user diagnoses with a variety of approaches. A version of the McNemar test was used to account for clustering across repeated solvers to compare diagnostic accuracy.
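As a rough illustration of the outcome measure described above, the following minimal sketch pools several ranked differentials into one collective differential and checks whether the correct diagnosis lands in the top 3. The abstract states only that a "weighted combination of user diagnoses" was used "with a variety of approaches"; the 1/rank weighting here is an assumption chosen for illustration, and the function names and example diagnoses are hypothetical rather than taken from the study.

```python
from collections import defaultdict


def normalize(dx):
    """Lowercase and trim a diagnosis label so equivalent strings match."""
    return dx.strip().lower()


def collective_differential(differentials, top_k=3):
    """Pool ranked differentials into one group ranking.

    ASSUMPTION: each diagnosis earns 1/rank points from every user who
    listed it (rank 1 = most likely). The study's actual weighting rules
    are not specified in the abstract.
    """
    scores = defaultdict(float)
    for ranked in differentials:
        for rank, dx in enumerate(ranked, start=1):
            scores[normalize(dx)] += 1.0 / rank
    # Highest score first; ties broken alphabetically for determinism.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [dx for dx, _ in ordered[:top_k]]


def top_k_correct(ranked, truth, top_k=3):
    """Return True if the correct diagnosis is among the first top_k entries."""
    return normalize(truth) in [normalize(dx) for dx in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical differentials from three users for a single case.
    users = [
        ["pneumonia", "bronchitis", "pulmonary embolism"],
        ["pulmonary embolism", "pneumonia", "heart failure"],
        ["pneumonia", "pulmonary embolism", "heart failure"],
    ]
    truth = "heart failure"

    group = collective_differential(users)
    print("Collective differential:", group)
    print("Group correct in top 3:", top_k_correct(group, truth))
    for i, ranked in enumerate(users, start=1):
        print(f"User {i} correct in top 3:", top_k_correct(ranked, truth))
```

In this toy case the first user misses the correct diagnosis while the pooled list contains it. Averaging such per-case indicators over many cases would give a top-3 accuracy comparable in spirit, though not in method, to the figures reported in the Results.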

Results: Of the 2069 users solving 1572 cases from the Human Dx data set, 1228 (59.4%) were residents or fellows, 431 (20.8%) were attending physicians, and 410 (19.8%) were medical students. Collective intelligence was associated with increasing diagnostic accuracy, from 62.5% (95% CI, 60.1%-64.9%) for individual physicians up to 85.6% (95% CI, 83.9%-87.4%) for groups of 9 (23.0% difference; 95% CI, 14.9%-31.2%; P < .001). The range of improvement varied by the specifications used for combining groups' diagnoses, but groups consistently outperformed individuals regardless of approach. Absolute improvement in accuracy from individuals to groups of 9 varied by presenting symptom from an increase of 17.3% (95% CI, 6.4%-28.2%; P = .002) for abdominal pain to 29.8% (95% CI, 3.7%-55.8%; P = .02) for fever. Groups from 2 users (77.7% accuracy; 95% CI, 70.1%-84.6%) to 9 users (85.5% accuracy; 95% CI, 75.1%-95.9%) outperformed individual specialists in their subspecialty (66.3% accuracy; 95% CI, 59.1%-73.5%; P < .001 vs groups of 2 and 9).

Conclusions And Relevance: A collective intelligence approach was associated with higher diagnostic accuracy compared with individuals, including individual specialists whose expertise matched the case diagnosis, across a range of medical cases. Given the few proven strategies to address misdiagnosis, this technique merits further study in clinical settings.

Citing Articles

Approach to Acute Dizziness/Vertigo in the Emergency Department: Selected Controversies Regarding Specialty Consultation.

Puissant M, Giampalmo S, Wira 3rd C, Goldstein J, Newman-Toker D. Stroke. 2024; 55(10):2584-2588.

PMID: 39268603 PMC: 11668186. DOI: 10.1161/STROKEAHA.123.043406.


Retrieval-Based Diagnostic Decision Support: Mixed Methods Study.

Abdullahi T, Mercurio L, Singh R, Eickhoff C. JMIR Med Inform. 2024; 12:e50209.

PMID: 38896468 PMC: 11222760. DOI: 10.2196/50209.


Medical residents' perceptions of group biases in medical decision making: a qualitative study.

Choi J, Mhaimeed N, Al-Mohanadi D, Mahmoud M. BMC Med Educ. 2024; 24(1):661.

PMID: 38877491 PMC: 11179270. DOI: 10.1186/s12909-024-05643-4.


Multimodal assessment improves neuroprognosis performance in clinically unresponsive critical-care patients with brain injury.

Rohaut B, Calligaris C, Hermann B, Perez P, Faugeras F, Raimondo F. Nat Med. 2024; 30(8):2349-2355.

PMID: 38816609 PMC: 11333287. DOI: 10.1038/s41591-024-03019-1.


Boosting wisdom of the crowd for medical image annotation using training performance and task features.

Hasan E, Duhaime E, Trueblood J. Cogn Res Princ Implic. 2024; 9(1):31.

PMID: 38763994 PMC: 11102897. DOI: 10.1186/s41235-024-00558-6.

