
The Precision-recall Plot is More Informative Than the ROC Plot when Evaluating Binary Classifiers on Imbalanced Datasets

Overview
Journal PLoS One
Date 2015 Mar 5
PMID 25738806
Citations 948
Abstract

Binary classifiers are routinely evaluated with performance measures such as sensitivity and specificity, and performance is frequently illustrated with Receiver Operating Characteristics (ROC) plots. Alternative measures such as positive predictive value (PPV) and the associated Precision/Recall (PRC) plots are used less frequently. Many bioinformatics studies develop and evaluate classifiers that are to be applied to strongly imbalanced datasets in which the number of negatives significantly outweighs the number of positives. While ROC plots are visually appealing and provide an overview of a classifier's performance across a wide range of specificities, one can ask whether ROC plots could be misleading when applied in imbalanced classification scenarios. We show here that the visual interpretability of ROC plots in the context of imbalanced datasets can be deceptive with respect to conclusions about the reliability of classification performance, owing to an intuitive but wrong interpretation of specificity. PRC plots, on the other hand, can provide the viewer with an accurate prediction of future classification performance because they evaluate the fraction of true positives among positive predictions. Our findings have potential implications for the interpretation of a large number of studies that use ROC plots on imbalanced datasets.
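
The argument about specificity versus precision can be illustrated with a small numerical sketch. The example below is not taken from the paper: the class sizes, the true positive rate of 0.8, the false positive rate of 0.1, and the helper names `confusion_counts` and `summarize` are all assumptions chosen for illustration. It applies the same hypothetical classifier operating point to a balanced and to an imbalanced dataset and compares the ROC quantities (sensitivity, specificity) with precision (PPV).

```python
def confusion_counts(n_pos, n_neg, tpr, fpr):
    """Expected confusion-matrix counts for a classifier with a fixed
    true positive rate (tpr) and false positive rate (fpr).
    All inputs are illustrative assumptions, not values from the paper."""
    tp = round(tpr * n_pos)   # positives correctly predicted positive
    fp = round(fpr * n_neg)   # negatives wrongly predicted positive
    fn = n_pos - tp           # positives missed
    tn = n_neg - fp           # negatives correctly rejected
    return tp, fp, fn, tn


def summarize(n_pos, n_neg, tpr=0.8, fpr=0.1):
    """Sensitivity and specificity (the ROC axes) and precision (PPV)."""
    tp, fp, fn, tn = confusion_counts(n_pos, n_neg, tpr, fpr)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


# Same hypothetical operating point, two different class ratios.
balanced = summarize(n_pos=1000, n_neg=1000)       # 1:1
imbalanced = summarize(n_pos=1000, n_neg=100000)   # 1:100

for name, result in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

Under these assumed numbers, sensitivity (0.8) and specificity (0.9) are identical for both datasets, so the classifier occupies the same point in ROC space. Precision, however, falls from roughly 0.89 on the balanced data to roughly 0.07 on the imbalanced data, because even a 10% false positive rate produces far more false positives than true positives when negatives dominate.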

Citing Articles

Continuous time and dynamic suicide attempt risk prediction with neural ordinary differential equations.

Sheu Y, Simm J, Wang B, Lee H, Smoller J NPJ Digit Med. 2025; 8(1):161.

PMID: 40082653 PMC: 11906764. DOI: 10.1038/s41746-025-01552-y.


Machine learning prediction of right ventricular volume and ejection fraction from two-dimensional echocardiography in patients with pulmonary regurgitation.

Duong S, Dominy C, Arivazhagan N, Barris D, Hopkins K, Stern K Int J Cardiovasc Imaging. 2025; .

PMID: 40080276 DOI: 10.1007/s10554-025-03368-z.


Disease detection on exterior surfaces of buildings using deep learning in China.

Chen Y, Li D Sci Rep. 2025; 15(1):8564.

PMID: 40074790 PMC: 11904203. DOI: 10.1038/s41598-025-92112-7.


Optimizing Automated Hematoma Expansion Classification from Baseline and Follow-Up Head Computed Tomography.

Tran A, Desser D, Zeevi T, Abou Karam G, Zietz J, DellOrco A Appl Sci (Basel). 2025; 15(1).

PMID: 40046237 PMC: 11882137. DOI: 10.3390/app15010111.


Predicting Health-Related Quality of Life Using Social Determinants of Health: A Machine Learning Approach with the All of Us Cohort.

Abegaz T, Ahmed M, Ali A, Bhagavathula A Bioengineering (Basel). 2025; 12(2).

PMID: 40001685 PMC: 11851811. DOI: 10.3390/bioengineering12020166.

