Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics
Overview
Interpreting the potentially vast number of hypotheses generated by a shotgun proteomics experiment requires a valid and accurate procedure for assigning statistical confidence estimates to identified tandem mass spectra. Despite the crucial role such procedures play in most high-throughput proteomics experiments, the scientific literature has not reached a consensus about the best confidence estimation methodology. In this work, we evaluate, using theoretical and empirical analysis, four previously proposed protocols for estimating the false discovery rate (FDR) associated with a set of identified tandem mass spectra: two variants of the target-decoy competition protocol (TDC) of Elias and Gygi and two variants of the separate target-decoy search protocol of Käll et al. Our analysis reveals significant biases in the two separate target-decoy search protocols. Moreover, the one TDC protocol that provides an unbiased FDR estimate among the target PSMs does so at the cost of forfeiting a random subset of high-scoring spectrum identifications. We therefore propose the mix-max procedure to provide unbiased, accurate FDR estimates in the presence of well-calibrated scores. The method avoids biases associated with the two separate target-decoy search protocols and also avoids the propensity for target-decoy competition to discard a random subset of high-scoring target identifications.
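As a rough illustration of the two protocol families compared above, the following Python sketch computes a simple decoy-based FDR estimate at a score threshold, once under target-decoy competition and once under separate target and decoy searches. It is a minimal sketch under stated assumptions, not the exact variants evaluated in this work: the function names, the higher-is-better score convention, the tie-breaking rule, and the plain decoys/targets ratio (with no +1 correction or null-proportion scaling) are all illustrative choices. The mix-max procedure itself is not reproduced here because the abstract does not give its details.

```python
from typing import Sequence


def tdc_fdr(target_scores: Sequence[float],
            decoy_scores: Sequence[float],
            threshold: float) -> float:
    """Target-decoy competition (simplified, hypothetical helper).

    target_scores[i] and decoy_scores[i] are the best target and best decoy
    match scores for spectrum i. Only the higher-scoring of the two is kept.
    Assumption: ties go to the target; real tools often break ties randomly.
    """
    targets = 0
    decoys = 0
    for t, d in zip(target_scores, decoy_scores):
        if t >= d:
            # The target match wins the competition for this spectrum.
            if t >= threshold:
                targets += 1
        else:
            # The decoy match wins; the target match is discarded.
            if d >= threshold:
                decoys += 1
    if targets == 0:
        return 0.0  # convention chosen for this sketch: nothing accepted
    return min(1.0, decoys / targets)


def separate_search_fdr(target_scores: Sequence[float],
                        decoy_scores: Sequence[float],
                        threshold: float) -> float:
    """Separate target and decoy searches (simplified, hypothetical helper).

    Every spectrum contributes one target score and one decoy score, and the
    two lists are thresholded independently, with no competition.
    """
    targets = sum(1 for t in target_scores if t >= threshold)
    decoys = sum(1 for d in decoy_scores if d >= threshold)
    if targets == 0:
        return 0.0  # same convention as above
    return min(1.0, decoys / targets)


if __name__ == "__main__":
    # Made-up scores for five spectra, for illustration only.
    target = [9.0, 7.5, 3.0, 6.0, 2.0]
    decoy = [1.0, 2.0, 4.0, 5.0, 8.0]
    print("TDC estimate:     ", tdc_fdr(target, decoy, 4.0))
    print("Separate estimate:", separate_search_fdr(target, decoy, 4.0))
```

On these toy scores the two estimators disagree at the same threshold, because competition removes the losing match for each spectrum before counting, while the separate-search count keeps every target and every decoy score. The toy numbers say nothing about which estimate is closer to the true FDR; they only show that the two counting schemes are different.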