Finding Function: Evaluation Methods for Functional Genomic Data
Overview
Background: Accurate evaluation of the quality of genomic or proteomic data and computational methods is vital to our ability to use them for formulating novel biological hypotheses and directing further experiments. There is currently no standard approach to evaluation in functional genomics. Our analysis of existing approaches shows that they are inconsistent and contain substantial functional biases that render the resulting evaluations misleading both quantitatively and qualitatively. These problems make it essentially impossible to compare computational methods or large-scale experimental datasets and also result in conclusions that generalize poorly in most biological applications.
Results: Here we identify problems with current evaluation methods and propose new approaches that facilitate accurate and representative characterization of genomic methods and data. Specifically, we describe a functional genomics gold standard based on curation by expert biologists and demonstrate that it is an effective means of evaluating genomic approaches. Our evaluation framework and gold standard are freely available to the community through our website.
Conclusion: Proper methods for evaluating genomic data and computational approaches will determine how much we, as a community, are able to learn from the wealth of available data. We propose one possible solution to this problem here but emphasize that this topic warrants broader community discussion.