Meta-Research: Understudied Genes Are Lost in a Leaky Pipeline Between Genome-wide Assays and Reporting of Results

Overview
Journal eLife
Specialty Biology
Date 2024 Mar 28
PMID 38546716
Abstract

Present-day publications on human genes primarily feature genes that already appeared in many publications prior to completion of the Human Genome Project in 2003. These patterns persist despite the subsequent adoption of high-throughput technologies, which routinely identify novel genes associated with biological processes and disease. Although several hypotheses for bias in the selection of genes as research targets have been proposed, their explanatory powers have not yet been compared. Our analysis suggests that understudied genes are systematically abandoned in favor of better-studied genes between the completion of -omics experiments and the reporting of results. Understudied genes remain abandoned by studies that cite these -omics experiments. Conversely, we find that publications on understudied genes may even accrue a greater number of citations. Among 45 biological and experimental factors previously proposed to affect which genes are being studied, we find that 33 are significantly associated with the choice of hit genes presented in titles and abstracts of -omics studies. To promote the investigation of understudied genes, we condense our insights into a tool, find my understudied genes (FMUG), that allows scientists to engage with potential bias during the selection of hits. We demonstrate the utility of FMUG through the identification of genes that remain understudied in vertebrate aging. FMUG is developed in Flutter and is available for download at fmug.amaral.northwestern.edu as a MacOS/Windows app.

Citing Articles

Pipeline to explore information on genome editing using large language models and genome editing meta-database.

Suzuki T, Bono H Database (Oxford). 2025; 2025.

PMID: 40056431 PMC: 11890094. DOI: 10.1093/database/baaf022.


Selecting genes for analysis using historically contingent progress: from RNA changes to protein-protein interactions.

Lalit F, Jose A Nucleic Acids Res. 2025; 53(1).

PMID: 39788543 PMC: 11717427. DOI: 10.1093/nar/gkae1246.


Comprehensive identification of GASA genes in sunflower and expression profiling in response to drought.

Asad Ullah M, Ahmed M, AlHusnain L, Zia M, AlKahtani M, Attia K BMC Genomics. 2024; 25(1):954.

PMID: 39402437 PMC: 11472593. DOI: 10.1186/s12864-024-10860-8.


Gene length could be a critical factor in the aging of the genome.

Brouillette M Proc Natl Acad Sci U S A. 2024; 121(37):e2416630121.

PMID: 39236237 PMC: 11406259. DOI: 10.1073/pnas.2416630121.


Underexplored Molecular Mechanisms of Toxicity.

Arowolo O, Suvorov A J Xenobiot. 2024; 14(3):939-949.

PMID: 39051348 PMC: 11270369. DOI: 10.3390/jox14030052.

