PMID: 35668300

Combining SNP-to-gene Linking Strategies to Identify Disease Genes and Assess Disease Omnigenicity

Abstract

Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.
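The concentration statistic quoted at the end of the abstract (the share of gene-linked SNP heritability carried by the top 1% of genes) can be illustrated with a minimal sketch. This is not the authors' estimation procedure, which the abstract does not describe. It only shows the arithmetic of such a share, assuming per-gene heritability values are already available. The function name `top_fraction_share` and the toy values are hypothetical and chosen purely for illustration.

```python
def top_fraction_share(gene_h2, fraction=0.01):
    """Return the share of total per-gene heritability carried by the
    top `fraction` of genes (hypothetical helper, illustration only)."""
    if not gene_h2:
        raise ValueError("gene_h2 must not be empty")
    ranked = sorted(gene_h2, reverse=True)
    # Keep at least one gene so that small inputs still give a defined share.
    n_top = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    if total <= 0:
        raise ValueError("total heritability must be positive")
    return sum(ranked[:n_top]) / total


# Toy values invented for this example; they are not data from the study.
# One dominant gene plus 99 small-effect genes.
toy_h2 = [50.0] + [0.5] * 99
share = top_fraction_share(toy_h2)
print(f"Top 1% of genes carry {share:.1%} of gene-linked heritability")
```

In this toy input, a single gene out of 100 carries about half of the total, which mirrors the kind of concentration the abstract reports without reproducing its estimate.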

Citing Articles

Specificity, length, and luck: How genes are prioritized by rare and common variant association studies.

Spence J, Mostafavi H, Ota M, Milind N, Gjorgjieva T, Smith C. bioRxiv. 2025.

PMID: 39935885 PMC: 11812597. DOI: 10.1101/2024.12.12.628073.


Prioritizing effector genes at trait-associated loci using multimodal evidence.

Schipper M, de Leeuw C, Maciel B, Wightman D, Hubers N, Boomsma D. Nat Genet. 2025; 57(2):323-333.

PMID: 39930082 DOI: 10.1038/s41588-025-02084-7.


Autism-Associated Genes and Neighboring lncRNAs Converge on Key Gene Regulatory Networks.

Andersen R, Talukdar M, Sakamoto T, Song J, Qian X, Lee S. bioRxiv. 2025.

PMID: 39896631 PMC: 11785016. DOI: 10.1101/2025.01.20.634000.


RetroFun-RVS: A Retrospective Family-Based Framework for Rare Variant Analysis Incorporating Functional Annotations.

Mangnier L, Ruczinski I, Ricard J, Moreau C, Girard S, Maziade M. Genet Epidemiol. 2025; 49(2):e70001.

PMID: 39876583 PMC: 11775437. DOI: 10.1002/gepi.70001.


3D genomic features across >50 diverse cell types reveal insights into the genomic architecture of childhood obesity.

Trang K, Pahl M, Pippin J, Su C, Littleton S, Sharma P. Elife. 2025; 13.

PMID: 39813287 PMC: 11735026. DOI: 10.7554/eLife.95411.

