PMID: 36303018

A Framework for Detecting Noncoding Rare-variant Associations of Large-scale Whole-genome Sequencing Studies

Abstract

Large-scale whole-genome sequencing studies have enabled analysis of noncoding rare-variant (RV) associations with complex human diseases and traits. Variant-set analysis is a powerful approach to studying RV associations. However, existing methods have limited ability to analyze the noncoding genome. We propose a computationally efficient and robust noncoding RV association detection framework, STAARpipeline, that automatically annotates a whole-genome sequencing study and performs flexible noncoding RV association analysis. The framework covers gene-centric analysis as well as fixed window-based and dynamic window-based non-gene-centric analysis, all of which incorporate variant functional annotations. In gene-centric analysis, STAARpipeline uses STAAR to group noncoding variants based on functional categories of genes and to incorporate multiple functional annotations. In non-gene-centric analysis, STAARpipeline uses SCANG-STAAR to incorporate dynamic window sizes and multiple functional annotations. We apply STAARpipeline to identify noncoding RV sets associated with four lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several of them in an additional 9,123 TOPMed samples. We also analyze five non-lipid TOPMed traits.
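To make the fixed window-based idea concrete, the following is a minimal Python sketch of how rare variants might be grouped into overlapping fixed-size windows and summarized with an annotation-weighted burden score. It is not STAARpipeline's code or API, and it does not implement the STAAR or SCANG-STAAR tests. The `Variant` class, the function names, the 2,000 bp window, the 1,000 bp step, the 1% minor allele frequency cutoff, and the single per-variant `weight` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Variant:
    pos: int       # base-pair position on a single chromosome
    maf: float     # minor allele frequency
    weight: float  # hypothetical functional-annotation weight


def sliding_windows(
    variants: List[Variant],
    window: int = 2000,        # assumed window length in bp
    step: int = 1000,          # assumed skip length in bp
    maf_cutoff: float = 0.01,  # assumed rare-variant threshold
) -> Dict[Tuple[int, int], List[Variant]]:
    """Group rare variants into fixed-size, overlapping windows."""
    rare = sorted(
        (v for v in variants if v.maf < maf_cutoff), key=lambda v: v.pos
    )
    if not rare:
        return {}
    # First grid-aligned window start whose span can contain the first variant.
    start = max(0, ((rare[0].pos - window) // step + 1) * step)
    last = rare[-1].pos
    sets: Dict[Tuple[int, int], List[Variant]] = {}
    while start <= last:
        members = [v for v in rare if start <= v.pos < start + window]
        if members:  # keep only windows that contain at least one rare variant
            sets[(start, start + window)] = members
        start += step
    return sets


def burden_score(genotypes: Dict[int, int], members: List[Variant]) -> float:
    """Annotation-weighted count of minor alleles one sample carries in a window.

    `genotypes` maps variant position to minor-allele count (0, 1, or 2).
    """
    return sum(v.weight * genotypes.get(v.pos, 0) for v in members)


if __name__ == "__main__":
    demo = [
        Variant(1500, 0.001, 0.5),
        Variant(2500, 0.002, 1.0),
        Variant(2600, 0.200, 1.0),  # common variant, filtered out
    ]
    for span, members in sliding_windows(demo).items():
        print(span, [v.pos for v in members])
```

In an actual analysis, a per-window score like this would feed into a regression-based association test adjusted for covariates and relatedness. The sketch only illustrates how variant sets are formed, and a dynamic-window approach would search over window sizes instead of fixing one.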

Citing Articles

Streamlining Large-Scale Genomic Data Management: Insights from the UK Biobank Whole-Genome Sequencing Data.

Li X, Wood A, Yuan Y, Zhang M, Huang Y, Hawkes G medRxiv. 2025.

PMID: 39974066 PMC: 11838927. DOI: 10.1101/2025.01.27.25321225.


Assessment of the functionality and usability of open-source rare variant analysis pipelines.

Riccio C, Jansen M, Thalen F, Koliopanos G, Link V, Ziegler A Brief Bioinform. 2025; 26(1).

PMID: 39907318 PMC: 11795309. DOI: 10.1093/bib/bbaf044.


RetroFun-RVS: A Retrospective Family-Based Framework for Rare Variant Analysis Incorporating Functional Annotations.

Mangnier L, Ruczinski I, Ricard J, Moreau C, Girard S, Maziade M Genet Epidemiol. 2025; 49(2):e70001.

PMID: 39876583 PMC: 11775437. DOI: 10.1002/gepi.70001.


Zim4rv: an R package to modeling zero-inflated count phenotype on regional-based rare variants.

Liu X, Li Y, Fan Q BMC Bioinformatics. 2025; 26(1):18.

PMID: 39819419 PMC: 11740424. DOI: 10.1186/s12859-024-06029-5.


Review on GPU accelerated methods for genome-wide SNP-SNP interactions.

Ren W, Liang Z Mol Genet Genomics. 2024; 300(1):10.

PMID: 39738695 DOI: 10.1007/s00438-024-02214-6.

