
Combining Genetic Constraint with Predictions of Alternative Splicing to Prioritize Deleterious Splicing in Rare Disease Studies

Overview
Publisher: BioMed Central
Specialty: Biology
Date: 2022 Nov 15
PMID: 36376793
Abstract

Background: Despite numerous molecular and computational advances, roughly half of patients with a rare disease remain undiagnosed after exome or genome sequencing. A particularly challenging barrier to diagnosis is identifying variants that cause deleterious alternative splicing at intronic or exonic loci outside of canonical donor or acceptor splice sites.

Results: Several existing tools predict the likelihood that a genetic variant causes alternative splicing. We sought to extend such methods by developing a new metric that aids in discerning whether a genetic variant leads to deleterious alternative splicing. Our metric combines genetic variation in the Genome Aggregation Database (gnomAD) with alternative splicing predictions from SpliceAI to compare observed and expected levels of splice-altering genetic variation. We infer genic regions with significantly less splice-altering variation than expected to be constrained. The resulting model of regional splicing constraint captures differential splicing constraint across gene and exon categories, and the most constrained genic regions are enriched for pathogenic splice-altering variants. Building from this model, we developed ConSpliceML. This ensemble machine learning approach combines regional splicing constraint with multiple per-nucleotide alternative splicing scores to guide the prediction of deleterious splicing variants in protein-coding genes. ConSpliceML more accurately distinguishes deleterious and benign splicing variants than state-of-the-art splicing prediction methods, especially in "cryptic" splicing regions beyond canonical donor or acceptor splice sites.
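
The observed-versus-expected comparison described above can be illustrated with a minimal sketch. This is not the authors' implementation: the region names, the counts, and the helper functions `oe_ratio` and `constraint_scores` are hypothetical, and the sketch assumes that observed and expected counts of splice-altering variants per region have already been derived (for example, from population variants and splicing predictions). It ranks regions by their observed/expected (O/E) ratio only and does not perform the significance testing mentioned in the abstract.

```python
from bisect import bisect_right


def oe_ratio(observed, expected):
    """Return the observed/expected ratio of splice-altering variants."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def constraint_scores(regions):
    """Rank regions by depletion of splice-altering variation.

    `regions` maps a (hypothetical) region name to a tuple of
    (observed_count, expected_count). The returned score is the fraction
    of regions with a strictly higher O/E ratio, so regions with the
    lowest O/E (most depleted) receive the highest score.
    """
    ratios = {name: oe_ratio(obs, exp) for name, (obs, exp) in regions.items()}
    ordered = sorted(ratios.values())
    n = len(ordered)
    return {
        name: (n - bisect_right(ordered, ratio)) / n
        for name, ratio in ratios.items()
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the study.
    example = {
        "region_A": (2, 10.0),
        "region_B": (5, 10.0),
        "region_C": (8, 8.0),
        "region_D": (10, 8.0),
    }
    for name, score in sorted(constraint_scores(example).items()):
        print(name, score)
```

In this toy example, `region_A` has far fewer splice-altering variants than expected and therefore receives the highest score, which mirrors the idea that strongly depleted regions are treated as constrained.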

Conclusion: Integrating a model of genetic constraint with annotations from existing alternative splicing tools allows ConSpliceML to prioritize potentially deleterious splice-altering variants in studies of rare human diseases.

Citing Articles

Transcriptome-wide outlier approach identifies individuals with minor spliceopathies.

Arriaga M, Mendez R, Ungar R, Bonner D, Matalon D, Lemire G medRxiv. 2025.

PMID: 39802771 PMC: 11722475. DOI: 10.1101/2025.01.02.24318941.


Identification of a new spliceogenic variant causing severe primary coenzyme Q deficiency.

Alcazar-Fabra M, Ostergaard E, Fernandez-Ayala D, Desbats M, Morbidoni V, Tomas-Gallado L Mol Genet Metab Rep. 2025; 42:101176.

PMID: 39759098 PMC: 11699292. DOI: 10.1016/j.ymgmr.2024.101176.


Identification and analysis of short indels inducing exon extension/shrinkage events.

Qu Z, Sakaguchi N, Kikutake C, Suyama M FEBS Open Bio. 2024; 14(10):1682-1690.

PMID: 39085971 PMC: 11452298. DOI: 10.1002/2211-5463.13871.


All exons are not created equal-exon vulnerability determines the effect of exonic mutations on splicing.

Holm L, Doktor T, Flugt K, Petersen U, Petersen R, Andresen B Nucleic Acids Res. 2024; 52(8):4588-4603.

PMID: 38324470 PMC: 11077056. DOI: 10.1093/nar/gkae077.


Benchmarking splice variant prediction algorithms using massively parallel splicing assays.

Smith C, Kitzman J Genome Biol. 2023; 24(1):294.

PMID: 38129864 PMC: 10734170. DOI: 10.1186/s13059-023-03144-z.

