
PhenoSV: Interpretable Phenotype-aware Model for the Prioritization of Genes Affected by Structural Variants

Overview
Journal Nat Commun
Specialty Biology
Date 2023 Nov 28
PMID 38016949
Abstract

Structural variants (SVs) represent a major source of genetic variation associated with phenotypic diversity and disease susceptibility. While long-read sequencing can discover over 20,000 SVs per human genome, interpreting their functional consequences remains challenging. Existing methods for identifying disease-related SVs focus on deletion/duplication only and cannot prioritize individual genes affected by SVs, especially for noncoding SVs. Here, we introduce PhenoSV, a phenotype-aware machine-learning model that interprets all major types of SVs and genes affected. PhenoSV segments and annotates SVs with diverse genomic features and employs a transformer-based architecture to predict their impacts under a multiple-instance learning framework. With phenotype information, PhenoSV further utilizes gene-phenotype associations to prioritize phenotype-related SVs. Evaluation on extensive human SV datasets covering all SV types demonstrates PhenoSV's superior performance over competing methods. Applications in diseases suggest that PhenoSV can determine disease-related genes from SVs. A web server and a command-line tool for PhenoSV are available at https://phenosv.wglab.org.
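
The multiple-instance idea in the abstract can be illustrated with a toy sketch: an SV is treated as a bag of segments, each segment gets a score, and the bag-level and gene-level results are pooled from those scores. This is not PhenoSV's code or API. The `Segment` class, the max-pooling rule, the relevance weights, and the gene names are all hypothetical, and simple max pooling stands in for the transformer-based model the abstract describes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Segment:
    """One piece of an SV after segmentation (hypothetical structure)."""
    gene: str     # gene the segment is assigned to (placeholder name)
    score: float  # assumed per-segment impact score in [0, 1]


def bag_score(segments: List[Segment]) -> float:
    """Max-pooling MIL rule: the SV scores as high as its top segment."""
    if not segments:
        raise ValueError("an SV must contain at least one segment")
    return max(s.score for s in segments)


def rank_genes(
    segments: List[Segment], phenotype_relevance: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Rank genes by their best segment score weighted by phenotype relevance."""
    best: Dict[str, float] = {}
    for s in segments:
        # Genes with no known phenotype association get weight 0 (assumption).
        weighted = s.score * phenotype_relevance.get(s.gene, 0.0)
        best[s.gene] = max(best.get(s.gene, 0.0), weighted)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


# Toy input: three segments over two placeholder genes.
sv = [Segment("GENE_A", 0.2), Segment("GENE_B", 0.9), Segment("GENE_A", 0.6)]
relevance = {"GENE_A": 1.0, "GENE_B": 0.5}

sv_level = bag_score(sv)
gene_ranking = rank_genes(sv, relevance)
```

In this toy input the unweighted SV-level score is driven by the `GENE_B` segment, while the phenotype weighting moves `GENE_A` to the top of the gene ranking, which is the kind of reordering that phenotype information is meant to provide.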

Citing Articles

Rare pathogenic structural variants show potential to enhance prostate cancer germline testing for African men.

Gong T, Jiang J, Uthayopas K, Bornman M, Gheybi K, Stricker P Nat Commun. 2025; 16(1):2400.

PMID: 40064858 PMC: 11893795. DOI: 10.1038/s41467-025-57312-9.


Phased T2T genome assemblies facilitate the mining of disease-resistance genes in .

Luo Y, Liu Z, Jin Z, Li P, Tan X, Cao S Hortic Res. 2025; 12(2):uhae306.

PMID: 39944989 PMC: 11817892. DOI: 10.1093/hr/uhae306.


Systematic assessment of structural variant annotation tools for genomic interpretation.

Liu X, Gu L, Hao C, Xu W, Leng F, Zhang P Life Sci Alliance. 2024; 8(3).

PMID: 39658089 PMC: 11632063. DOI: 10.26508/lsa.202402949.


Rare pathogenic structural variants show potential to enhance prostate cancer germline testing for African men.

Hayes V, Gong T, Jiang J, Bornman R, Gheybi K, Stricker P Res Sq. 2024; .

PMID: 38947031 PMC: 11213160. DOI: 10.21203/rs.3.rs-4531885/v1.


Integration of transcriptomics and long-read genomics prioritizes structural variants in rare disease.

Jensen T, Ni B, Reuter C, Gorzynski J, Fazal S, Bonner D medRxiv. 2024; .

PMID: 38585781 PMC: 10996727. DOI: 10.1101/2024.03.22.24304565.
