PhenoSV: Interpretable Phenotype-aware Model for the Prioritization of Genes Affected by Structural Variants
Structural variants (SVs) represent a major source of genetic variation associated with phenotypic diversity and disease susceptibility. While long-read sequencing can discover over 20,000 SVs per human genome, interpreting their functional consequences remains challenging. Existing methods for identifying disease-related SVs handle only deletions and duplications and cannot prioritize the individual genes affected by an SV, especially for noncoding SVs. Here, we introduce PhenoSV, a phenotype-aware machine-learning model that interprets all major types of SVs and the genes they affect. PhenoSV segments SVs, annotates the segments with diverse genomic features, and employs a transformer-based architecture to predict their impacts under a multiple-instance learning framework. When phenotype information is available, PhenoSV further uses gene-phenotype associations to prioritize phenotype-related SVs. Evaluation on extensive human SV datasets covering all SV types demonstrates PhenoSV's superior performance over competing methods. Applications to diseases suggest that PhenoSV can identify disease-related genes from SVs. A web server and a command-line tool for PhenoSV are available at https://phenosv.wglab.org.
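The multiple-instance framing mentioned in the abstract can be illustrated with a small sketch. This is not PhenoSV's implementation: the abstract does not specify the segmentation rule, the features, or how segment-level predictions are combined. The fixed-size segmentation, the max-pooling aggregation (a common multiple-instance learning choice), the multiplicative phenotype weighting, and every name below (`Segment`, `segment_sv`, `bag_score`, `phenotype_adjusted`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One instance in the bag: a sub-interval of an SV (hypothetical type)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def segment_sv(chrom: str, start: int, end: int, size: int) -> List[Segment]:
    """Split [start, end) into consecutive segments of at most `size` bp.

    Assumption: fixed-size windows. The real tool may segment by gene
    or regulatory-element boundaries instead.
    """
    if size <= 0 or end <= start:
        raise ValueError("need size > 0 and end > start")
    return [Segment(chrom, s, min(s + size, end)) for s in range(start, end, size)]


def bag_score(instance_scores: List[float]) -> float:
    """Aggregate per-segment scores into one SV-level score by max pooling.

    Max pooling encodes the standard multiple-instance assumption that a bag
    (the SV) is positive if at least one instance (segment) is positive.
    """
    if not instance_scores:
        raise ValueError("an SV must have at least one segment")
    return max(instance_scores)


def phenotype_adjusted(score: float, relevance: float) -> float:
    """Down-weight an impact score by a gene-phenotype relevance in [0, 1].

    Assumption: simple multiplication, chosen only for illustration.
    """
    if not 0.0 <= relevance <= 1.0:
        raise ValueError("relevance must lie in [0, 1]")
    return score * relevance


if __name__ == "__main__":
    segments = segment_sv("chr1", 100, 350, 100)
    # Placeholder per-segment scores; a trained model would produce these.
    scores = [0.1, 0.8, 0.3]
    sv_score = bag_score(scores)
    print(len(segments), sv_score, phenotype_adjusted(sv_score, 0.5))
```

In this sketch the segment with the highest score determines the SV-level prediction, which is also what makes the result interpretable: the top-scoring segment points to the region, and hence the gene, most likely responsible.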