Unravelling the Instability of Mutational Signatures Extraction via Archetypal Analysis
The high cosine similarity between some single-base substitution mutational signatures and their characteristic flat profiles could suggest the presence of overfitting and mathematical artefacts. The newest version (v3.3) of the signature database available in the Catalogue Of Somatic Mutations In Cancer (COSMIC) provides a collection of 79 mutational signatures, more than double the 30 profiles available in the previous version (COSMIC signatures v2), which makes the association between signatures and specific mutagenic processes more critical. This study provides a systematic assessment of the extraction task through simulation scenarios based on the latest version of the COSMIC signatures and, through a novel approach using archetypal analysis, highlights which COSMIC signatures are redundant and more likely to be mathematical artefacts. Twenty-nine archetypes were able to reconstruct the profiles of all the COSMIC signatures with cosine similarity greater than 0.8. Interestingly, these archetypes tend to group similar original signatures that share either the same aetiology or similar biological processes. We believe that these findings will encourage the development of new extraction methods that avoid redundant information among the signatures while preserving their biological interpretation.
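To make the two quantities in the abstract concrete, the following minimal sketch computes the cosine similarity between two 96-channel profiles and rebuilds a signature as a convex combination of archetypes. It is an illustration only, not the authors' pipeline: the function names `cosine_similarity` and `reconstruct`, the three random archetypes, and the mixing weights are all assumptions made for the example, and no COSMIC data is used.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two profile vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def reconstruct(weights, archetypes):
    """Convex combination of archetype profiles (rows of `archetypes`).

    Weights must be non-negative and sum to 1, which is the constraint
    archetypal analysis places on the mixing coefficients.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return weights @ np.asarray(archetypes, dtype=float)


# Synthetic stand-ins (assumption): 3 archetypes over 96 mutation channels,
# each a probability vector drawn from a flat Dirichlet distribution.
rng = np.random.default_rng(0)
archetypes = rng.dirichlet(np.ones(96), size=3)

# A hypothetical signature that is an exact mixture of the archetypes.
true_weights = np.array([0.6, 0.3, 0.1])
signature = reconstruct(true_weights, archetypes)

# A perfectly flat profile, to compare against a non-flat one.
flat = np.full(96, 1.0 / 96)

print("signature vs. its reconstruction:",
      cosine_similarity(signature, reconstruct(true_weights, archetypes)))
print("flat profile vs. archetype 0:",
      cosine_similarity(flat, archetypes[0]))
```

Because every profile is non-negative, cosine similarities between profiles can never be negative, and a flat profile overlaps with every channel of any other profile. This is one reason a fixed threshold such as 0.8 has to be read with care when flat signatures are involved.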