Shallow Sparsely-Connected Autoencoders for Gene Set Projection
Overview
When analyzing biological data, it can be helpful to consider gene sets, or predefined groups of biologically related genes. Methods exist for identifying gene sets that differ between conditions, but large public datasets from consortium projects and single-cell RNA-Sequencing have opened the door to gene set analysis with more sophisticated machine learning techniques, such as autoencoders and variational autoencoders. We present shallow sparsely-connected autoencoders (SSCAs) and shallow sparsely-connected variational autoencoders (SSCVAs) as tools for projecting gene-level data onto gene sets. We tested these approaches on single-cell RNA-Sequencing data from blood cells and on RNA-Sequencing data from breast cancer patients. Both SSCA and SSCVA can recover known biological features from these datasets, and the SSCVA method often outperforms SSCA (and six existing gene set scoring algorithms) on classification and prediction tasks.
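To make the idea of projecting gene-level data onto gene sets more concrete, here is a minimal NumPy sketch of a sparsely-connected encoder. It assumes, as an illustration rather than a description of the authors' exact architecture, that each hidden unit stands for one gene set and is connected only to the genes in that set, enforced by a binary mask on the weight matrix. The helper names (`build_mask`, `encode`), the toy gene sets, the `tanh` activation, and the random untrained weights are all hypothetical choices made for this example.

```python
import numpy as np

def build_mask(genes, gene_sets):
    """Binary (n_sets x n_genes) matrix: 1 where a gene belongs to a set."""
    index = {g: i for i, g in enumerate(genes)}
    mask = np.zeros((len(gene_sets), len(genes)))
    for row, members in enumerate(gene_sets.values()):
        for g in members:
            if g in index:  # ignore set members absent from the data
                mask[row, index[g]] = 1.0
    return mask

def encode(x, weights, bias, mask):
    """Project samples (n_samples x n_genes) onto gene sets.

    Multiplying the weights by the mask zeroes every connection
    between a hidden unit and genes outside its gene set.
    """
    return np.tanh(x @ (weights * mask).T + bias)

# Toy data (assumption): two illustrative gene sets over five genes.
genes = ["CD3E", "CD4", "CD19", "MS4A1", "GAPDH"]
gene_sets = {"T_CELL": ["CD3E", "CD4"], "B_CELL": ["CD19", "MS4A1"]}

rng = np.random.default_rng(0)
mask = build_mask(genes, gene_sets)
W = rng.normal(size=mask.shape)        # untrained weights, for shape only
b = np.zeros(len(gene_sets))
x = rng.normal(size=(3, len(genes)))   # 3 samples of gene-level values

scores = encode(x, W, b, mask)         # one score per sample per gene set
```

In a trained model the weights would be fit by minimizing reconstruction error through a decoder, and a variational version would have the encoder output a mean and variance per gene set instead of a single value. Neither step is shown here. The point of the sketch is only that the mask ties each hidden unit to one gene set, so its activation can be read as a score for that set.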