Deep Neural Networks with Controlled Variable Selection for the Identification of Putative Causal Genetic Variants
Deep neural networks (DNNs) have been used successfully in many scientific problems because of their high prediction accuracy, but their application to genetic studies remains challenging because of their poor interpretability. Here we consider the problem of scalable, robust variable selection in DNNs for the identification of putative causal genetic variants in genome sequencing studies. We identified pronounced randomness in feature selection in DNNs, due to their stochastic nature, which may hinder interpretability and give rise to misleading results. We propose an interpretable neural network model, stabilized using ensembling, with controlled variable selection for genetic studies. The merits of the proposed method include: flexible modelling of the nonlinear effects of genetic variants to improve statistical power; multiple knockoffs in the input layer to rigorously control the false discovery rate; hierarchical layers to substantially reduce the number of weight parameters and activations and to improve computational efficiency; and stabilized feature selection to reduce the randomness in identified signals. We evaluate the proposed method in extensive simulation studies and apply it to the analysis of Alzheimer's disease genetics. We show that the proposed method can lead to substantially more discoveries than conventional linear and nonlinear methods.
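To make the idea of knockoff-based false discovery rate control and ensembling more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single knockoff copy per variant and uses the standard knockoff+ threshold, whereas the abstract describes multiple knockoffs, whose selection rule differs. The function names (`ensemble_statistics`, `knockoff_threshold`, `knockoff_select`) and the importance arrays are hypothetical; in practice the importances would come from the trained networks.

```python
import numpy as np


def ensemble_statistics(imp_orig, imp_knock):
    """Average knockoff statistics over several training runs.

    imp_orig, imp_knock: arrays of shape (n_runs, n_features) holding the
    importance of each original feature and of its knockoff copy
    (hypothetical inputs; one row per independently trained network).
    Returns W of shape (n_features,), where large positive values suggest
    that the original feature matters more than its knockoff.
    """
    imp_orig = np.asarray(imp_orig, dtype=float)
    imp_knock = np.asarray(imp_knock, dtype=float)
    return np.mean(np.abs(imp_orig) - np.abs(imp_knock), axis=0)


def knockoff_threshold(W, q=0.1, offset=1):
    """Smallest t with (offset + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    offset=1 gives the knockoff+ threshold (single-knockoff setting).
    Returns np.inf when no threshold satisfies the target level q.
    """
    W = np.asarray(W, dtype=float)
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        fdp_hat = (offset + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf


def knockoff_select(W, q=0.1, offset=1):
    """Indices of features whose statistic reaches the threshold."""
    W = np.asarray(W, dtype=float)
    return np.flatnonzero(W >= knockoff_threshold(W, q, offset))


# Toy statistics, made up for illustration only.
W_demo = np.array([10, 9, 8, 7, 6, 5, 4, 3, 2, 1.5, -1, 0.5, -0.2])
selected = knockoff_select(W_demo, q=0.2)
```

Averaging the statistics over runs before thresholding is one simple way to damp run-to-run variation in which features are selected; whether the paper aggregates in this way is an assumption here.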