Secure Tumor Classification by Shallow Neural Network Using Homomorphic Encryption

Overview
Journal BMC Genomics
Publisher Biomed Central
Specialty Genetics
Date 2022 Apr 9
PMID 35395714
Abstract

Background: Applying machine learning techniques to tumor classification can expose patients' genetic information, which compromises the privacy of personal data. Homomorphic Encryption (HE), which supports operations on encrypted data, is one tool for performing such computation without information leakage. However, HE supports only a limited set of operations, which makes it difficult to apply general machine learning algorithms directly. In particular, non-polynomial activation functions, including the softmax function, are difficult to implement with HE and require a suitable approximation method to minimize the loss of accuracy. The secure genome analysis competition iDASH 2020 posed this problem as a competition task: a multi-label tumor classification method that uses HE to predict the class of samples from genetic information.

Methods: We develop a secure multi-label tumor classification method that uses HE to ensure privacy throughout the computations of the model inference process. Our solution is based on a 1-layer neural network with softmax activation and uses the approximate HE scheme. We present an approximation method that enables softmax activation in the model using HE, and a technique for efficiently encoding data to reduce computational costs. In addition, we propose an HE-friendly data filtering method to reduce the size of large-scale genetic data.
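As an illustration of why softmax needs an approximation under HE, the following plaintext sketch evaluates a softmax using only additions and multiplications, the operations an approximate HE scheme provides. It is not the approximation method of this paper, which the abstract does not specify. The choices here are assumptions made for illustration: the exponential is replaced by (1 + x/2^r)^(2^r) computed with r squarings, the division by Goldschmidt's iteration, and the logits are assumed to lie in a known interval [-bound, bound]. The names `approx_exp`, `approx_inverse`, and `approx_softmax` are hypothetical.

```python
import numpy as np

def approx_exp(x, r=10):
    """Approximate exp(x) as (1 + x / 2**r) ** (2**r) using r squarings."""
    y = 1.0 + x * (1.0 / 2 ** r)  # multiplication by a plaintext constant
    for _ in range(r):
        y = y * y                 # each squaring doubles the exponent
    return y

def approx_inverse(x, iters=10):
    """Approximate 1/x for 0 < x < 2 with Goldschmidt's iteration."""
    a = 1.0 - x
    y = 1.0 + a
    for _ in range(iters):
        a = a * a
        y = y * (1.0 + a)         # y converges to 1 / x
    return y

def approx_softmax(logits, bound, r=10, iters=10):
    """Softmax built from additions and multiplications only.

    Assumes every logit lies in [-bound, bound] (an illustrative assumption),
    so the sum of exponentials is at most n_classes * exp(bound).
    """
    logits = np.asarray(logits, dtype=float)
    e = approx_exp(logits, r)
    s = e.sum(axis=-1, keepdims=True)
    upper = logits.shape[-1] * np.exp(bound)  # plaintext scaling constant
    # Scale the sum into (0, 1] so that the inverse iteration converges.
    inv = approx_inverse(s * (1.0 / upper), iters) * (1.0 / upper)
    return e * inv
```

In this sketch the multiplicative depth grows with `r` and `iters`, which is the kind of cost an HE implementation has to trade against accuracy. The inverse converges slowly when the sum of exponentials is far below its upper bound, so a tight `bound` matters.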

Results: We analyze a dataset from The Cancer Genome Atlas (TCGA) that consists of 3,622 samples from 11 types of cancer, with genetic features from 25,128 genes. Our preprocessing method reduces the number of genes to 4,096 or fewer and achieves a microAUC of 0.9882 (85% accuracy) with a 1-layer shallow neural network. Using our model, we compute the tumor classification inference steps on the encrypted test data in 3.75 minutes. Owing to its exceptionally high microAUC, our solution was awarded co-first place in iDASH 2020 Track 1: "Secure multi-label Tumor classification using Homomorphic Encryption".
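For readers unfamiliar with the metric, the sketch below shows one common way to compute a micro-averaged AUC: the one-hot labels and predicted scores of all classes are flattened into a single binary problem, and the AUC is obtained from the rank-sum (Mann-Whitney) statistic. This is an assumption about how microAUC is defined, not a reproduction of the competition's evaluation script. The function names `binary_auc` and `micro_auc` are hypothetical.

```python
import numpy as np

def binary_auc(y_true, scores):
    """AUC from the rank-sum statistic, using average ranks for ties."""
    y_true = np.asarray(y_true).ravel().astype(bool)
    scores = np.asarray(scores, dtype=float).ravel()
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    # Average 1-based rank of each group of tied scores.
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

def micro_auc(labels, probs):
    """Micro-averaged AUC: pool every (sample, class) pair into one binary task."""
    probs = np.asarray(probs, dtype=float)
    one_hot = np.eye(probs.shape[1])[np.asarray(labels)]
    return binary_auc(one_hot, probs)
```

Because every (sample, class) pair is pooled, the many easy negatives can keep this value high even when top-1 accuracy is noticeably lower, which is consistent with a microAUC of 0.9882 alongside 85% accuracy.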

Conclusions: Our solution is the first implementation of a neural network model with softmax activation using HE. In addition, the HE optimization methods presented in this work enable HE-based machine learning implementations and other challenging HE applications.

Citing Articles

Secure and scalable gene expression quantification with pQuant.

Hong S, Walker C, Choi Y, Gursoy G Nat Commun. 2025; 16(1):2380.

PMID: 40064866 PMC: 11894182. DOI: 10.1038/s41467-025-57393-6.


Privacy-preserving biological age prediction over federated human methylation data using fully homomorphic encryption.

Goldenberg M, Mualem L, Shahar A, Snir S, Akavia A Genome Res. 2024; 34(9):1324-1333.

PMID: 39237299 PMC: 11529865. DOI: 10.1101/gr.279071.124.


Privacy-preserving model evaluation for logistic and linear regression using homomorphically encrypted genotype data.

Hong S, Choi Y, Joo D, Gursoy G J Biomed Inform. 2024; 156:104678.

PMID: 38936565 PMC: 11272436. DOI: 10.1016/j.jbi.2024.104678.
