
Logistic Regression over Encrypted Data from Fully Homomorphic Encryption

Overview
Publisher BioMed Central
Specialty Genetics
Date 2018 Oct 13
PMID 30309350
Citations 14
Abstract

Background: One of the tasks in the 2017 iDASH secure genome analysis competition was to enable training of logistic regression models over encrypted genomic data. More precisely, given a list of approximately 1500 patient records, each with 18 binary features containing information on specific mutations, the idea was for the data holder to encrypt the records using homomorphic encryption and send them to an untrusted cloud for storage. The cloud could then homomorphically apply a training algorithm to the encrypted data to obtain an encrypted logistic regression model, which could be sent back to the data holder for decryption. In this way, the data holder could outsource the training process without revealing either her sensitive data or the trained model to the cloud.

Methods: Our solution to this problem has several novelties: we use a multi-bit plaintext space in fully homomorphic encryption together with fixed point number encoding; we combine bootstrapping in fully homomorphic encryption with a scaling operation in fixed point arithmetic; we use a minimax polynomial approximation to the sigmoid function and the 1-bit gradient descent method to reduce the plaintext growth in the training process.
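The two training-side ideas named above, a low-degree polynomial in place of the sigmoid and a gradient step that keeps only one bit (the sign) per coordinate, can be sketched in plaintext Python. This is a minimal illustration and not the authors' implementation: no encryption, bootstrapping, or fixed point encoding is performed; the cubic is obtained by a least-squares fit as a stand-in for the minimax fit (the paper's coefficients are not given in the abstract); and the fit interval, step size, iteration count, synthetic records, and all function names are assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    """Exact logistic function, used here only as the fitting target."""
    return 1.0 / (1.0 + math.exp(-x))


def fit_cubic_sigmoid(bound=4.0, samples=801):
    """Least-squares fit of sigmoid(x) ~ 0.5 + a*x + b*x**3 on [-bound, bound].

    Assumption: least squares stands in for the minimax fit named in the
    abstract; the interval [-4, 4] is also an illustrative choice.
    """
    xs = [-bound + 2.0 * bound * i / (samples - 1) for i in range(samples)]
    s2 = sum(x ** 2 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    s6 = sum(x ** 6 for x in xs)
    t1 = sum(x * (sigmoid(x) - 0.5) for x in xs)
    t3 = sum(x ** 3 * (sigmoid(x) - 0.5) for x in xs)
    det = s2 * s6 - s4 * s4  # 2x2 normal equations, solved by Cramer's rule
    a = (t1 * s6 - t3 * s4) / det
    b = (s2 * t3 - s4 * t1) / det
    return a, b


def poly_sigmoid(z, a, b):
    """Cubic surrogate: only additions and multiplications, as HE requires."""
    return 0.5 + a * z + b * z ** 3


def sign(v):
    """Return -1, 0, or 1."""
    return (v > 0) - (v < 0)


def train_one_bit(records, labels, a, b, step=0.02, iterations=30):
    """Gradient descent that keeps only the sign of each gradient entry."""
    dim = len(records[0])
    weights = [0.0] * dim
    for _ in range(iterations):
        grad = [0.0] * dim
        for x, y in zip(records, labels):
            z = sum(w * xi for w, xi in zip(weights, x))
            err = poly_sigmoid(z, a, b) - y
            for j, xi in enumerate(x):
                grad[j] += err * xi
        # Each weight moves by at most one fixed step per iteration.
        weights = [w - step * sign(g) for w, g in zip(weights, grad)]
    return weights


# Synthetic stand-in data: 200 records with 18 binary features each.
random.seed(0)
NUM_FEATURES = 18
hidden = [random.uniform(-1, 1) for _ in range(NUM_FEATURES)]
records = [[random.randint(0, 1) for _ in range(NUM_FEATURES)]
           for _ in range(200)]
labels = [1 if sum(h * xi for h, xi in zip(hidden, x)) > 0 else 0
          for x in records]

A, B = fit_cubic_sigmoid()
weights = train_one_bit(records, labels, A, B)
print("cubic coefficients:", A, B)
print("trained weights:", [round(w, 2) for w in weights])
```

Because every update is a fixed step of known size, the weights stay within `step * iterations` of their starting value, which is one way to see how a sign-only update keeps the magnitude of intermediate values bounded. The cubic surrogate is only meaningful inside the interval it was fitted on.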

Results: Our algorithm for training over encrypted data takes 0.4-3.2 hours per iteration of gradient descent.

Conclusions: We demonstrate the feasibility but high computational cost of training over encrypted data. On the other hand, our method can guarantee the highest level of data privacy in critical applications.
