Data Integration Challenges for Machine Learning in Precision Medicine

Overview
Specialty: General Medicine
Date: 2022 Feb 11
PMID: 35145977
Abstract

A main goal of Precision Medicine is to incorporate and integrate the vast corpora held in different databases about the molecular and environmental origins of disease into analytic frameworks that allow the development of individualized, context-dependent diagnostics and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at predicting personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to efficiently manage, visualize, and integrate large datasets that combine structured and unstructured formats. This needs to be done under different levels of confidentiality, ideally within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in designing successful approaches to medical data analytics under the currently demanding performance conditions of personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we review some of these constraints and discuss possible avenues to overcome current challenges.

Citing Articles

The Role of Machine Learning Models in Predicting Cirrhosis Mortality: A Systematic Review.

Mohamud K, Elzubair Eltahir S, Ahmed Alhardalo H, Albashir H, Ali Mohamed Zain N, Abdelrahman Ibrahim M. Cureus. 2025; 17(1):e78155.

PMID: 40026938 PMC: 11867977. DOI: 10.7759/cureus.78155.


Advancing Health Care With Digital Twins: Meta-Review of Applications and Implementation Challenges.

Ringeval M, Etindele Sosso F, Cousineau M, Pare G. J Med Internet Res. 2025; 27:e69544.

PMID: 39969978 PMC: 11888003. DOI: 10.2196/69544.


The Era of Preemptive Medicine: Developing Medical Digital Twins through Omics, IoT, and AI Integration.

Ooka T. JMA J. 2025; 8(1):1-10.

PMID: 39926086 PMC: 11799569. DOI: 10.31662/jmaj.2024-0213.


Unlocking precision medicine: clinical applications of integrating health records, genetics, and immunology through artificial intelligence.

Chen Y, Hsiao T, Lin C, Fann Y. J Biomed Sci. 2025; 32(1):16.

PMID: 39915780 PMC: 11804102. DOI: 10.1186/s12929-024-01110-w.


Omilayers: a Python package for efficient data management to support multi-omic analysis.

Kioroglou D. BMC Bioinformatics. 2025; 26(1):40.

PMID: 39915756 PMC: 11800426. DOI: 10.1186/s12859-025-06067-7.

