
Methods for Biological Data Integration: Perspectives and Challenges

Overview
Date 2015 Oct 23
PMID 26490630
Citations 92
Abstract

Rapid technological advances have led to the production of different types of biological data and enabled construction of complex networks with various types of interactions between diverse biological entities. Standard network data analysis methods were shown to be limited in dealing with such heterogeneous networked data and consequently, new methods for integrative data analyses have been proposed. The integrative methods can collectively mine multiple types of biological data and produce more holistic, systems-level biological insights. We survey recent methods for collective mining (integration) of various types of networked biological data. We compare different state-of-the-art methods for data integration and highlight their advantages and disadvantages in addressing important biological problems. We identify the important computational challenges of these methods and provide a general guideline for which methods are suited for specific biological problems, or specific data types. Moreover, we propose that recent non-negative matrix factorization-based approaches may become the integration methodology of choice, as they are well suited and accurate in dealing with heterogeneous data and have many opportunities for further development.
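The abstract does not say which factorization variant the authors have in mind, so the following is only a minimal sketch of one simple way non-negative matrix factorization can integrate several data types. It assumes that each data type is a non-negative matrix over the same set of entities (rows) and that a single shared factor `W` is learned with Lee-Seung style multiplicative updates. The function name `joint_nmf`, its parameters, and the toy matrices `expr` and `mut` are hypothetical and are not taken from the paper.

```python
import numpy as np


def joint_nmf(matrices, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor each X_i (n x m_i) as W @ H_i with one shared non-negative W.

    Illustrative sketch only: minimizes the sum of squared Frobenius errors
    over all matrices using multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n = matrices[0].shape[0]
    # Random non-negative initialization of the shared and per-type factors.
    W = rng.random((n, rank)) + eps
    Hs = [rng.random((rank, X.shape[1])) + eps for X in matrices]
    errors = []
    for _ in range(n_iter):
        # Update each data-type-specific factor H_i with W held fixed.
        for i, X in enumerate(matrices):
            Hs[i] *= (W.T @ X) / (W.T @ W @ Hs[i] + eps)
        # Update the shared factor W using evidence from every data type.
        numer = sum(X @ H.T for X, H in zip(matrices, Hs))
        denom = W @ sum(H @ H.T for H in Hs) + eps
        W *= numer / denom
        # Track the total reconstruction error after this iteration.
        errors.append(
            sum(np.linalg.norm(X - W @ H) ** 2 for X, H in zip(matrices, Hs))
        )
    return W, Hs, errors


# Toy usage: two hypothetical data types measured on the same 6 entities.
rng = np.random.default_rng(1)
expr = rng.random((6, 4))  # stand-in for, e.g., an expression matrix
mut = rng.random((6, 3))   # stand-in for, e.g., a mutation-derived matrix
W, (H_expr, H_mut), errors = joint_nmf([expr, mut], rank=2)
print(W.shape, H_expr.shape, H_mut.shape)
```

Sharing `W` across data types is what makes this an integration step rather than two separate factorizations: each row of `W` is a low-dimensional profile of one entity that has to explain all data types at once. Methods in the literature add further structure (for example, tri-factorization or network-based constraints) that this sketch leaves out.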

Citing Articles

A computational framework for extracting biological insights from SRA cancer data.

Guimaraes P, Carvalho M, Ruiz J. Sci Rep. 2025; 15(1):8117.

PMID: 40057525 PMC: 11890766. DOI: 10.1038/s41598-025-91781-8.


Simplicity within biological complexity.

Przulj N, Malod-Dognin N. Bioinform Adv. 2025; 5(1):vbae164.

PMID: 39927291 PMC: 11805345. DOI: 10.1093/bioadv/vbae164.


Utilizing Feature Selection Techniques for AI-Driven Tumor Subtype Classification: Enhancing Precision in Cancer Diagnostics.

Wang J, Zhang Z, Wang Y. Biomolecules. 2025; 15(1).

PMID: 39858475 PMC: 11763904. DOI: 10.3390/biom15010081.


Prediction of future dementia among patients with mild cognitive impairment (MCI) by integrating multimodal clinical data.

Cirincione A, Lynch K, Bennett J, Choupan J, Varghese B, Sheikh-Bahaei N. Heliyon. 2024; 10(17):e36728.

PMID: 39281465 PMC: 11399681. DOI: 10.1016/j.heliyon.2024.e36728.


Constructing a Clinical Patient Similarity Network of Gastric Cancer.

Zhang R, Liu Z, Zhu C, Cai H, Yin K, Zhong F. Bioengineering (Basel). 2024; 11(8).

PMID: 39199766 PMC: 11351872. DOI: 10.3390/bioengineering11080808.

