
PathIntegrate: Multivariate Modelling Approaches for Pathway-based Multi-omics Data Integration

Overview
Specialty: Biology
Date: 2024 Mar 25
PMID: 38527092
Abstract

As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods that facilitate the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is time-consuming. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular level to the pathway level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. PathIntegrate is available as an open-source Python package.
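
The two-step workflow described in the abstract (score each pathway per sample, then relate pathway scores to the outcome) can be illustrated with a toy sketch. This is not the PathIntegrate API or its models. As assumptions made purely for illustration, a mean z-score stands in for single-sample pathway analysis, an absolute Pearson correlation stands in for the predictive model, and every function name, pathway name, and simulated value below is hypothetical.

```python
import random
from statistics import mean, pstdev


def zscore_columns(matrix):
    """Standardise each molecule (column) across samples (rows)."""
    scaled_cols = []
    for col in zip(*matrix):
        mu, sd = mean(col), pstdev(col)
        scaled_cols.append([(v - mu) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]


def pathway_scores(matrix, molecules, pathways):
    """Return {pathway: [score per sample]} as the mean z-score of members.

    Assumption: a simple stand-in for single-sample pathway analysis.
    """
    z = zscore_columns(matrix)
    col_of = {name: i for i, name in enumerate(molecules)}
    scores = {}
    for name, members in pathways.items():
        cols = [col_of[m] for m in members if m in col_of]
        if cols:  # skip pathways with no measured members
            scores[name] = [mean(row[c] for c in cols) for row in z]
    return scores


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def rank_pathways(scores_by_layer, outcome):
    """Rank (strength, omics layer, pathway) by |correlation| with the outcome.

    Assumption: a univariate stand-in for a predictive model's importances.
    """
    ranked = [
        (abs(pearson(values, outcome)), layer, pathway)
        for layer, scores in scores_by_layer.items()
        for pathway, values in scores.items()
    ]
    return sorted(ranked, reverse=True)


def make_toy_layer(rng, outcome, molecules, shifted, shift=3.0):
    """Simulate Gaussian abundances; 'shifted' molecules are raised in cases."""
    return [
        [rng.gauss(0, 1) + (shift if y == 1 and m in shifted else 0.0)
         for m in molecules]
        for y in outcome
    ]


rng = random.Random(0)
outcome = [i % 2 for i in range(60)]  # 30 controls, 30 cases

# Hypothetical omics layers: (molecules, pathway definitions, shifted molecules)
layers = {
    "metabolomics": (
        ["m1", "m2", "m3", "m4", "m5", "m6"],
        {"PW_A": ["m1", "m2", "m3"], "PW_B": ["m4", "m5", "m6"]},
        {"m1", "m2", "m3"},
    ),
    "proteomics": (
        ["p1", "p2", "p3", "p4", "p5", "p6"],
        {"PW_C": ["p1", "p2", "p3"], "PW_D": ["p4", "p5", "p6"]},
        set(),
    ),
}

scores_by_layer = {}
for layer, (molecules, pathways, shifted) in layers.items():
    X = make_toy_layer(rng, outcome, molecules, shifted)
    scores_by_layer[layer] = pathway_scores(X, molecules, pathways)

ranked = rank_pathways(scores_by_layer, outcome)
for strength, layer, pathway in ranked:
    print(f"{layer:13s} {pathway}  |r| = {strength:.2f}")
```

In this toy setting only the molecules of one metabolomics pathway differ between groups, so that pathway should rank first. A multivariate model, as the abstract describes, would additionally account for dependencies between pathways and between omics layers, which a per-pathway correlation ignores.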

Citing Articles

Omics Approaches in Understanding Insecticide Resistance in Mosquito Vectors.
Bharadwaj N, Sharma R, Subramanian M, Ragini G, Nagarajan S, Rahi M. Int J Mol Sci. 2025; 26(5).
PMID: 40076478 PMC: 11899280. DOI: 10.3390/ijms26051854.

Pathway level metabolomics analysis identifies carbon metabolism as a key factor of incident hypertension in the Estonian Biobank.
Hiie L, Kolde A, Pervjakova N, Reigo A, Abner E, Vosa U. Sci Rep. 2025; 15(1):8470.
PMID: 40069276 PMC: 11897224. DOI: 10.1038/s41598-025-92840-w.

PathX-CNN: An Enhanced Explainable Convolutional Neural Network for Survival Prediction and Pathway Analysis in Glioblastoma.
Sobhan M, Islam M, Mondal A. bioRxiv. 2025.
PMID: 39975150 PMC: 11838222. DOI: 10.1101/2025.01.24.634827.

Deciphering the molecular heterogeneity of intermediate- and (very-)high-risk non-muscle-invasive bladder cancer using multi-layered studies.
Akand M, Jatsenko T, Muilwijk T, Gevaert T, Joniau S, Van der Aa F. Front Oncol. 2024; 14:1424293.
PMID: 39497708 PMC: 11532112. DOI: 10.3389/fonc.2024.1424293.

Synthetic data generation methods in healthcare: A review on open-source tools and methods.
Pezoulas V, Zaridis D, Mylona E, Androutsos C, Apostolidis K, Tachos N. Comput Struct Biotechnol J. 2024; 23:2892-2910.
PMID: 39108677 PMC: 11301073. DOI: 10.1016/j.csbj.2024.07.005.
