
A Machine Learning Framework That Integrates Multi-omics Data Predicts Cancer-related LncRNAs

Overview
Publisher Biomed Central
Specialty Biology
Date 2021 Jun 17
PMID 34134612
Citations 13
Abstract

Background: Long non-coding RNAs (lncRNAs) are non-coding RNA molecules with transcripts longer than 200 nucleotides. LncRNAs have emerged as novel candidate biomarkers for cancer diagnosis and prognosis. However, the true mechanisms of association between lncRNAs and complex diseases are difficult to uncover. The unprecedented growth of multi-omics data and the rapid development of machine learning technology make it possible to design a machine learning framework for studying the relationship between lncRNAs and complex diseases.

Results: In this article, we propose a new machine learning approach, LGDLDA (LncRNA-Gene-Disease association networks based LncRNA-Disease Association prediction), which predicts disease-related lncRNAs from multi-omics data using machine learning methods and neural network neighborhood information aggregation. First, LGDLDA calculates separate similarity matrices for lncRNAs, genes and diseases. LncRNA similarity is computed from the lncRNA expression profile matrix, the lncRNA-miRNA interaction matrix and the lncRNA-protein interaction matrix. Gene similarity is computed from the lncRNA-gene association matrix and the gene-disease association matrix. Disease similarity is computed from the disease ontology, the disease-miRNA association matrix and Gaussian interaction profile kernel similarity. Second, LGDLDA integrates the neighborhood information in the similarity matrices through the nonlinear feature learning of a neural network. Third, LGDLDA uses the embedded node representations to approximate the observed matrices. Finally, LGDLDA ranks candidate lncRNA-disease pairs and selects potential disease-related lncRNAs.
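Two of the steps named above, Gaussian interaction profile (GIP) kernel similarity and the final ranking of candidate pairs, can be illustrated with a minimal sketch. This is not the LGDLDA implementation: the kernel follows the commonly used GIP form with the bandwidth normalized by the mean squared profile norm, and the function names, the `gamma_prime` default of 1.0, the toy association matrix and the toy score matrix are all assumptions made for illustration.

```python
import numpy as np


def gip_kernel_similarity(profiles, gamma_prime=1.0):
    """GIP kernel similarity between the rows of `profiles`.

    Each row is the binary interaction profile of one entity (for example,
    one disease's associations across all lncRNAs). `gamma_prime` is an
    assumed bandwidth setting, not a value taken from the abstract.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Normalize the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


def rank_candidates(scores, known, disease_idx):
    """Return lncRNA indices with no known link to the disease, best score first."""
    candidates = np.flatnonzero(known[:, disease_idx] == 0)
    order = np.argsort(-scores[candidates, disease_idx], kind="stable")
    return candidates[order]


# Toy lncRNA-disease association matrix (4 lncRNAs x 3 diseases), illustration only.
assoc = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
])

# Disease similarity: each disease's profile is a column of `assoc`.
disease_sim = gip_kernel_similarity(assoc.T)

# Hypothetical predicted scores standing in for a trained model's output.
scores = np.array([
    [0.9, 0.3, 0.1],
    [0.8, 0.9, 0.4],
    [0.2, 0.8, 0.9],
    [0.7, 0.1, 0.9],
])
ranked_for_disease_0 = rank_candidates(scores, assoc, disease_idx=0)
```

In this toy setting, lncRNAs 0 and 1 are already linked to disease 0, so only lncRNAs 2 and 3 are ranked as candidates. In a full pipeline, the score matrix would come from the learned node embeddings rather than being written by hand.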

Conclusions: Compared with other lncRNA-disease prediction methods, the proposed method takes more critical information into account and improves the prediction of cancer-related lncRNAs. Experiments on randomly split data show that LGDLDA is more stable than IDHI-MIRW, NCPLDA, LncDisAP and NCPHLDA. Results on different simulated data sets show that LGDLDA can accurately and effectively predict disease-related lncRNAs. Furthermore, we applied the method to three real cancer data sets (gastric cancer, colorectal cancer and breast cancer) to predict potential cancer-related lncRNAs.

Citing Articles

Evaluating Neural Network Performance in Predicting Disease Status and Tissue Source of JC Polyomavirus from Patient Isolates Based on the Hypervariable Region of the Viral Genome.

Pike A, Amal S, Maginnis M, Wilczek M Viruses. 2025; 17(1).

PMID: 39861801 PMC: 11769028. DOI: 10.3390/v17010012.


LDAGM: prediction lncRNA-disease asociations by graph convolutional auto-encoder and multilayer perceptron based on multi-view heterogeneous networks.

Zhang B, Wang H, Ma C, Huang H, Fang Z, Qu J BMC Bioinformatics. 2024; 25(1):332.

PMID: 39407120 PMC: 11481433. DOI: 10.1186/s12859-024-05950-z.


Integrating Omics Data and AI for Cancer Diagnosis and Prognosis.

Ozaki Y, Broughton P, Abdollahi H, Valafar H, Blenda A Cancers (Basel). 2024; 16(13).

PMID: 39001510 PMC: 11240413. DOI: 10.3390/cancers16132448.


miRNAs in Heart Development and Disease.

Lozano-Velasco E, Inacio J, Sousa I, Guimaraes A, Franco D, Moura G Int J Mol Sci. 2024; 25(3).

PMID: 38338950 PMC: 10855082. DOI: 10.3390/ijms25031673.


Multi-Omics Mining of lncRNAs with Biological and Clinical Relevance in Cancer.

Salido-Guadarrama I, Romero-Cordoba S, Rueda-Zarazua B Int J Mol Sci. 2023; 24(23).

PMID: 38068923 PMC: 10706612. DOI: 10.3390/ijms242316600.

