PMID: 39354632

MeSH2Matrix: Combining MeSH Keywords and Machine Learning for Biomedical Relation Classification Based on PubMed

Abstract

Advanced machine learning techniques applied to the raw text of scholarly publications have significantly improved biomedical relation classification. However, because these algorithms rely on large chunks of raw text, they suffer in terms of generalization, precision, and reliability. The distinctive characteristics of bibliographic metadata can prove effective in achieving better performance on this challenging task. In this paper, we introduce an approach to biomedical relation classification that uses the qualifiers of co-occurring Medical Subject Headings (MeSH). First, we introduce MeSH2Matrix, a dataset of 46,469 biomedical relations curated from PubMed publications using our approach. The dataset includes a matrix that maps associations between the qualifiers of subject MeSH keywords and those of object MeSH keywords. It also specifies, for each relation, the corresponding Wikidata relation type and the superclass of semantic relations. Using MeSH2Matrix, we build and train three machine learning models (a Support Vector Machine [SVM], a dense model [D-Model], and a convolutional neural network [C-Net]) to evaluate the effectiveness of our approach for biomedical relation classification. Our best model achieves an accuracy of 70.78% for 195 classes and 83.09% for five superclasses. Finally, we provide confusion-matrix and extensive feature analyses to better examine the relationship between the MeSH qualifiers and the biomedical relations being classified. We hope our results will inform the development of better algorithms for biomedical ontology classification based on the MeSH keywords of PubMed publications. For reproducibility, MeSH2Matrix and all our source code are publicly available at https://github.com/SisonkeBiotik-Africa/MeSH2Matrix .
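
The qualifier-association matrix described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the qualifier vocabulary, the function name `build_qualifier_matrix`, the input layout, and the use of raw co-occurrence counts are all assumptions made for illustration, and the published dataset may be built and normalized differently.

```python
from itertools import product

# Hypothetical, heavily truncated qualifier vocabulary (assumption);
# the real set of MeSH qualifiers is much larger.
QUALIFIERS = ["drug therapy", "therapeutic use", "adverse effects", "genetics"]
INDEX = {q: i for i, q in enumerate(QUALIFIERS)}


def build_qualifier_matrix(publications):
    """Count subject-qualifier x object-qualifier co-occurrences.

    `publications` is an iterable of (subject_qualifiers, object_qualifiers)
    pairs, one pair per publication in which both MeSH keywords co-occur.
    Rows index subject qualifiers; columns index object qualifiers.
    """
    n = len(QUALIFIERS)
    matrix = [[0] * n for _ in range(n)]
    for subj_quals, obj_quals in publications:
        # Each distinct (subject qualifier, object qualifier) pair counts
        # once per publication.
        for s, o in product(set(subj_quals), set(obj_quals)):
            if s in INDEX and o in INDEX:
                matrix[INDEX[s]][INDEX[o]] += 1
    return matrix


# Toy input (assumption): two publications for one (subject, object) pair.
toy_publications = [
    (["drug therapy"], ["therapeutic use"]),
    (["drug therapy", "genetics"], ["therapeutic use", "adverse effects"]),
]
matrix = build_qualifier_matrix(toy_publications)
```

A matrix of this kind, flattened or kept two-dimensional, could then serve as the input features for classifiers such as the SVM, dense, and convolutional models named in the abstract.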

Citing Articles

A framework for integrating biomedical knowledge in Wikidata with open biological and biomedical ontologies and MeSH keywords.

Turki H, Chebil K, Dossou B, Emezue C, Owodunni A, Hadj Taieb M. Heliyon. 2024; 10(19):e38448.

PMID: 39403518 PMC: 11471508. DOI: 10.1016/j.heliyon.2024.e38448.
