
Biomedical Vocabulary Alignment at Scale in the UMLS Metathesaurus

Overview
Publisher Elsevier
Date 2021 Sep 13
PMID 34514472
Citations 8
Abstract

With 214 source vocabularies, the UMLS (Unified Medical Language System) Metathesaurus terminology integration system is costly, time-consuming, and error-prone to construct and maintain, because the process relies primarily on (1) lexical and semantic processing for suggesting groupings of synonymous terms, and (2) the expertise of UMLS editors for curating these synonymy predictions. This paper aims to improve the UMLS Metathesaurus construction process by developing a novel supervised learning approach to suggesting synonymous pairs that can scale to the size and diversity of the UMLS source vocabularies. We evaluate this deep learning (DL) approach against a rule-based approach (RBA) that approximates the current UMLS Metathesaurus construction process. The key to the generalizability of our approach is the use of various degrees of lexical similarity in negative pairs during the training process. Our initial experiments demonstrate the strong performance of our DL approach across multiple datasets in terms of recall (91-92%), precision (88-99%), and F1 score (89-95%). Our DL approach largely outperforms the RBA in recall (+23%), precision (+2.4%), and F1 score (+14.1%). This novel approach has great potential for improving the UMLS Metathesaurus construction process by providing better synonymy suggestions to the UMLS editors.
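The abstract names negative pairs with varying degrees of lexical similarity as the key to generalizability, but it does not describe how those pairs are built. The Python sketch below is one possible illustration, not the authors' procedure. It assumes token-level Jaccard overlap as the similarity measure, three arbitrary similarity bands, and toy concept labels ("A" to "D") in place of real UMLS identifiers; the names `jaccard`, `band`, `sample_negatives`, and `TOY_TERMS` are all hypothetical.

```python
import random
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard overlap between two term strings (an assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def band(sim: float) -> str:
    """Map a similarity score to a band; the thresholds are arbitrary assumptions."""
    if sim == 0.0:
        return "none"
    if sim < 0.5:
        return "low"
    return "high"


def sample_negatives(terms, per_band, seed=0):
    """Sample non-synonymous pairs from each lexical-similarity band.

    terms: list of (term_string, concept_label) tuples.
    Returns a dict mapping band name to a list of (term1, term2) pairs.
    """
    rng = random.Random(seed)
    buckets = {"none": [], "low": [], "high": []}
    for (t1, c1), (t2, c2) in combinations(terms, 2):
        if c1 != c2:  # different concepts, so the pair is a negative example
            buckets[band(jaccard(t1, t2))].append((t1, t2))
    return {
        name: rng.sample(pairs, min(per_band, len(pairs)))
        for name, pairs in buckets.items()
    }


# Toy data with made-up concept labels, for illustration only.
TOY_TERMS = [
    ("heart attack", "A"),
    ("myocardial infarction", "A"),
    ("heart failure", "B"),
    ("cardiac failure", "B"),
    ("chronic heart failure", "C"),
    ("renal failure", "D"),
]

negatives = sample_negatives(TOY_TERMS, per_band=2)
```

Sampling from every band, rather than only from random (mostly dissimilar) pairs, exposes a classifier to hard negatives such as "heart failure" versus "chronic heart failure", which share most of their tokens yet belong to different toy concepts here.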

Citing Articles

Assessing the impact of transitioning to 11th revision of the International Classification of Diseases (ICD-11) on comorbidity indices.

Nikiema J, Thiam D, Bayani A, Ayotte A, Sourial N, Bally M. J Am Med Inform Assoc. 2024; 31(6):1219-1226.

PMID: 38489540 PMC: 11105143. DOI: 10.1093/jamia/ocae046.


Mapping Chinese Medical Entities to the Unified Medical Language System.

Chen L, Qi Y, Wu A, Deng L, Jiang T. Health Data Sci. 2024; 3:0011.

PMID: 38487197 PMC: 10880171. DOI: 10.34133/hds.0011.


A GCN-based approach to uncover misaligned synonymous terms in the UMLS Metathesaurus.

Hao X, Abeysinghe R, Shi J, Cui L. AMIA Annu Symp Proc. 2024; 2023:977-986.

PMID: 38222357 PMC: 10785861.


Two complementary AI approaches for predicting UMLS semantic group assignment: heuristic reasoning and deep learning.

Mao Y, Miller R, Bodenreider O, Nguyen V, Fung K. J Am Med Inform Assoc. 2023; 30(12):1887-1894.

PMID: 37528056 PMC: 10654847. DOI: 10.1093/jamia/ocad152.


Context-Enriched Learning Models for Aligning Biomedical Vocabularies at Scale in the UMLS Metathesaurus.

Nguyen V, Yip H, Bajaj G, Wijesiriwardene T, Javangula V, Parthasarathy S. Proc Int World Wide Web Conf. 2022; 2022:1037-1046.

PMID: 36108322 PMC: 9455675. DOI: 10.1145/3485447.3511946.

