Interoperability Between Phenotypes in Research and Healthcare Terminologies--Investigating Partial Mappings Between HPO and SNOMED CT
Background: Identifying partial mappings between two terminologies is of special importance when one terminology is finer-grained than the other, as is the case for the Human Phenotype Ontology (HPO), mainly used for research purposes, and SNOMED CT, mainly used in healthcare.
Objectives: To investigate and contrast lexical and logical approaches to deriving partial mappings between HPO and SNOMED CT.
Methods: 1) Lexical approach: we identify modifiers in HPO terms and attempt to map the demodified terms to SNOMED CT through UMLS; 2) Logical approach: we leverage subsumption relations in HPO to infer partial mappings to SNOMED CT; 3) Comparison: we analyze the specific contribution of each approach and evaluate the quality of the partial mappings through manual review.
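The two approaches named in Methods can be illustrated with a minimal Python sketch. It is one plausible reading of the abstract, not the authors' implementation: the modifier list, the toy identifiers, and the function names are all hypothetical, and a plain in-memory dictionary stands in for the lookup through UMLS. The logical step assumes that a partial mapping is taken from the nearest HPO ancestors that already have a complete mapping.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical modifier list; the abstract does not enumerate the modifiers used.
MODIFIERS = {"bilateral", "unilateral", "severe", "mild", "progressive", "congenital"}


def demodify(term: str) -> Tuple[str, List[str]]:
    """Split a term into its demodified form and the modifiers that were removed."""
    tokens = term.lower().split()
    removed = [t for t in tokens if t in MODIFIERS]
    kept = [t for t in tokens if t not in MODIFIERS]
    return " ".join(kept), removed


def lexical_partial_mapping(hpo_term: str, snomed_by_label: Dict[str, str]) -> Optional[str]:
    """Return a target code if the demodified term matches a label, else None.

    snomed_by_label is an assumed stand-in for the UMLS-based lookup.
    """
    stripped, removed = demodify(hpo_term)
    if removed and stripped in snomed_by_label:
        return snomed_by_label[stripped]
    return None


def logical_partial_mappings(
    hpo_id: str,
    parents: Dict[str, List[str]],
    complete_map: Dict[str, str],
) -> Set[str]:
    """Climb is-a links and collect the mappings of the nearest mapped ancestors."""
    found: Set[str] = set()
    seen: Set[str] = set()
    frontier = list(parents.get(hpo_id, []))
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in complete_map:
            # A mapped ancestor ends the climb along this branch.
            found.add(complete_map[node])
        else:
            frontier.extend(parents.get(node, []))
    return found


if __name__ == "__main__":
    # Toy data only; identifiers are placeholders, not real HPO or SNOMED CT codes.
    labels = {"cataract": "SCT:toy-1"}
    print(lexical_partial_mapping("Bilateral cataract", labels))

    hierarchy = {"HP:c": ["HP:b"], "HP:b": ["HP:a"], "HP:a": []}
    mapped = {"HP:a": "SCT:toy-2"}
    print(logical_partial_mappings("HP:c", hierarchy, mapped))
```

In this sketch the lexical function returns a mapping only when at least one modifier was actually removed, so that an unmodified term with an exact label match is not counted as a partial mapping.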
Results: There are 7,358 HPO concepts with no complete mapping to SNOMED CT. We identified partial mappings lexically for 33% of them and logically for 82%, with 27% identified by both approaches. The clinical relevance of the partial mappings (for a cohort selection use case) is 49% for lexical mappings and 67% for logical mappings.
Conclusions: Through complete and partial mappings, 92% of the 10,454 HPO concepts can be mapped to SNOMED CT (30% complete and 62% partial). Equivalence mappings between HPO and SNOMED CT allow for interoperability between data described using these two systems. However, due to differences in focus and granularity, equivalence is only possible for 30% of HPO classes. In the remaining cases, partial mappings provide a next-best approach for traversing between the two systems. Both lexical and logical mapping techniques produce mappings that cannot be generated by the other technique, suggesting that the two techniques are complementary to each other. Finally, this work demonstrates interesting properties (both lexical and logical) of HPO and SNOMED CT and illustrates some limitations of mapping through UMLS.
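The coverage figures in Results and Conclusions can be cross-checked with inclusion-exclusion. The short sketch below uses only the numbers stated in the abstract; the function and variable names are illustrative, and because the published percentages are rounded, the check is approximate.

```python
def coverage_summary(total: int, unmapped: int,
                     lexical: float, logical: float, both: float) -> tuple:
    """Return (complete %, partial %, overall %) of all concepts, rounded."""
    # Concepts reached by at least one approach: |A or B| = |A| + |B| - |A and B|.
    either = lexical + logical - both
    complete = total - unmapped
    partial = either * unmapped
    return (
        round(100 * complete / total),
        round(100 * partial / total),
        round(100 * (complete + partial) / total),
    )


if __name__ == "__main__":
    # 10,454 concepts in total, 7,358 without a complete mapping,
    # 33% lexical, 82% logical, 27% both (fractions of the 7,358).
    print(coverage_summary(10_454, 7_358, 0.33, 0.82, 0.27))  # (30, 62, 92)
```

The union of the two approaches covers about 88% of the concepts that lack a complete mapping, which corresponds to the 62% partial coverage reported for the full set.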