Enhancing Phenotype Recognition in Clinical Notes Using Large Language Models: PhenoBCBERT and PhenoGPT

Overview
Journal: ArXiv
Date: 2023 Nov 21
PMID: 37986722
Abstract

To enhance phenotype recognition in clinical notes of genetic diseases, we developed two models, PhenoBCBERT and PhenoGPT, for expanding the vocabulary of Human Phenotype Ontology (HPO) terms. While HPO offers a standardized vocabulary for phenotypes, existing tools often fail to capture the full scope of phenotypes because of the limitations of traditional heuristic or rule-based approaches. Our models leverage large language models (LLMs) to automate the detection of phenotype terms, including those not in the current HPO. We compared these models to PhenoTagger, another HPO recognition tool, and found that our models identify a wider range of phenotype concepts, including previously uncharacterized ones. Our models also showed strong performance in case studies on biomedical literature. We evaluated the strengths and weaknesses of BERT-based and GPT-based models in aspects such as architecture and accuracy. Overall, our models enhance automated phenotype detection from clinical texts, improving downstream analyses on human diseases.
