Identifying and Extracting Rare Diseases and Their Phenotypes with Large Language Models

Overview

Journal: J Healthc Inform Res

Specialty: Medical Informatics

Date: 2024 Apr 29

PMID: 38681753

Authors

Cathy Shyr

Yan Hu

Lisa Bastarache

Alex Cheng

Rizwan Hamid

Paul Harris

Hua Xu


Abstract

Purpose: Phenotyping is critical for informing rare disease diagnosis and treatment, but disease phenotypes are often embedded in unstructured text. While natural language processing (NLP) can automate extraction, a major bottleneck is developing annotated corpora. Recently, prompt learning with large language models (LLMs) has been shown to produce generalizable results with no annotated samples (zero-shot) or only a few (few-shot), but no study has explored this for rare diseases. Our work is the first to study prompt learning for identifying and extracting rare disease phenotypes in the zero- and few-shot settings.

Methods: We compared the performance of prompt learning with ChatGPT and fine-tuning with BioClinicalBERT. We engineered novel prompts for ChatGPT to identify and extract rare diseases and their phenotypes (e.g., diseases, symptoms, and signs), established a benchmark for evaluating its performance, and conducted an in-depth error analysis.
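To make the zero-shot versus few-shot distinction concrete, here is a minimal Python sketch of how such prompts might be assembled. It is not the study's prompts: the function `build_prompt`, the instruction wording, the two styles (`"sentence"` and `"list"`), and the example passages are all hypothetical and were written only for illustration. No model is called; the sketch only builds the prompt text.

```python
from typing import Sequence, Tuple

# Hypothetical wording: the prompts engineered in the study are not reproduced here.
def build_prompt(passage: str,
                 entity_type: str,
                 examples: Sequence[Tuple[str, Sequence[str]]] = (),
                 style: str = "sentence") -> str:
    """Assemble a zero-shot (no examples) or few-shot (with examples) prompt."""
    if style == "sentence":
        # Conversational, sentence-based instruction.
        instruction = (
            f"Read the passage and list every {entity_type} it mentions. "
            "Copy each one exactly as written, separated by semicolons. "
            "Answer 'none' if there are none."
        )
    elif style == "list":
        # Structured, list-like instruction.
        instruction = (
            "Task: extract entities\n"
            f"Entity type: {entity_type}\n"
            "Output format: exact spans separated by semicolons, or 'none'"
        )
    else:
        raise ValueError(f"unknown style: {style!r}")

    parts = [instruction]
    # Each annotated example becomes one demonstration in the prompt.
    for example_text, example_entities in examples:
        answer = "; ".join(example_entities) if example_entities else "none"
        parts.append(f"Passage: {example_text}\nAnswer: {answer}")
    # The passage to annotate comes last, with the answer left open.
    parts.append(f"Passage: {passage}\nAnswer:")
    return "\n\n".join(parts)


# Invented passages, used only to show the two settings.
target = "Dravet syndrome usually begins with prolonged seizures in infancy."
demo = ("Patients with Marfan syndrome often have a dislocated lens.",
        ["Marfan syndrome"])

zero_shot = build_prompt(target, "rare disease")
one_shot = build_prompt(target, "rare disease", examples=[demo])
```

Passing one annotated example gives the one-shot setting; passing several gives few-shot. Switching `style` allows the same passage to be tried with a conversational or a structured instruction.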

Results: Overall, fine-tuning BioClinicalBERT resulted in higher performance (F1 of 0.689) than ChatGPT (F1 of 0.472 and 0.610 in the zero- and few-shot settings, respectively). However, ChatGPT achieved higher accuracy for rare diseases and signs in the one-shot setting (F1 of 0.778 and 0.725, respectively). Conversational, sentence-based prompts generally achieved higher accuracy than structured lists.
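For readers unfamiliar with the F1 metric reported above, the following Python sketch computes entity-level precision, recall, and F1. It assumes exact matching of entity text after lowercasing and whitespace normalization; the abstract does not state which matching criterion the study used, so this is an assumption. The names `Entity`, `normalize`, and `precision_recall_f1` and the toy data are hypothetical.

```python
from typing import Iterable, Tuple

# (document id, entity type, entity text); a hypothetical representation.
Entity = Tuple[str, str, str]

def normalize(entity: Entity) -> Entity:
    """Lowercase the entity text and collapse runs of whitespace."""
    doc_id, entity_type, text = entity
    return (doc_id, entity_type, " ".join(text.lower().split()))

def precision_recall_f1(gold: Iterable[Entity],
                        predicted: Iterable[Entity]) -> Tuple[float, float, float]:
    """Exact-match scores over sets of entities (assumed matching criterion)."""
    gold_set = {normalize(e) for e in gold}
    pred_set = {normalize(e) for e in predicted}
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: four annotated entities, two of which the model extracted.
gold = [
    ("doc1", "rare disease", "Marfan syndrome"),
    ("doc1", "sign", "dislocated lens"),
    ("doc2", "rare disease", "Dravet syndrome"),
    ("doc2", "symptom", "seizures"),
]
predicted = [
    ("doc1", "rare disease", "marfan  syndrome"),  # matches after normalization
    ("doc2", "rare disease", "Dravet syndrome"),
]

precision, recall, f1 = precision_recall_f1(gold, predicted)
```

In this toy case both predictions are correct but half of the annotated entities are missed, so precision is 1.0, recall is 0.5, and F1 is their harmonic mean, 2/3.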

Conclusion: Prompt learning using ChatGPT has the potential to match or outperform fine-tuning BioClinicalBERT at extracting rare diseases and signs with just one annotated sample. Given its accessibility, ChatGPT could be leveraged to extract these entities without relying on a large, annotated corpus. While LLMs can support rare disease phenotyping, researchers should critically evaluate model outputs to ensure phenotyping accuracy.

Citing Articles

An Automatic and End-to-End System for Rare Disease Knowledge Graph Construction Based on Ontology-Enhanced Large Language Models: Development Study.

Cao L, Sun J, Cross A. JMIR Med Inform. 2024; 12:e60665.

PMID: 39693482 PMC: 11683654. DOI: 10.2196/60665.

SEETrials: Leveraging large language models for safety and efficacy extraction in oncology clinical trials.

Lee K, Paek H, Huang L, Hilton C, Datta S, Higashi J. Inform Med Unlocked. 2024; 50.

PMID: 39493413 PMC: 11530223. DOI: 10.1016/j.imu.2024.101589.

Large Language Models in Biomedical and Health Informatics: A Review with Bibliometric Analysis.

Yu H, Fan L, Li L, Zhou J, Ma Z, Xian L. J Healthc Inform Res. 2024; 8(4):658-711.

PMID: 39463859 PMC: 11499577. DOI: 10.1007/s41666-024-00171-8.

A hybrid framework with large language models for rare disease phenotyping.

Wu J, Dong H, Li Z, Wang H, Li R, Patra A. BMC Med Inform Decis Mak. 2024; 24(1):289.

PMID: 39375687 PMC: 11460004. DOI: 10.1186/s12911-024-02698-7.
