Automated Extraction of Patient-Centered Outcomes After Breast Cancer Treatment: An Open-Source Large Language Model-Based Toolkit

Overview

Journal: JCO Clin Cancer Inform

Specialty: Medical Informatics

Date: 2024 Aug 21

PMID: 39167746

Authors

Man Luo

Shubham Trivedi

Allison W Kurian

Kevin Ward

Theresa H M Keegan

Daniel Rubin

Imon Banerjee

Abstract

Purpose: Patient-centered outcomes (PCOs) are pivotal in cancer treatment because they directly reflect patients' quality of life. Although multiple studies suggest that breast cancer-related morbidity and survival are influenced by treatment side effects and adherence to long-term treatment, such data are generally available only on a small scale or from a single center. The primary challenge in collecting these data is that the outcomes are captured as free text in clinical narratives written by clinicians.

Materials and Methods: Given the complexity of PCO documentation in these narratives, computerized methods are needed to unlock the information buried in the unstructured text notes that often document PCOs. Inspired by the success of large language models (LLMs), we examined the adaptability of three LLMs (GPT-2, BioGPT, and PMC-LLaMA) on PCO tasks across three institutions (Mayo Clinic, Emory University Hospital, and Stanford University). We developed an open-source framework for fine-tuning LLMs to directly extract five different categories of PCOs from clinic notes.
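
As a rough illustration of how note-level extraction can be framed for a generative model, the sketch below turns a clinic note and its annotations into a prompt/completion pair of the kind commonly used to fine-tune a causal LLM. It is not the authors' code: the category labels, the prompt template, and the names `PCO_CATEGORIES`, `build_example`, and `parse_completion` are hypothetical placeholders, since the abstract does not name the five categories or describe the input format.

```python
from dataclasses import dataclass

# Hypothetical placeholder labels; the abstract does not name the five PCO categories.
PCO_CATEGORIES = ["category_1", "category_2", "category_3", "category_4", "category_5"]


@dataclass
class TrainingExample:
    prompt: str      # text the model conditions on
    completion: str  # text the model is trained to generate


def build_example(note: str, labels: dict[str, str]) -> TrainingExample:
    """Format one clinic note and its annotations as a prompt/completion pair."""
    prompt = f"Clinic note: {note.strip()}\nPatient-centered outcomes:"
    # Emit every category in a fixed order so the target is easy to parse back.
    fields = [f"{name}={labels.get(name, 'not mentioned')}" for name in PCO_CATEGORIES]
    return TrainingExample(prompt=prompt, completion=" " + "; ".join(fields))


def parse_completion(text: str) -> dict[str, str]:
    """Recover a category -> value mapping from generated text."""
    parsed = {}
    for field in text.split(";"):
        name, sep, value = field.partition("=")
        # Ignore fragments that are malformed or name an unknown category.
        if sep and name.strip() in PCO_CATEGORIES:
            parsed[name.strip()] = value.strip()
    return parsed
```

Keeping the completion in a fixed, delimited layout is one possible design choice: it lets the same parser score generated output against the annotations, whichever base model is fine-tuned.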

Results: We found that these LLMs without fine-tuning (zero-shot) struggle with challenging PCO extraction tasks, displaying almost random performance, even when given some task-specific examples (few-shot learning). The performance of our fine-tuned, task-specific models is notably superior to that of their non-fine-tuned counterparts. Moreover, the fine-tuned GPT-2 model performed significantly better than the other two, larger LLMs.
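
To make the zero-shot and few-shot settings concrete, here is a minimal sketch of how the two prompts differ. The template and the `build_prompt` helper are assumptions for illustration only, not the prompts used in the study.

```python
def build_prompt(note: str, question: str, demonstrations=()) -> str:
    """Build a zero-shot prompt, or a few-shot one when demonstrations are given.

    `demonstrations` is a sequence of (note, answer) pairs shown before the query.
    """
    blocks = []
    for demo_note, demo_answer in demonstrations:
        blocks.append(f"Note: {demo_note}\nQuestion: {question}\nAnswer: {demo_answer}")
    # The query note comes last, with an empty answer slot for the model to fill.
    blocks.append(f"Note: {note}\nQuestion: {question}\nAnswer:")
    return "\n\n".join(blocks)
```

Calling `build_prompt(note, question)` gives the zero-shot prompt; passing a few worked `(note, answer)` pairs prepends them as in-context examples without changing any model weights, which is what separates few-shot prompting from fine-tuning.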

Conclusion: Our findings indicate that although LLMs serve as effective general-purpose models for tasks across various domains, they require fine-tuning when applied to the clinical domain. Our proposed approach has the potential to lead to more efficient, adaptable models for PCO information extraction, reducing reliance on extensive computational resources while still delivering superior performance on specific tasks.

Citing Articles

Large language models in cancer: potentials, risks, and safeguards.

Zitu M, Le T, Duong T, Haddadan S, Garcia M, Amorrortu R. BJR Artif Intell. 2025; 2(1):ubae019.

PMID: 39777117 PMC: 11703354. DOI: 10.1093/bjrai/ubae019.
