Overview of the Problem List Summarization (ProbSum) 2023 Shared Task on Summarizing Patients' Active Diagnoses and Problems from Electronic Health Record Progress Notes

Overview

Journal Proc Conf Assoc Comput Linguist Meet

Date 2023 Aug 16

PMID 37583489

Authors

Yanjun Gao

Dmitriy Dligach

Timothy Miller

Matthew M Churpek

Majid Afshar

Abstract

The BioNLP Workshop 2023 launched a shared task on Problem List Summarization (ProbSum) in January 2023. The aim of this shared task is to attract future research efforts in building NLP models for real-world diagnostic decision support applications, where a system that generates relevant and accurate diagnoses will augment healthcare providers' decision-making and improve the quality of care for patients. The goal for participants is to develop models that generate a list of diagnoses and problems from the daily care notes collected during the hospitalization of critically ill patients. Eight teams submitted their final systems to the shared task leaderboard. In this paper, we describe the tasks, datasets, evaluation metrics, and baseline systems. We also summarize the techniques tried by the participating teams and the results of their evaluation.

Citing Articles

Leveraging Medical Knowledge Graphs Into Large Language Models for Diagnosis Prediction: Design and Application Study.

Gao Y, Li R, Croxford E, Caskey J, Patterson B, Churpek M. JMIR AI. 2025; 4:e58670.

PMID: 39993309 PMC: 11894347. DOI: 10.2196/58670.

Exploring the Efficacy of Large Language Models in Summarizing Mental Health Counseling Sessions: Benchmark Study.

Adhikary P, Srivastava A, Kumar S, Singh S, Manuja P, Gopinath J. JMIR Ment Health. 2024; 11:e57306.

PMID: 39042893 PMC: 11303879. DOI: 10.2196/57306.

Pre-test Prediction of Non-ischemic Cardiomyopathies using Time-Series EHR Data.

Ishwaran K, Abadie B, Chen P, Bolen M, Karamlou T, Grimm R. AMIA Jt Summits Transl Sci Proc. 2024; 2024:239-248.

PMID: 38827049 PMC: 11141858.

Development of a Human Evaluation Framework and Correlation with Automated Metrics for Natural Language Generation of Medical Diagnoses.

Croxford E, Gao Y, Patterson B, To D, Tesch S, Dligach D. medRxiv. 2024.

PMID: 38562730 PMC: 10984060. DOI: 10.1101/2024.03.20.24304620.

Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts.

Veen D, Van Uden C, Blankemeier L, Delbrouck J, Aali A, Bluethgen C. Res Sq. 2023.

PMID: 37961377 PMC: 10635391. DOI: 10.21203/rs.3.rs-3483777/v1.
