
Understanding Spatial Language in Radiology: Representation Framework, Annotation, and Spatial Relation Extraction from Chest X-ray Reports Using Deep Learning

Overview
Journal J Biomed Inform
Publisher Elsevier
Date 2020 Jun 21
PMID 32562898
Citations 8
Abstract

Radiology reports contain a radiologist's interpretations of images, and these interpretations frequently describe spatial relations. Important radiographic findings are mostly described in reference to an anatomical location through spatial prepositions. Such spatial relationships are also linked to various differential diagnoses and are often described through uncertainty phrases. Structured representation of this clinically significant spatial information has the potential to be used in a variety of downstream clinical informatics applications. Our focus is to extract these spatial representations from the reports. For this, we first define a representation framework based on the Spatial Role Labeling (SpRL) scheme, which we refer to as Rad-SpRL. In Rad-SpRL, common radiological entities tied to spatial relations are encoded through four spatial roles: Trajector, Landmark, Diagnosis, and Hedge, all identified in relation to a spatial preposition (or Spatial Indicator). We annotated a total of 2,000 chest X-ray reports following Rad-SpRL. We then propose a deep learning-based natural language processing (NLP) method involving word- and character-level encodings to first extract the Spatial Indicators and then identify the corresponding spatial roles. Specifically, we use a bidirectional long short-term memory (Bi-LSTM) conditional random field (CRF) neural network as the baseline model. Additionally, we incorporate contextualized word representations from pre-trained language models (BERT and XLNet) for extracting the spatial information. We evaluate the extraction of the four types of spatial roles using both gold and predicted Spatial Indicators. The results are promising, with the highest average F1 measure for Spatial Indicator extraction being 91.29 (XLNet); the highest average overall F1 measure considering all four spatial roles being 92.9 using gold Indicators (XLNet); and 85.6 using predicted Indicators (BERT pre-trained on MIMIC notes).
The corpus is available on Mendeley at http://dx.doi.org/10.17632/yhb26hfz8n.1 and on GitHub at https://github.com/krobertslab/datasets/blob/master/Rad-SpRL.xml.
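As an illustration of the Rad-SpRL scheme described in the abstract, the following is a minimal sketch of how a spatial frame anchored on a Spatial Indicator might be represented and serialized into BIO tags for a sequence tagger such as the Bi-LSTM-CRF mentioned above. The class, function, and example sentence are hypothetical and are not the authors' code or data.

```python
from dataclasses import dataclass, field

@dataclass
class SpatialFrame:
    """One Rad-SpRL-style frame: roles anchored on a Spatial Indicator.

    Spans are (start, end) token indices, end-exclusive. Role names
    follow the abstract: Trajector, Landmark, Diagnosis, and Hedge,
    plus the anchoring Spatial Indicator (a spatial preposition).
    """
    indicator: tuple                       # span of the spatial preposition
    roles: dict = field(default_factory=dict)  # role name -> list of spans

def to_bio(tokens, frame):
    """Convert one frame into per-token BIO labels for sequence labeling."""
    labels = ["O"] * len(tokens)
    spans = [("SPATIAL_INDICATOR", frame.indicator)]
    for role, role_spans in frame.roles.items():
        spans.extend((role, s) for s in role_spans)
    for role, (start, end) in spans:
        labels[start] = f"B-{role}"
        for i in range(start + 1, end):
            labels[i] = f"I-{role}"
    return labels

# Hypothetical report sentence: "Patchy opacity in the left lower lobe"
tokens = ["Patchy", "opacity", "in", "the", "left", "lower", "lobe"]
frame = SpatialFrame(
    indicator=(2, 3),               # "in"
    roles={
        "TRAJECTOR": [(0, 2)],      # "Patchy opacity"
        "LANDMARK": [(4, 7)],       # "left lower lobe"
    },
)
print(to_bio(tokens, frame))
# → ['B-TRAJECTOR', 'I-TRAJECTOR', 'B-SPATIAL_INDICATOR', 'O',
#    'B-LANDMARK', 'I-LANDMARK', 'I-LANDMARK']
```

In the two-stage setup the abstract describes, a first tagger would predict only the SPATIAL_INDICATOR labels, and a second pass would label the remaining roles conditioned on each predicted indicator.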

Citing Articles

A scoping review of large language model based approaches for information extraction from radiology reports.

Reichenpfader D, Muller H, Denecke K. NPJ Digit Med. 2024; 7(1):222.

PMID: 39182008 PMC: 11344824. DOI: 10.1038/s41746-024-01219-0.


Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model.

Lau W, Lybarger K, Gunn M, Yetisgen M. J Digit Imaging. 2022; 36(1):91-104.

PMID: 36253581 PMC: 9576130. DOI: 10.1007/s10278-022-00717-5.


Increasing Women's Knowledge about HPV Using BERT Text Summarization: An Online Randomized Study.

Bitar H, Babour A, Nafa F, Alzamzami O, Alismail S. Int J Environ Res Public Health. 2022; 19(13).

PMID: 35805761 PMC: 9265758. DOI: 10.3390/ijerph19138100.


Identifying ARDS using the Hierarchical Attention Network with Sentence Objectives Framework.

Lybarger K, Mabrey L, Thau M, Bhatraju P, Wurfel M, Yetisgen M. AMIA Annu Symp Proc. 2022; 2021:823-832.

PMID: 35308902 PMC: 8861765.


Deep Learning-Based Natural Language Processing in Radiology: The Impact of Report Complexity, Disease Prevalence, Dataset Size, and Algorithm Type on Model Performance.

Olthof A, van Ooijen P, Cornelissen L. J Med Syst. 2021; 45(10):91.

PMID: 34480231 PMC: 8416876. DOI: 10.1007/s10916-021-01761-4.

