PMID: 24431336

Assisted Annotation of Medical Free Text Using RapTAT

Abstract

Objective: To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias.

Materials and Methods: A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes, either manually or with RapTAT assistance, for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19-21 documents for iterative annotation and training.
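The pre-annotate, correct, retrain cycle described above can be sketched roughly as follows. This is a minimal illustration, not RapTAT's implementation: `PhraseMemoryModel` is a hypothetical phrase-memorizing stand-in for the tool's machine learning component, `review` is a hypothetical callback representing the human reviewer, and retraining once per batch is an assumption based on the batch description in the abstract.

```python
from typing import Callable, Dict, List, Set, Tuple

Annotation = Tuple[str, str]  # (phrase, concept); a simplified, hypothetical format


class PhraseMemoryModel:
    """Toy stand-in for a trainable pre-annotator: memorizes phrase -> concept."""

    def __init__(self) -> None:
        self.known: Dict[str, str] = {}

    def pre_annotate(self, text: str) -> Set[Annotation]:
        # Propose every memorized phrase that occurs in the document.
        lowered = text.lower()
        return {(p, c) for p, c in self.known.items() if p in lowered}

    def train(self, corrected: Set[Annotation]) -> None:
        # Learn from reviewer-corrected annotations.
        for phrase, concept in corrected:
            self.known[phrase.lower()] = concept


def make_batches(docs: List[str], n_batches: int) -> List[List[str]]:
    """Split docs into n_batches groups whose sizes differ by at most one."""
    size, extra = divmod(len(docs), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        end = start + size + (1 if i < extra else 0)
        batches.append(docs[start:end])
        start = end
    return batches


def assisted_annotation(
    docs: List[str],
    review: Callable[[str, Set[Annotation]], Set[Annotation]],
    n_batches: int,
) -> List[int]:
    """Run the loop and return the number of correct pre-annotations per batch."""
    model = PhraseMemoryModel()
    correct_per_batch = []
    for batch in make_batches(docs, n_batches):
        batch_corrected = []
        correct = 0
        for text in batch:
            proposed = model.pre_annotate(text)   # machine pre-annotation
            corrected = review(text, proposed)    # reviewer adds, removes, fixes
            correct += len(proposed & corrected)  # proposals the reviewer kept
            batch_corrected.append(corrected)
        # Assumption: the model is retrained after each batch is reviewed.
        for corrected in batch_corrected:
            model.train(corrected)
        correct_per_batch.append(correct)
    return correct_per_batch
```

With an even split like the one in `make_batches`, 404 notes in 20 batches gives batch sizes of 20 or 21; the study's actual split (19-21 documents) may have been produced differently.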

Results: The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ~50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5-0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85).
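For readers unfamiliar with the metric, the F-measure of a set of pre-annotations against a comparison set can be computed as in the sketch below. The abstract does not state the matching criterion, so exact matching on hypothetical `(start, end, concept)` tuples is an assumption here, as is the balanced (F1) form of the measure.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, concept); hypothetical format


def f_measure(predicted: Set[Span], reference: Set[Span]) -> float:
    """Balanced F-measure (F1) using exact-match true positives."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

Calling the same function once with the reference standard and once with the assisted reviewer's annotations as `reference` gives the two agreement figures that the abstract compares.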

Discussion: The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias.

Conclusions: Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%.

Citing Articles

Utilizing active learning strategies in machine-assisted annotation for clinical named entity recognition: a comprehensive analysis considering annotation costs and target effectiveness.

Liu J, Wong Z. J Am Med Inform Assoc. 2024; 31(11):2632-2640.

PMID: 39081233 PMC: 11491619. DOI: 10.1093/jamia/ocae197.


Markup: A Web-Based Annotation Tool Powered by Active Learning.

Dobbie S, Strafford H, Pickrell W, Fonferko-Shadrach B, Jones C, Akbari A. Front Digit Health. 2021; 3:598916.

PMID: 34713086 PMC: 8521860. DOI: 10.3389/fdgth.2021.598916.


The OpenDeID corpus for patient de-identification.

Jonnagaddala J, Chen A, Batongbacal S, Nekkantti C. Sci Rep. 2021; 11(1):19973.

PMID: 34620985 PMC: 8497517. DOI: 10.1038/s41598-021-99554-9.


A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging.

Zhang H, Hu D, Duan H, Li S, Wu N, Lu X. BMC Med Inform Decis Mak. 2021; 21(Suppl 2):214.

PMID: 34330277 PMC: 8323233. DOI: 10.1186/s12911-021-01575-x.


Are synthetic clinical notes useful for real natural language processing tasks: A case study on clinical entity recognition.

Li J, Zhou Y, Jiang X, Natarajan K, Pakhomov S, Liu H. J Am Med Inform Assoc. 2021; 28(10):2193-2201.

PMID: 34272955 PMC: 8449609. DOI: 10.1093/jamia/ocab112.

