Assisted Annotation of Medical Free Text Using RapTAT

Overview

Journal: J Am Med Inform Assoc

Publisher: Oxford University Press

Specialty: Medical Informatics

Date: 2014 Jan 17

PMID: 24431336

Citations: 17

Authors

Glenn T Gobbel

Jennifer Garvin

Ruth Reeves

Robert M Cronin

Julia Heavirland

Jenifer Williams

Allison Weaver

Shrimalini Jayaramaraja

Dario Giuse

Theodore Speroff

Steven H Brown

Hua Xu

Michael E Matheny

Abstract

Objective: To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias.

Materials and Methods: A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes for concepts related to quality of care during heart failure treatment, either manually or with RapTAT assistance. Notes were divided into 20 batches of 19-21 documents for iterative annotation and training.
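The pre-annotate, review, and retrain cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RapTAT's implementation: `PhraseMemoryModel` is a hypothetical toy learner that memorizes phrase-to-concept pairs, `review` is a hypothetical stand-in for the human reviewer, and `split_into_batches` produces near-equal contiguous batches (the abstract does not say how the 19-21 document batches were formed).

```python
from typing import Callable, Dict, List, Set, Tuple

Annotation = Tuple[str, str]  # (phrase, concept); a simplified, hypothetical representation


def split_into_batches(docs: List[str], n_batches: int) -> List[List[str]]:
    """Split docs into n_batches contiguous batches whose sizes differ by at most one."""
    base, extra = divmod(len(docs), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        size = base + (1 if i < extra else 0)
        batches.append(docs[start:start + size])
        start += size
    return batches


class PhraseMemoryModel:
    """Toy stand-in for the learner: remembers phrase -> concept pairs it has seen."""

    def __init__(self) -> None:
        self.known: Dict[str, str] = {}

    def pre_annotate(self, text: str) -> Set[Annotation]:
        # Propose every remembered phrase that occurs in the document.
        lowered = text.lower()
        return {(p, c) for p, c in self.known.items() if p in lowered}

    def train(self, corrected: Set[Annotation]) -> None:
        # Learn from the reviewer-corrected annotations.
        for phrase, concept in corrected:
            self.known[phrase.lower()] = concept


def assisted_annotation(
    batches: List[List[str]],
    review: Callable[[str, Set[Annotation]], Set[Annotation]],
    model: PhraseMemoryModel,
) -> List[int]:
    """Pre-annotate each batch, collect corrections, retrain before the next batch.

    Returns the number of pre-annotations per batch that the reviewer kept.
    """
    correct_per_batch = []
    for batch in batches:
        batch_corrections: Set[Annotation] = set()
        n_correct = 0
        for doc in batch:
            proposed = model.pre_annotate(doc)
            corrected = review(doc, proposed)  # reviewer adds, removes, or fixes
            n_correct += len(proposed & corrected)
            batch_corrections |= corrected
        model.train(batch_corrections)
        correct_per_batch.append(n_correct)
    return correct_per_batch
```

With an untrained model the first batch receives no useful pre-annotations, so the reviewer adds everything by hand; later batches benefit from whatever the model learned from earlier corrections.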

Results: The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ~50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5 to 0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85).
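For readers unfamiliar with the F-measure reported above, the following minimal sketch computes the balanced F-measure (F1) between a set of pre-annotations and a set of reference annotations. It assumes exact matching of (start, end, concept) tuples, which is an illustrative choice; the abstract does not state the matching criteria used in the study, and the function name and example values are hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, concept); hypothetical format


def f_measure(predicted: Set[Span], reference: Set[Span]) -> float:
    """Balanced F-measure (F1) using exact span-and-concept matching."""
    if not predicted or not reference:
        return 0.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Made-up example: 3 pre-annotations, 4 reference annotations, 2 in common.
pre = {(0, 10, "EF"), (15, 25, "ACE inhibitor"), (30, 40, "EF")}
ref = {(0, 10, "EF"), (15, 25, "ACE inhibitor"), (50, 60, "EF"), (70, 80, "EF")}
score = f_measure(pre, ref)  # precision 2/3, recall 1/2, F1 = 4/7
```

Computing the same score against both the assisted reviewer's annotations and an independent reference standard gives the two series the abstract compares.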

Discussion: The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias.

Conclusions: Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%.

Citing Articles

Utilizing active learning strategies in machine-assisted annotation for clinical named entity recognition: a comprehensive analysis considering annotation costs and target effectiveness.

Liu J, Wong Z. J Am Med Inform Assoc. 2024; 31(11):2632-2640.

PMID: 39081233 PMC: 11491619. DOI: 10.1093/jamia/ocae197.

Markup: A Web-Based Annotation Tool Powered by Active Learning.

Dobbie S, Strafford H, Pickrell W, Fonferko-Shadrach B, Jones C, Akbari A. Front Digit Health. 2021; 3:598916.

PMID: 34713086 PMC: 8521860. DOI: 10.3389/fdgth.2021.598916.

The OpenDeID corpus for patient de-identification.

Jonnagaddala J, Chen A, Batongbacal S, Nekkantti C. Sci Rep. 2021; 11(1):19973.

PMID: 34620985 PMC: 8497517. DOI: 10.1038/s41598-021-99554-9.

A novel deep learning approach to extract Chinese clinical entities for lung cancer screening and staging.

Zhang H, Hu D, Duan H, Li S, Wu N, Lu X. BMC Med Inform Decis Mak. 2021; 21(Suppl 2):214.

PMID: 34330277 PMC: 8323233. DOI: 10.1186/s12911-021-01575-x.

Are synthetic clinical notes useful for real natural language processing tasks: A case study on clinical entity recognition.

Li J, Zhou Y, Jiang X, Natarajan K, Pakhomov S, Liu H. J Am Med Inform Assoc. 2021; 28(10):2193-2201.

PMID: 34272955 PMC: 8449609. DOI: 10.1093/jamia/ocab112.
