Text Classification Models for Assessing the Completeness of Randomized Controlled Trial Publications Based on CONSORT Reporting Guidelines

Overview

Journal Sci Rep

Specialty Science

Date 2024 Sep 17

PMID 39289403

Authors

Lan Jiang

Mengfei Lan

Joe D Menke

Colby J Vorland

Halil Kilicoglu

Abstract

Complete and transparent reporting in randomized controlled trial (RCT) publications is essential for assessing their credibility. We aimed to develop text classification models for determining whether RCT publications report CONSORT checklist items. Using a corpus annotated with 37 fine-grained CONSORT items, we trained sentence classification models (PubMedBERT fine-tuning, BioGPT fine-tuning, and in-context learning with GPT-4) and compared their performance. We assessed the impact of data augmentation methods (Easy Data Augmentation (EDA), UMLS-EDA, and text generation and rephrasing with GPT-4) on model performance. We also fine-tuned section-specific PubMedBERT models (e.g., Methods) to evaluate whether they could improve performance compared to the single full model. We performed 5-fold cross-validation and report precision, recall, F score, and area under the curve (AUC). The fine-tuned PubMedBERT model that uses the sentence along with the surrounding sentences and section headers yielded the best overall performance (sentence level: 0.71 micro-F, 0.67 macro-F; article level: 0.90 micro-F, 0.84 macro-F). Data augmentation had a limited positive effect. BioGPT fine-tuning and GPT-4 in-context learning yielded suboptimal results. The Methods-specific model improved recognition of methodology items; the other section-specific models did not have a significant impact. Most CONSORT checklist items can be recognized reasonably well with the fine-tuned PubMedBERT model, but there is room for improvement. Improved models can underpin journal editorial workflows and CONSORT adherence checks.
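The abstract reports micro- and macro-averaged F scores at both the sentence and article level. The following is a minimal sketch of how such scores can be computed for multi-label predictions, not the authors' evaluation code. The toy data, the function names (`micro_macro_f1`, `to_article_level`), and in particular the aggregation rule (an article is counted as reporting an item if any of its sentences carries that label) are assumptions made for illustration; the abstract does not state how article-level labels were derived.

```python
from collections import defaultdict


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred, labels):
    """gold, pred: parallel lists of label sets, one per unit (sentence or article)."""
    counts = {lab: [0, 0, 0] for lab in labels}  # per label: [tp, fp, fn]
    for g, p in zip(gold, pred):
        for lab in labels:
            if lab in p and lab in g:
                counts[lab][0] += 1
            elif lab in p:
                counts[lab][1] += 1
            elif lab in g:
                counts[lab][2] += 1
    # Macro: average the per-label F1 scores, so rare labels weigh as much as common ones.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    # Micro: pool the counts over all labels first, then compute a single F1.
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    return f1(tp, fp, fn), macro


def to_article_level(article_ids, sentence_labels):
    """ASSUMED aggregation: an article's label set is the union of its sentences' labels."""
    by_article = defaultdict(set)
    for aid, labs in zip(article_ids, sentence_labels):
        by_article[aid] |= labs
    return by_article


# Toy example with three CONSORT item identifiers (hypothetical annotations).
labels = ["3a", "8a", "11a"]
article_ids = ["A", "A", "A", "B", "B"]
gold = [{"3a"}, {"8a"}, set(), {"3a"}, {"11a"}]
pred = [{"3a"}, set(), {"8a"}, {"3a"}, {"11a"}]

sent_micro, sent_macro = micro_macro_f1(gold, pred, labels)

gold_art = to_article_level(article_ids, gold)
pred_art = to_article_level(article_ids, pred)
articles = sorted(gold_art)
art_micro, art_macro = micro_macro_f1(
    [gold_art[a] for a in articles], [pred_art[a] for a in articles], labels
)

print(f"sentence level: micro-F={sent_micro:.2f}, macro-F={sent_macro:.2f}")
print(f"article level:  micro-F={art_micro:.2f}, macro-F={art_macro:.2f}")
```

In this toy case the model attaches item 8a to the wrong sentence of article A, which costs one false positive and one false negative at the sentence level but still counts as correct once labels are pooled per article. Under the assumed aggregation rule, this is one plausible reason article-level scores can exceed sentence-level ones.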

Citing Articles

SPIRIT-CONSORT-TM: a corpus for assessing transparency of clinical trial protocol and results publications.

Jiang L, Vorland C, Ying X, Brown A, Menke J, Hong G Sci Data. 2025; 12(1):355.

PMID: 40021657 PMC: 11871027. DOI: 10.1038/s41597-025-04629-1.

SPIRIT-CONSORT-TM: a corpus for assessing transparency of clinical trial protocol and results publications.

Jiang L, Vorland C, Ying X, Brown A, Menke J, Hong G medRxiv. 2025; .

PMID: 39867389 PMC: 11759256. DOI: 10.1101/2025.01.14.25320543.
