Text Mining Approaches for Dealing with the Rapidly Expanding Literature on COVID-19

Overview
Journal Brief Bioinform
Specialty Biology
Date 2020 Dec 6
PMID 33279995
Citations 29
Abstract

More than 50 000 papers have been published about COVID-19 since the beginning of 2020 and several hundred new papers continue to be published every day. This incredible rate of scientific productivity leads to information overload, making it difficult for researchers, clinicians and public health officials to keep up with the latest findings. Automated text mining techniques for searching, reading and summarizing papers are helpful for addressing information overload. In this review, we describe the many resources that have been introduced to support text mining applications over the COVID-19 literature; specifically, we discuss the corpora, modeling resources, systems and shared tasks that have been introduced for COVID-19. We compile a list of 39 systems that provide functionality such as search, discovery, visualization and summarization over the COVID-19 literature. For each system, we provide a qualitative description and assessment of the system's performance, unique data or user interface features and modeling decisions. Many systems focus on search and discovery, though several systems provide novel features, such as the ability to summarize findings over multiple documents or linking between scientific articles and clinical trials. We also describe the public corpora, models and shared tasks that have been introduced to help reduce repeated effort among community members; some of these resources (especially shared tasks) can provide a basis for comparing the performance of different systems. Finally, we summarize promising results and open challenges for text mining the COVID-19 literature.

Citing Articles

Artificial Intelligence in Medical Affairs: A New Paradigm with Novel Opportunities.

Froling E, Rajaeean N, Hinrichsmeyer K, Domros-Zoungrana D, Urban J, Lenz C. Pharmaceut Med. 2024; 38(5):331-342.

PMID: 39259426 PMC: 11473552. DOI: 10.1007/s40290-024-00536-9.


Global Research on Pandemics or Epidemics and Mental Health: A Natural Language Processing Study.

Ye X, Wang X, Lin H. J Epidemiol Glob Health. 2024; 14(3):1268-1280.

PMID: 39117794 PMC: 11442711. DOI: 10.1007/s44197-024-00284-8.


PheSeq, a Bayesian deep learning model to enhance and interpret the gene-disease association studies.

Yao X, Ouyang S, Lian Y, Peng Q, Zhou X, Huang F. Genome Med. 2024; 16(1):56.

PMID: 38627848 PMC: 11020195. DOI: 10.1186/s13073-024-01330-7.


The SAFE procedure: a practical stopping heuristic for active learning-based screening in systematic reviews and meta-analyses.

Boetje J, van de Schoot R. Syst Rev. 2024; 13(1):81.

PMID: 38429798 PMC: 10908130. DOI: 10.1186/s13643-024-02502-7.


Using Social Media to Help Understand Patient-Reported Health Outcomes of Post-COVID-19 Condition: Natural Language Processing Approach.

Dolatabadi E, Moyano D, Bales M, Spasojevic S, Bhambhoria R, Bhatti J. J Med Internet Res. 2023; 25:e45767.

PMID: 37725432 PMC: 10510753. DOI: 10.2196/45767.

