
Accelerating Annotation of Articles Via Automated Approaches: Evaluation of the NeXtA5 Curation-support Tool by NeXtProt

Overview
Specialty Biology
Date 2018 Dec 22
PMID 30576492
Citations 6
Abstract

The development of efficient text-mining tools promises to boost the curation workflow by significantly reducing the time needed to process the literature into biological databases. We have developed a curation support tool, neXtA5, that provides a search engine coupled with an annotation system directly integrated into a biocuration workflow. neXtA5 assists curation with modules optimized for the various curation tasks: document triage, entity recognition and information extraction. Here, we describe the evaluation of neXtA5 by expert curators. We first assessed the annotations of two independent curators to provide a baseline for comparison. To evaluate the performance of neXtA5, we submitted requests and compared the neXtA5 results with the manual curation. The analysis focuses on the usability of neXtA5 to support the curation of two types of data: biological processes (BPs) and diseases (Ds). We evaluated the relevance of the papers proposed as well as the recall and precision of the suggested annotations. The evaluation of the precision of document triage by neXtA5 showed that both curators agreed with neXtA5 for 67% (BP) and 63% (D) of abstracts, while the curators agreed with each other on accepting or rejecting an abstract ~80% of the time. Hence, the precision of the triage system is satisfactory. For concept extraction, curators approved 35% (BP) and 25% (D) of the neXtA5 annotations. Conversely, neXtA5 successfully annotated up to 36% (BP) and 68% (D) of the terms identified by curators. The user feedback obtained in these tests highlighted the need for improvement in the ranking function of neXtA5 annotations. Therefore, we transformed the information extraction component into an annotation ranking system. This improvement results in a top precision (precision at first rank) of 59% (D) and 63% (BP). These results suggest that when considering only the first extracted entity, the current system achieves a precision comparable with that of expert biocurators.
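
The metrics named in the abstract (precision, recall, and precision at first rank) can be illustrated with a minimal Python sketch. This is not the neXtA5 evaluation code: the function names, the document identifiers and the toy annotations below are all hypothetical, and the sketch assumes that a suggestion counts as correct when it exactly matches a term chosen by a curator.

```python
from typing import Dict, List, Set, Tuple


def precision_recall(suggested: Set[str], curated: Set[str]) -> Tuple[float, float]:
    """Precision and recall of tool suggestions against curator annotations."""
    true_positives = len(suggested & curated)
    precision = true_positives / len(suggested) if suggested else 0.0
    recall = true_positives / len(curated) if curated else 0.0
    return precision, recall


def precision_at_first_rank(
    ranked: Dict[str, List[str]], curated: Dict[str, Set[str]]
) -> float:
    """Fraction of documents whose top-ranked suggestion matches a curator term."""
    scored = [doc for doc, terms in ranked.items() if terms]
    if not scored:
        return 0.0
    hits = sum(1 for doc in scored if ranked[doc][0] in curated.get(doc, set()))
    return hits / len(scored)


# Hypothetical toy data: ranked tool suggestions and curator terms per abstract.
ranked_suggestions = {
    "doc1": ["apoptosis", "cell cycle", "autophagy"],
    "doc2": ["inflammation", "apoptosis"],
}
curator_terms = {
    "doc1": {"apoptosis", "DNA repair"},
    "doc2": {"apoptosis"},
}

# Set-based precision and recall for a single abstract.
p, r = precision_recall(set(ranked_suggestions["doc1"]), curator_terms["doc1"])

# Precision at first rank across all abstracts.
p_at_1 = precision_at_first_rank(ranked_suggestions, curator_terms)

print(f"precision={p:.2f} recall={r:.2f} precision@1={p_at_1:.2f}")
```

In this toy setup, precision and recall treat the suggestions as an unordered set, whereas precision at first rank looks only at the top suggestion per abstract, which is why improving the ranking can raise the latter without changing the former.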

Citing Articles

New approaches in developing medicinal herbs databases.

Fathifar Z, Kalankesh L, Ostadrahimi A, Ferdousi R Database (Oxford). 2023; 2023.

PMID: 36625159 PMC: 9830469. DOI: 10.1093/database/baac110.


COVoc and COVTriage: novel resources to support literature triage.

Caucheteur D, Pendlington Z, Roncaglia P, Gobeill J, Mottin L, Matentzoglu N Bioinformatics. 2022; 39(1).

PMID: 36511598 PMC: 9825781. DOI: 10.1093/bioinformatics/btac800.


Identifying Opportunities for Workflow Automation in Health Care: Lessons Learned from Other Industries.

Zayas-Caban T, Haque S, Kemper N Appl Clin Inform. 2021; 12(3):686-697.

PMID: 34320683 PMC: 8318703. DOI: 10.1055/s-0041-1731744.


The Minimum Information about a Molecular Interaction CAusal STatement (MI2CAST).

Toure V, Vercruysse S, Acencio M, Lovering R, Orchard S, Bradley G Bioinformatics. 2020; 36(24):5712-5718.

PMID: 32637990 PMC: 8023674. DOI: 10.1093/bioinformatics/btaa622.


SIB Literature Services: RESTful customizable search engines in biomedical literature, enriched with automatically mapped biomedical concepts.

Gobeill J, Caucheteur D, Michel P, Mottin L, Pasche E, Ruch P Nucleic Acids Res. 2020; 48(W1):W12-W16.

PMID: 32379317 PMC: 7319474. DOI: 10.1093/nar/gkaa328.

