Ten Tips for a Text-mining-ready Article: How to Improve Automated Discoverability and Interpretability
Data-driven research in biomedical science requires structured, computable data. Increasingly, these data are created with support from automated text mining. Text-mining tools have rapidly matured: although not perfect, they now frequently provide outstanding results. We describe 10 straightforward writing tips, along with a web tool, PubReCheck, that guide authors in addressing the most common cases that remain difficult for text-mining tools. We anticipate these guides will help authors' work be found more readily and used more widely, ultimately increasing the impact of their work and the overall benefit to both authors and readers. PubReCheck is available at http://www.ncbi.nlm.nih.gov/research/pubrecheck.
Leaman R, Islamaj R, Adams V, Alliheedi M, Almeida J, Antunes R Database (Oxford). 2023; 2023.
PMID: 36882099 PMC: 9991492. DOI: 10.1093/database/baad005.
Papadimitriou S, Gravel B, Nachtegael C, De Baere E, Loeys B, Vikkula M HGG Adv. 2022; 4(1):100165.
PMID: 36578772 PMC: 9791921. DOI: 10.1016/j.xhgg.2022.100165.
Comprehensively identifying Long Covid articles with human-in-the-loop machine learning.
Leaman R, Islamaj R, Allot A, Chen Q, Wilbur W, Lu Z Patterns (N Y). 2022; 4(1):100659.
PMID: 36471749 PMC: 9712067. DOI: 10.1016/j.patter.2022.100659.
New reasons for biologists to write with a formal language.
Rodriguez-Esteban R Database (Oxford). 2022; 2022.
PMID: 35657112 PMC: 9216469. DOI: 10.1093/database/baac039.
A Second Look at FAIR in Proteomic Investigations.
Caufield J, Fu J, Wang D, Guevara-Gonzalez V, Wang W, Ping P J Proteome Res. 2021; 20(5):2182-2186.
PMID: 33719446 PMC: 8518219. DOI: 10.1021/acs.jproteome.1c00177.